Finding duplicate words in SQL table entries

Date: 2018-11-01 06:40:28

Tags: sql sql-server tsql

I have a table with columns named "EntityName" and "entityid".

 Entityid       EntityName
    1234        ABC inch EFG inch
    3456        inch* aaa inch vvv

Can anyone give me a query that finds entries containing repeated words like these?

2 answers:

Answer 0 (score: 2)

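If you are on SQL Server 2017, you can try the following query, which uses STRING_SPLIT (available since SQL Server 2016 at compatibility level 130):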

CREATE TABLE #TestData(Entityid int,Situation varchar(100))

INSERT #TestData(Entityid,Situation)VALUES
(1234,'ABC inch EFG inch'),
(3456,'inch aaa inch vvv'),
(7890,'BBBB aaa inch vvv')

SELECT *
FROM #TestData d
WHERE EXISTS(SELECT value FROM STRING_SPLIT(d.Situation,' ') WHERE value<>N'' GROUP BY value HAVING COUNT(*)>1)

DROP TABLE #TestData

You can also show the count for each duplicated word (STRING_AGG requires SQL Server 2017 or later):

CREATE TABLE #TestData(Entityid int,Situation varchar(100))

INSERT #TestData(Entityid,Situation)VALUES
(1234,'ABC inch EFG inch'),
(3456,'inch aaa inch vvv aaa aaa'),
(7890,'BBBB aaa inch vvv')

SELECT
  *,
  (
    SELECT STRING_AGG(CONCAT(value,'*',cnt),', ')
    FROM
      (
        SELECT value,COUNT(*) cnt FROM STRING_SPLIT(d.Situation,' ') WHERE value<>N'' GROUP BY value HAVING COUNT(*)>1
      ) q
  ) DuplicatedWords
FROM #TestData d
WHERE EXISTS(SELECT value FROM STRING_SPLIT(d.Situation,' ') WHERE value<>N'' GROUP BY value HAVING COUNT(*)>1)

DROP TABLE #TestData

Result:

Entityid    Situation                    DuplicatedWords
1234        ABC inch EFG inch            inch*2
3456        inch aaa inch vvv aaa aaa    aaa*3, inch*2
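
One caveat on the result above: STRING_AGG does not guarantee the order of the concatenated items unless a WITHIN GROUP clause is supplied. The sketch below is a variation on the query above, not a replacement for it. It assumes the same #TestData table, adds an explicit ordering, and uses CROSS APPLY so the STRING_SPLIT subquery is written only once. The aliases a and q are just illustrative names.

```sql
CREATE TABLE #TestData(Entityid int, Situation varchar(100));

INSERT #TestData(Entityid, Situation) VALUES
(1234, 'ABC inch EFG inch'),
(3456, 'inch aaa inch vvv aaa aaa'),
(7890, 'BBBB aaa inch vvv');

SELECT d.Entityid, d.Situation, a.DuplicatedWords
FROM #TestData d
CROSS APPLY
(
    -- Aggregate every word that occurs more than once in this row,
    -- most frequent first, ties broken alphabetically.
    SELECT STRING_AGG(CONCAT(q.value, '*', q.cnt), ', ')
               WITHIN GROUP (ORDER BY q.cnt DESC, q.value) AS DuplicatedWords
    FROM
    (
        SELECT value, COUNT(*) AS cnt
        FROM STRING_SPLIT(d.Situation, ' ')
        WHERE value <> N''
        GROUP BY value
        HAVING COUNT(*) > 1
    ) q
) a
-- STRING_AGG over zero rows yields NULL, so this keeps only rows with duplicates.
WHERE a.DuplicatedWords IS NOT NULL;

DROP TABLE #TestData;
```

With the sample data this should return the same two rows as shown above (1234 and 3456), with the duplicated words listed in a stable order.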

Answer 1 (score: 2)

You can try the following:

DECLARE @DataSource TABLE
(   
    [EntityID] INT
   ,[Situation] VARCHAR(MAX)
);

INSERT INTO @DataSource ([EntityID], [Situation])
VALUES (1234, 'ABC inch EFG inch')
      ,(3456, 'inch aaa inch vvv')
      ,(1, 'only one inch');

DECLARE @Search VARCHAR(12) = 'inch';

SELECT *
FROM @DataSource
WHERE CHARINDEX(@Search, [Situation]) > 0
    AND CHARINDEX(@Search, STUFF([Situation], CHARINDEX(@Search, [Situation]), LEN(@Search), '')) > 0;

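The idea is to check that the word occurs at all, then remove its first occurrence and check whether it still occurs.

Of course, this is very simple matching: CHARINDEX matches substrings, so 'inch' is also found inside a longer word such as 'pinch'. If you implement SQL CLR functions to get regular-expression support in a T-SQL context, you can add more sophisticated conditions.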
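
Short of SQL CLR, the substring problem can be reduced by matching on space-delimited tokens. The following is a minimal sketch, assuming words are separated by single spaces and carry no punctuation (an entry like 'inch*' would not match). The @Token variable, the Padded and FirstPos aliases, and the extra 'pinch of inch' row are hypothetical additions for illustration. Case sensitivity follows the column collation.

```sql
DECLARE @DataSource TABLE
(
    [EntityID] INT
   ,[Situation] VARCHAR(MAX)
);

INSERT INTO @DataSource ([EntityID], [Situation])
VALUES (1234, 'ABC inch EFG inch')
      ,(3456, 'inch aaa inch vvv')
      ,(1, 'only one inch')
      ,(2, 'pinch of inch');   -- hypothetical row: 'pinch' must not count as 'inch'

DECLARE @Search VARCHAR(12) = 'inch';
DECLARE @Token  VARCHAR(14) = ' ' + @Search + ' ';   -- the word with a space on each side

SELECT d.[EntityID], d.[Situation]
FROM @DataSource d
-- Pad the text so the first and last words are also surrounded by spaces.
CROSS APPLY (SELECT ' ' + d.[Situation] + ' ' AS Padded) p
-- Position of the first whole-word occurrence (0 if none).
CROSS APPLY (SELECT CHARINDEX(@Token, p.Padded) AS FirstPos) f
WHERE f.FirstPos > 0
    -- Search again, starting at the trailing space of the first match,
    -- so that two adjacent occurrences ('inch inch') are both detected.
    AND CHARINDEX(@Token, p.Padded, f.FirstPos + LEN(@Search) + 1) > 0;
```

With the sample rows this should return only EntityID 1234 and 3456. The rows 'only one inch' and 'pinch of inch' each contain the whole word just once.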