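Replace keywords in text by comparing against a predefined word list

Date: 2018-10-12 06:11:52

Tags: sql sql-server tsql

I have an NVARCHAR column in my SQL table. I want to replace/mask certain words in it with *** by comparing its contents against a predefined set of words.

In other words, if my column contains a word from the predefined word list, I want to mask that word. The predefined list contains about 400 words.

Sample DDL and DML: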

DECLARE @Comments TABLE (id INT, comment VARCHAR(50))
INSERT INTO @Comments VALUES (1, 'Bob is a sales person'), (2, 'Shane is a nice guy')
DECLARE @People TABLE (personname VARCHAR(50))
INSERT INTO @People VALUES ('Bob'), ('Mark'), ('Shane')

Sample data:

+----+-----------------------+
| id |        comment        |
+----+-----------------------+
|  1 | Bob is a sales person |
|  2 | Shane is a nice guy   |
+----+-----------------------+

Predefined words:

+------------+
| personname |
+------------+
| Bob        |
| Mark       |
| Shane      |
+------------+

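Notes:

  • Please refer to the attached image for the sample data.
  • If the comment column contains a person's name, I want to replace/mask that predefined word with ***.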

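2 answers:

Answer 0: (score: 1)

What if you have both Bob and Bobby? You can't replace Bob first, or you will end up with ***by. Here is an approach that replaces the strings in descending order of length. It is a start.

Even so, the match is on substrings: if the comment were 'Bob is a bobsledder', you would end up with '*** is a ***sledder'.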

DECLARE @Comments TABLE (id INT, comment VARCHAR(50))
INSERT INTO @Comments VALUES (1, 'Bob is a sales person'), (2, 'Shane is a nice guy')
DECLARE @People TABLE (personname VARCHAR(50))
INSERT INTO @People VALUES ('Bob'), ('Mark'), ('Shane')

DECLARE @curname VARCHAR(50)

-- Process the longest names first so that 'Bobby' is masked before 'Bob'
DECLARE cur CURSOR FOR
SELECT personname FROM @People ORDER BY LEN(personname) DESC, personname DESC

OPEN cur

FETCH NEXT FROM cur
INTO @curname

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Mask every occurrence of the current name with one asterisk per character
    UPDATE @Comments SET comment = REPLACE(comment, @curname, REPLICATE('*', LEN(@curname)))

    FETCH NEXT FROM cur
    INTO @curname
END
CLOSE cur;
DEALLOCATE cur;

Output:

SELECT * FROM @Comments

id  comment
1   *** is a sales person
2   ***** is a nice guy

A couple of quick modifications:

1) Make the replacement case-sensitive:

UPDATE @Comments SET comment = REPLACE(comment COLLATE Latin1_General_CS_AS, @curname, REPLICATE('*', LEN(@curname)))

2) Replace only the first occurrence:

UPDATE @Comments SET comment = CASE WHEN CHARINDEX(@curname, comment) > 0   
    THEN STUFF(comment, CHARINDEX(@curname, comment), LEN(@curname), REPLICATE('*', LEN(@curname)))
    ELSE comment
    END  
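
To avoid masking a name that appears inside a longer word (the 'bobsledder' case mentioned earlier), one option is to pad both the comment and the name with spaces so that only whole, space-delimited words match. This is a minimal sketch under that assumption, not part of the original answer; the variables @comment and @name are illustrative stand-ins for a row of @Comments and the cursor's @curname.

```sql
-- Sketch: mask whole words only, assuming words are separated by single spaces.
DECLARE @comment VARCHAR(50) = 'Bob is a bobsledder';
DECLARE @name    VARCHAR(50) = 'Bob';

-- Padding with spaces lets a name at the start or end of the comment match too;
-- LTRIM/RTRIM removes the padding again afterwards.
SELECT LTRIM(RTRIM(
           REPLACE(' ' + @comment + ' ', ' ' + @name + ' ', ' *** ')
       )) AS masked;
```

With these values the query returns '*** is a bobsledder'. The sketch does not handle names that touch punctuation, and because REPLACE consumes the shared space, the same name repeated back to back (for example 'Bob Bob') is only masked once per pass.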

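Answer 1: (score: 0)

Starting with SQL Server 2017, you can combine STRING_AGG with STRING_SPLIT to break each sentence into words, replace the names with ***, and then aggregate the words back into a sentence: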

select co.id, 
      string_agg(case when p.personname is null then c.value else '***' end,' ') as comment
from @Comments co
cross apply string_split(comment,' ') c
left join @People p on p.personname = c.value
group by co.id
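
One caveat with the query above: neither STRING_SPLIT nor STRING_AGG guarantees the order of the words unless it is requested explicitly. As a hedged sketch, assuming SQL Server 2022 or Azure SQL Database (where STRING_SPLIT accepts a third enable_ordinal argument), the original word order can be preserved like this:

```sql
DECLARE @Comments TABLE (id INT, comment VARCHAR(50));
INSERT INTO @Comments VALUES (1, 'Bob is a sales person'), (2, 'Shane is a nice guy');
DECLARE @People TABLE (personname VARCHAR(50));
INSERT INTO @People VALUES ('Bob'), ('Mark'), ('Shane');

SELECT co.id,
       -- Reassemble the words in their original position order
       STRING_AGG(CASE WHEN p.personname IS NULL THEN c.value ELSE '***' END, ' ')
           WITHIN GROUP (ORDER BY c.ordinal) AS comment
FROM @Comments co
-- The third argument (1) adds an "ordinal" column with each word's position
CROSS APPLY STRING_SPLIT(co.comment, ' ', 1) c
LEFT JOIN @People p ON p.personname = c.value
GROUP BY co.id;
```

On SQL Server 2017 and 2019 the ordinal column is not available, so the word order of the original query is not guaranteed there even though it often comes out as expected.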

If the words are separated by characters other than spaces, you can replace those characters with spaces before splitting:

DECLARE @Comments TABLE (id INT, comment VARCHAR(50))
INSERT INTO @Comments VALUES
  (1, 'Bob (is a sales person)')
, (2, '(Shane:is-a-nice-guy)')
, (3, 'This.is(Mark),he.is-a-nice-guy,too')
DECLARE @People TABLE (personname VARCHAR(50))
INSERT INTO @People VALUES ('Bob'), ('Mark'), ('Shane')

select co.id, 
      trim(string_agg(case when p.personname is null then c.value else '***' end,' ')) as comment
from
(
select id, replace(replace(replace(replace(replace(replace(comment,':',' '),')',' '),'(',' '),',',' '),'.',' '),'-',' ') as comment 
from @Comments
) co
cross apply string_split(comment,' ') c
left join @People p on p.personname = c.value
group by co.id
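
The chain of nested REPLACE calls can also be written with TRANSLATE, which is available from SQL Server 2017 and maps each character in one list to the character at the same position in another. A minimal sketch, assuming the same six punctuation characters as above; the variable @comment is illustrative:

```sql
DECLARE @comment VARCHAR(50) = 'This.is(Mark),he.is-a-nice-guy,too';

-- TRANSLATE requires both character lists to have the same length,
-- so REPLICATE builds exactly six spaces to pair with the six punctuation marks.
SELECT TRANSLATE(@comment, ':)(,.-', REPLICATE(' ', 6)) AS cleaned;
```

The cleaned value can then be passed to STRING_SPLIT in place of the nested REPLACE expression. Adjacent punctuation such as '),' still yields two consecutive spaces, and therefore an empty token, just as the nested REPLACE version does.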
