在文本列中搜索字符串并列出计数

时间:2015-02-17 08:08:39

标签: sql-server sql-server-2014-express

好的,我可能会问非常愚蠢的问题,但不知怎的,我无法找到执行以下操作的方法。

我有一个包含两列的表格,如下所示

+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| SL No |                                                                      Work                                                                       |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+
|     1 | Identify Process Champs across all teams for BCUK processes                                                                                     |
|     2 | Impart short training on FMEA to all the Process Champs                                                                                         |
|     2 | List down all critical steps involved in the Process to ascertain the risk involved, feed the details back to FMEA template to analyze the risk |
|     3 | Prioritize the process steps based on Risk Priority Number                                                                                      |
|     4 | Identity the Process Gaps, suggest process improvement ideas to mitigate/mistake proof or reduce the risk involved in the process               |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+

现在我有另一张表格,其中包含"关键词"如下所示

+-------+----------+
| Sl No |   Tags   |
+-------+----------+
|     1 | BCUK     |
|     2 | FMEA     |
|     3 | Priority |
|     4 | Process  |
+-------+----------+

现在我想"搜索字符串"在第一个表中基于"标签"在第二个表中并返回类似这样的内容

+----------+-------+
|   Tags   | Count |
+----------+-------+
| BCUK     |     1 |
| FMEA     |     2 |
| Priority |     1 |
| Process  |     8 |
+----------+-------+

"Process"关键字在整个表(第一个表)中跨多行显示eight times时,它将计数返回为8。

我正在使用SQL Server 2014 Express Edition

2 个答案:

答案 0 :(得分:2)

Adam Machanic对此类操作有一个函数GetSubstringCount。我根据你的需要修改了一下。有关详细信息:http://dataeducation.com/counting-occurrences-of-a-substring-within-a-string/

示例数据

CREATE TABLE MyTable(
    SLNo    INT,
    Work    VARCHAR(4000)
)
INSERT INTO MyTable VALUES
(1, 'Identify Process Champs across all teams for BCUK processes'),
(2, 'Impart short training on FMEA to all the Process Champs'),
(2, 'List down all critical steps involved in the Process to ascertain the risk involved, feed the details back to FMEA template to analyze the risk'),
(3, 'Prioritize the process steps based on Risk Priority Number'),
(4, 'Identity the Process Gaps, suggest process improvement ideas to mitigate/mistake proof or reduce the risk involved in the process');

CREATE TABLE KeyWord(
    SLNo    INT,
    Tag     VARCHAR(20)
)
INSERT INTO KeyWord VALUES
(1, 'BCUK'),
(2, 'FMEA'),
(3, 'Priority'),
(4, 'Process');

<强>解

;WITH E1(N) AS(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,E2 AS(SELECT 1 AS N FROM E1 a, E1 b)
,E4 AS(SELECT 1 AS N FROM E2 a, E2 b)
,Tally(N) AS(
    SELECT TOP(11000) ROW_NUMBER() OVER(ORDER BY(SELECT NULL))FROM E4 a, e4 b
)
SELECT
    k.Tag,
    [Count] = SUM(x.cc)
FROM KeyWord k
CROSS JOIN MyTable m
CROSS APPLY(
    SELECT COUNT(*) AS cc
    FROM Tally
    WHERE
        SUBSTRING(m.Work, N, LEN(k.tag)) = k.tag
)x
GROUP BY k.tag

<强> RESULT

Tag                  Count
-------------------- -----------
BCUK                 1
FMEA                 2
Priority             1
Process              8

答案 1 :(得分:1)

不是计算匹配,而是用额外的字符替换它们,并将长度与原始长度进行比较。这样,计数非常简单快捷。

测试表和数据

DECLARE @texts table(SL_No int identity(1,1),Work varchar(max))

INSERT @texts VALUES
  ('Identify Process Champs across all teams for BCUK processes'),
  ('Impart short training on FMEA to all the Process Champs'),
  ('List down all critical steps involved in the Process to ascertain the risk involved, feed the details back to FMEA template to analyze the risk'),
  ('Prioritize the process steps based on Risk Priority Number'),
  ('Identity the Process Gaps, suggest process improvement ideas to mitigate/mistake proof or reduce the risk involved in the process')

DECLARE @searchvalues table(S1_No int identity(1,1),Tags varchar(max))

INSERT @searchvalues
VALUES('CUK'),('FMEA'),('Priority'),('Process')

查询:

SELECT 
  sum(len(replace(txt.work, sv.tags, sv.tags + '@')) - len(txt.work)) count, 
  tags
FROM 
  @texts txt
CROSS APPLY
  @searchvalues sv
WHERE charindex(sv.tag, txt.work) > 0
GROUP BY tags

结果:

count   tags
1   CUK
2   FMEA
1   Priority
8   Process