Query to recursively get all referenced entities

Asked: 2017-07-26 06:50:21

Tags: sql sql-server tsql recursion common-table-expression

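I have a data model that consists of 'Claims', which (to keep it simple for Stack Overflow) only have an OpenAmount field. There are two other tables, 'ClaimCoupling' and 'ClaimEntryReference'.

The ClaimCoupling table references directly back to the Claim table, and a ClaimEntryReference is effectively the booking of a received amount, which can be booked against multiple claims (see ClaimEntry_ID). See this diagram: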

(diagram of the Claim, ClaimCoupling and ClaimEntryReference tables)

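For simplicity I have removed all amounts, as that is not what I am currently struggling with.

What I want is a query that starts at the Claim table and fetches all claims with an OpenAmount that is <> 0. However, I want to be able to print an accurate report of how this OpenAmount came to be, which means I also need to fetch any claims coupled to this claim. To make it more interesting, the same applies to the bookings: if a booking was made on claim X and claim Y and only X has an open amount, I want to fetch both X and Y so I can show the booked payment as a whole.

I tried to do this with a recursive CTE, but this (rightfully) blows up on the circular references. I figured I would fix that with a simple WHERE clause that only recursively adds records which are not yet part of the CTE, but this is not allowed: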

    WITH coupledClaims AS (
        --Get all unique combinations
        SELECT cc.SubstractedFromClaim_ID AS Claim_ID,
               cc.AddedToClaim_ID AS Linked_Claim_ID
        FROM dbo.ClaimCoupling cc
        UNION
        SELECT cc.AddedToClaim_ID AS Claim_ID,
               cc.SubstractedFromClaim_ID AS Linked_Claim_ID
        FROM dbo.ClaimCoupling cc
    ),
    MyClaims AS
    (
        SELECT * FROM Claim WHERE OpenAmount <> 0
        UNION ALL
        SELECT c.*
        FROM coupledClaims
        JOIN MyClaims mc ON coupledClaims.claim_id = mc.ID
        JOIN claim c ON c.ID = coupledClaims.linked_Claim_ID
        WHERE c.ID NOT IN (SELECT ID FROM MyClaims)
    )
    SELECT * FROM MyClaims

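After banging my head on this for far too long, I decided to do it with an actual loop, checking @@ROWCOUNT and simply adding the rows to a table variable by hand. But while writing that solution (which I am sure I can get to work) I figured I would ask here first, because I don't like writing loops in T-SQL: they always feel ugly and inefficient to me.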
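
To make the loop idea concrete, here is a minimal sketch of a @@ROWCOUNT-driven expansion. It runs on hypothetical toy data rather than the real schema: the table variables `@link` and `@found`, their columns and the variable `@added` are all made up for illustration.

```sql
-- Hypothetical toy data: undirected links between claim IDs,
-- including a cycle (1-2-3-1) and an unrelated pair (4-5).
DECLARE @link TABLE (a INT NOT NULL, b INT NOT NULL);
INSERT INTO @link (a, b) VALUES (1, 2), (2, 3), (3, 1), (4, 5);

-- Result set, seeded with the starting claim (the "anchor").
DECLARE @found TABLE (id INT NOT NULL PRIMARY KEY);
INSERT INTO @found (id) VALUES (1);

-- Number of rows added by the most recent pass; 1 for the seed row.
DECLARE @added INT = 1;

WHILE @added > 0
BEGIN
    INSERT INTO @found (id)
    SELECT n.id
    FROM (
        -- Follow every link in both directions from the claims found so far.
        SELECT l.b AS id FROM @link l JOIN @found f ON f.id = l.a
        UNION
        SELECT l.a AS id FROM @link l JOIN @found f ON f.id = l.b
    ) AS n
    -- Skip claims that are already in the result, so cycles terminate.
    WHERE NOT EXISTS (SELECT 1 FROM @found f2 WHERE f2.id = n.id);

    SET @added = @@ROWCOUNT;  -- 0 once nothing new is reachable
END;

SELECT id FROM @found ORDER BY id;  -- claims 1, 2 and 3
```

The loop stops on the first pass that inserts nothing, so the cycle between claims 1, 2 and 3 is harmless, and claims 4 and 5 are never reached.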

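See the following SQL Fiddle for the data model and some test data (I commented out the recursive part, as otherwise I was not allowed to create a link):

http://sqlfiddle.com/#!6/129ad5/7/0

I am hoping someone here has a great way of handling this problem (quite likely I am doing something wrong with the recursive CTE). For completeness: this runs on MS SQL 2016.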

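1 answer:

Answer 0 (score: 0)

So here is what I have learned and done so far. Thanks to the comment by habo, which refers to the following question: Infinite loop in CTE when parsing self-referencing table

First I decided to at least 'solve' my problem and wrote some manual recursion. This solves my problem but is not as 'pretty' as the CTE solution, which I was hoping would be easier to read and would also outperform the manual recursion.

Manual recursion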

    /****************************/
    /* CLAIMS AND PAYMENT LOGIC */
    /****************************/
    DECLARE @rows AS INT = 0
    DECLARE @relevantClaimIds AS TABLE (
        Debtor_ID INT,
        Claim_ID INT
    )
    SET NOCOUNT ON

    --Get anchor condition
    INSERT INTO @relevantClaimIds (Debtor_ID, Claim_ID)
    SELECT Debtor_ID, ID
    FROM Claim c
    WHERE OpenAmount <> 0

    --Do recursion
    WHILE @rows <> (SELECT COUNT(*) FROM @relevantClaimIds)
    BEGIN
        SET @rows = (SELECT COUNT(*) FROM @relevantClaimIds)

        --Subtracted
        INSERT @relevantClaimIds (Debtor_ID, Claim_ID)
        SELECT DISTINCT c.Debtor_ID, c.id
        FROM claim c
        INNER JOIN claimcoupling cc ON cc.SubstractedFromClaim_ID = c.ID
        JOIN @relevantClaimIds rci ON rci.Claim_ID = cc.AddedToClaim_ID
        --might be multiple paths to this recursion so eliminate duplicates
        LEFT JOIN @relevantClaimIds dup ON dup.Claim_ID = c.id
        WHERE dup.Claim_ID IS NULL

        --Added
        INSERT @relevantClaimIds (Debtor_ID, Claim_ID)
        SELECT DISTINCT c.Debtor_ID, c.id
        FROM claim c
        INNER JOIN claimcoupling cc ON cc.AddedToClaim_ID = c.ID
        JOIN @relevantClaimIds rci ON rci.Claim_ID = cc.SubstractedFromClaim_ID
        --might be multiple paths to this recursion so eliminate duplicates
        LEFT JOIN @relevantClaimIds dup ON dup.Claim_ID = c.id
        WHERE dup.Claim_ID IS NULL

        --Payments
        INSERT @relevantClaimIds (Debtor_ID, Claim_ID)
        SELECT DISTINCT c.Debtor_ID, c.id
        FROM @relevantClaimIds f
        JOIN ClaimEntryReference cer ON f.Claim_ID = cer.Claim_ID
        JOIN ClaimEntryReference cer_linked ON cer.ClaimEntry_ID = cer_linked.ClaimEntry_ID AND cer.ID <> cer_linked.ID
        JOIN Claim c ON c.ID = cer_linked.Claim_ID
        --might be multiple paths to this recursion so eliminate duplicates
        LEFT JOIN @relevantClaimIds dup ON dup.Claim_ID = c.id
        WHERE dup.Claim_ID IS NULL
    END

Then, after I received and read the comment, I decided to try the CTE solution, which looks like this:

CTE recursion

    WITH Tree AS
    (
        SELECT Debtor_ID, ID AS Claim_ID, CAST(ID AS VARCHAR(MAX)) AS levels
        FROM Claim c
        WHERE OpenAmount <> 0

        UNION ALL
        SELECT c.Debtor_ID, c.id, t.levels + ',' + CAST(c.ID AS VARCHAR(MAX)) AS levels
        FROM claim c
        INNER JOIN claimcoupling cc ON cc.SubstractedFromClaim_ID = c.ID
        JOIN Tree t ON t.Claim_ID = cc.AddedToClaim_ID
        WHERE (',' + t.levels + ',' NOT LIKE '%,' + CAST(c.ID AS VARCHAR(MAX)) + ',%')

        UNION ALL
        SELECT c.Debtor_ID, c.id, t.levels + ',' + CAST(c.ID AS VARCHAR(MAX)) AS levels
        FROM claim c
        INNER JOIN claimcoupling cc ON cc.AddedToClaim_ID = c.ID
        JOIN Tree t ON t.Claim_ID = cc.SubstractedFromClaim_ID
        WHERE (',' + t.levels + ',' NOT LIKE '%,' + CAST(c.ID AS VARCHAR(MAX)) + ',%')

        UNION ALL
        SELECT c.Debtor_ID, c.id, t.levels + ',' + CAST(c.ID AS VARCHAR(MAX)) AS levels
        FROM Tree t
        JOIN ClaimEntryReference cer ON t.Claim_ID = cer.Claim_ID
        JOIN ClaimEntryReference cer_linked ON cer.ClaimEntry_ID = cer_linked.ClaimEntry_ID AND cer.ID <> cer_linked.ID
        JOIN Claim c ON c.ID = cer_linked.Claim_ID
        WHERE (',' + t.levels + ',' NOT LIKE '%,' + CAST(c.ID AS VARCHAR(MAX)) + ',%')
    )
    SELECT DISTINCT Tree.Debtor_ID, Tree.Claim_ID
    FROM Tree
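
The cycle guard is easier to see in isolation. The sketch below applies the same comma-delimited path check to hypothetical toy data: the table variable `@link`, its columns and the CTE name `Walk` are made up for illustration and are not part of the real schema.

```sql
-- Hypothetical toy data: directed links that form a cycle (1 -> 2 -> 3 -> 1).
DECLARE @link TABLE (from_id INT NOT NULL, to_id INT NOT NULL);
INSERT INTO @link (from_id, to_id) VALUES (1, 2), (2, 3), (3, 1);

WITH Walk AS (
    -- Anchor: start at claim 1. The path is wrapped in commas so that
    -- every ID in it is delimited on both sides.
    SELECT CAST(1 AS INT) AS id, CAST(',1,' AS VARCHAR(MAX)) AS path

    UNION ALL

    -- Recursive step: follow a link only if its target is not yet on this path.
    SELECT l.to_id, w.path + CAST(l.to_id AS VARCHAR(MAX)) + ','
    FROM Walk w
    JOIN @link l ON l.from_id = w.id
    WHERE w.path NOT LIKE '%,' + CAST(l.to_id AS VARCHAR(MAX)) + ',%'
)
SELECT id, path FROM Walk ORDER BY LEN(path);
-- Rows: (1, ',1,'), (2, ',1,2,'), (3, ',1,2,3,'); the step back to 1 is rejected.
```

Note that the guard only looks at the path of the current row. A claim that can be reached along several different paths is therefore returned once per path, which is why the query above needs the final DISTINCT.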

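This solution is indeed a lot 'shorter' and easier on the eyes, but does it actually perform better?

Performance differences

Manual: CPU 16, reads 1793, duration 13

CTE: CPU 47, reads 4001, duration 48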
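
One way to collect comparable CPU, read and duration figures for the two variants is sketched below. It assumes the session-level SET STATISTICS options are used, which may not be how the numbers above were captured. The SELECT is only a placeholder for the query under test.

```sql
SET STATISTICS IO ON;    -- report logical and physical reads per table
SET STATISTICS TIME ON;  -- report CPU time and elapsed time in milliseconds

-- Placeholder: run the manual recursion batch or the CTE query here instead.
SELECT COUNT(*) FROM dbo.Claim WHERE OpenAmount <> 0;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The figures appear in the Messages output of the session. Running each variant a few times helps separate cold-cache reads from steady-state cost.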

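Conclusion

I am not sure whether it is due to the varchar cast that the CTE solution requires, or because it has to do one extra iteration before completing its recursion, but the CTE actually requires more resources on all fronts than the manual recursion.

In the end it is possible with a CTE, but looks aren't everything (thank god ;-)). Performance-wise, sticking with the manual recursion seems like the better route.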