Breaking up multiple left joins into multiple steps in Proc SQL

Time: 2017-07-07 13:55:24

Tags: sas proc-sql

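I have a query that uses a large number of left joins across many tables. When I run it, it takes more than an hour and then fails with a sort execution failure. So I am thinking of breaking the left joins up into multiple steps, but I don't know how to do that and need your help.

The code is as follows: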

Proc sql;
create table newlib.Final_test as 
SELECT 
POpener.Name as Client,
Popener.PartyId as Account_Number,
Case
  When BalLoc.ConvertedRefNo NE '' then BalLoc.ConvertedRefNo
else BalLoc.Ourreferencenum
End as LC_Number,
BalLoc.OurReferenceNum ,
BalLoc.CnvLiabilityCode as Liability_Code,
POfficer.PartyID as Officer_Num,
POfficer.Name as Officer_Name,
POpener.ExpenseCode,
BalLoc.IssueDate as Issue_Date format=mmddyy10.,
BalLoc.ExpirationDate AS Expiry format=mmddyy10.,
BalLoc.LiabilityAmountBase as Total_LC_Balance,
Case
  When BalLoc.Syndicated = 0 Then BalLoc.LiabilityAmountBase
    else 0
End as SunTrust_Non_Syndicated_Exposure,
Case
  When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey NE 0
    Then BalLoc.LiabilityAmountBase
  else 0
End as SunTrust_Syndicated_Exposure,
Case
  When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey NE 0
    Then (BalLoc.LiabilityAmountBase - (BalLoc.LiabilityAmountBase * (PParty.ParticipationPercent/100)))
  Else BalLoc.LiabilityAmountBase
End as SunTrust_Exposure,
Case
  When BalLoc.Syndicated = 1 and BalLoc.PartOutGroupPkey <> 0
    Then (BalLoc.LiabilityAmountBase * PParty.ParticipationPercent/100)
  Else 0
End as Exposure_Held_By_Other_Banks,
PBene.Name as Beneficiary_Trustee,
cat(put(input(POpener.ObligorNumber,best10.),z10.),
    put(input(BalLoc.CommitmentNumber,best10.),z10.)) as Key,
case
  when BalLoc.BeneCusip2 NE ' ' then catx('|',BalLoc.BeneCusip,BalLoc.BeneCusip2)
  else BalLoc.BeneCusip
End as Cusip,
Case
  when BalLoc.OKtoExpire = 1 then '0'
  when BalLoc.OKtoExpire = 0 and BalLoc.AutoExtTermDays NE 0 then put(BalLoc.AutoExtTermDays,z3.)
  when BalLoc.OKtoExpire = 0 and BalLoc.AutoExtTermsMonth NE 0 then put(BalLoc.AutoExtTermsMonth,z3.)
  else '000'
End as Evergreen,
Case
  when blf.AnnualRate NE 0 then put(blf.AnnualRate,z7.)
  when blf.Amount NE 0 then cats('F',put(blf.Amount,z7.))
  else 'WAIVE'
End as Pricing

FROM BalLocPrimary BalLoc
Left JOIN Party POpener on POpener.Pkey = BalLoc.OpenerPkey
Left join PartGroup PGroup on BallOC.PartOutGroupPkey = PGroup.pKey
Left join PartParties PParty ON PGroup.pKey = PParty.PartGroupPkey and   
PParty.ParticipationPercent > 0 and
PParty.combined in
(select PPartParties.All_combined  
from PPartParties /*group by PartGroupPkey, PartyPkey*/)

Left Join MemExpenseCodes ExpCodes on POpener.ExpenseCode = ExpCodes.Code
Left JOIN Party PBene on PBene.Pkey = BalLoc.BenePkey
Left join Party POfficer on POfficer.Pkey = BalLoc.AccountOfficerPkey 
left join maxfee on maxfee.LocPrimaryPkey = BalLoc.LocPrimaryPkey
left join BalLocFee BLF on BLF.Pkey = maxfee.pkey
Where BalLoc.LetterType not in ('STBA','EXPA', 'FEE',' ') and 
 BalLoc.LiabilityAmountBase > 0 and BalLoc.irdb = 1
;
quit;

Thanks,

Shankar

1 Answer:

Answer 0 (score: 0)

A few things I would suggest:

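1. For every dataset you reference, keep only the variables you need for the join or that you use in the SELECT statement. For example, from your Party dataset it looks like you need only the Pkey and Name fields. So when you join that dataset, you should use: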

Left JOIN Party(keep=Pkey Name) PBene on PBene.Pkey = BalLoc.BenePkey
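
The POpener alias reads more columns from Party than PBene does, so its keep= list has to be longer. A minimal sketch of both joins follows, with the column lists taken from the SELECT in the question. The output table work.keep_demo is a made-up name used only for illustration.

```sas
/* Sketch: keep= trims each copy of Party to the columns that alias uses. */
/* work.keep_demo is a hypothetical output table.                          */
proc sql;
  create table work.keep_demo as
  select POpener.Name    as Client,
         POpener.PartyId as Account_Number,
         POpener.ExpenseCode,
         PBene.Name      as Beneficiary_Trustee
  from BalLocPrimary(keep=OpenerPkey BenePkey) BalLoc
  /* ObligorNumber is kept because the full query uses it to build Key. */
  left join Party(keep=Pkey Name PartyId ExpenseCode ObligorNumber) POpener
    on POpener.Pkey = BalLoc.OpenerPkey
  left join Party(keep=Pkey Name) PBene
    on PBene.Pkey = BalLoc.BenePkey;
quit;
```

A column left out of a keep= list cannot be referenced anywhere in the query, so the list has to cover the join key as well as every selected column.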

2. Push the WHERE clause into the FROM clause as a dataset option, like so:

FROM BalLocPrimary(where=(LetterType not in ('STBA','EXPA', 'FEE',' ') and 
 LiabilityAmountBase > 0 and irdb = 1)) BalLoc

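Also make sure the conditions are ordered from most common to least common (unless there are indexes on any of these three fields).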

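3. You are starting from the BalLocPrimary dataset and left joining every other dataset to it. Is that really what you want? Is it acceptable for the result set to return rows with no Client or Account_Number? Left joins can be computationally expensive, and the more of them you can eliminate, the better.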
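
One way to check whether a particular left join is needed is to count the rows that find no match. The sketch below does this for the opener join only; rows_without_opener is a made-up column name. If the count is zero and Pkey is unique in Party, an inner join on that key should return the same rows.

```sas
/* Sketch: count BalLocPrimary rows that have no matching opener in Party. */
proc sql;
  select count(*) as rows_without_opener
  from BalLocPrimary(keep=OpenerPkey) BalLoc
  left join Party(keep=Pkey) POpener
    on POpener.Pkey = BalLoc.OpenerPkey
  where POpener.Pkey is missing;
quit;
```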

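4. Joe asked about indexes on the join fields. You should probably have some. I find myself referring to this SUGI paper regularly enough that I have bookmarked it. Likewise, you can look at the EXPLAIN PLAN for the query to see where the bottleneck may be. Another SUGI paper would be a good place to start.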
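
In PROC SQL, the closest thing to an explain plan is the _METHOD option, which writes the chosen join methods to the log. The following is a rough sketch, assuming Party and maxfee are physical SAS data sets (not views) that do not already carry these indexes. The table work.method_demo is a hypothetical name.

```sas
/* Assumption: Party and maxfee are physical SAS data sets without these indexes. */
proc sql;
  /* A simple (single-column) index must have the same name as its column. */
  create index Pkey on Party(Pkey);
  create index LocPrimaryPkey on maxfee(LocPrimaryPkey);
quit;

/* MSGLEVEL=I adds notes to the log about which indexes were used. */
options msglevel=i;

/* _METHOD prints the join methods the optimizer chose, for example */
/* sqxjhsh (hash join), sqxjm (merge join), sqxjndx (index join).   */
proc sql _method;
  create table work.method_demo as   /* hypothetical output table */
  select BalLoc.LocPrimaryPkey,
         POpener.Name as Client
  from BalLocPrimary(keep=LocPrimaryPkey OpenerPkey) BalLoc
  left join Party(keep=Pkey Name) POpener
    on POpener.Pkey = BalLoc.OpenerPkey;
quit;
```

Running this on one or two joins at a time makes it easier to see which join triggers the expensive sort.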

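5. You are right that this could (should?) be split into multiple steps. That is good intuition. However, the best place to split will depend on the underlying data, the indexes, and the join paths, so it is hard to prescribe from the other side of the screen. I think the second paper I linked can give you some good advice on optimizing your specific case.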
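
As a starting point only, one possible split is sketched below: filter the driving table first, attach the Party lookups next, then attach the fee tables. The intermediate tables work.base, work.step2 and work.step3 and the Fee_ column aliases are made-up names, and the sketch assumes BalLocPrimary has no columns that clash with the new aliases. Whether this grouping is the best one depends on your data.

```sas
/* Step 1: filter the driving table once so every later join sees fewer rows. */
data work.base;   /* hypothetical intermediate table */
  set BalLocPrimary(where=(LetterType not in ('STBA','EXPA','FEE',' ')
                           and LiabilityAmountBase > 0
                           and irdb = 1));
run;

/* Step 2: attach the three Party lookups. */
proc sql;
  create table work.step2 as
  select b.*,
         POpener.Name     as Client,
         POpener.PartyId  as Account_Number,
         POpener.ExpenseCode,
         POpener.ObligorNumber,
         PBene.Name       as Beneficiary_Trustee,
         POfficer.PartyId as Officer_Num,
         POfficer.Name    as Officer_Name
  from work.base b
  left join Party(keep=Pkey Name PartyId ExpenseCode ObligorNumber) POpener
    on POpener.Pkey = b.OpenerPkey
  left join Party(keep=Pkey Name) PBene
    on PBene.Pkey = b.BenePkey
  left join Party(keep=Pkey Name PartyId) POfficer
    on POfficer.Pkey = b.AccountOfficerPkey;
quit;

/* Step 3: attach the fee information. */
proc sql;
  create table work.step3 as
  select s.*,
         blf.AnnualRate as Fee_AnnualRate,
         blf.Amount     as Fee_Amount
  from work.step2 s
  left join maxfee(keep=Pkey LocPrimaryPkey)
    on maxfee.LocPrimaryPkey = s.LocPrimaryPkey
  left join BalLocFee(keep=Pkey AnnualRate Amount) blf
    on blf.Pkey = maxfee.Pkey;
quit;
```

The participation join (PartGroup and PartParties) and the CASE expressions from the original SELECT would follow as further steps on work.step3. Checking the row count after each step shows which join multiplies rows or runs slowly.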
