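How do I create an SSIS package to import a raw file?

Asked: 2011-11-07 16:14:41

Tags: tsql ssis

I am new to SSIS and am using SSIS 2008. I have found that many of its tools perform the same function as certain SQL operators. When should I use an SSIS tool instead of a T-SQL operator? Also, are there any recommendations for a more efficient solution?

Below is the T-SQL query I supply to the SSIS Import/Export Wizard. My current solution therefore uses no SSIS tools other than one data flow source and one data flow destination.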

SELECT
   en.uniqueid_c AS enrollment_id,
   CONVERT(nvarchar (20),c.clientcode_c) AS client_id, --Legacy CDT# (note this has to be the same value as client_id on the other tables)
   CONVERT(nvarchar (20),CASE --Program codes included here for enrollment data (Excludes enrollments with modifiers)
                         WHEN en.agency_c = 'ADO' THEN 'ADO'
                         WHEN en.agency_c = 'ADOT' THEN 'ADO'
                         WHEN en.agency_c in ('MRDD/IHS','MRDD/PSH','MRDD/REP','MRDD/RTC') THEN 'CHOICES'
                         WHEN en.agency_c = 'FPP' THEN 'CM' 
                         WHEN en.agency_c = 'CMGT' THEN 'CM'
                         WHEN en.agency_c = 'EDU' THEN 'EDU'
                         WHEN en.agency_c = 'COM' THEN 'GH'
                         WHEN en.agency_c = 'CSUP' THEN 'INT'
                         WHEN en.agency_c = 'IHS' THEN 'INT'
                         WHEN en.agency_c = 'IHST' THEN 'INT'
                         WHEN en.agency_c = 'MST' THEN 'MST'
                         WHEN en.agency_c = 'OMHS' THEN 'OMHS'
                         WHEN en.agency_c = 'ORTC' THEN 'RESA'
                         WHEN en.agency_c = 'MRTC' THEN 'RESA' 
                         WHEN en.agency_c = 'RTC' THEN 'RESA'
                         WHEN en.agency_c = 'RFC' THEN 'RFC'
                         WHEN en.agency_c in ('SCCR','SCMN','SCRP') THEN 'SCS'
                         WHEN en.agency_c = 'SUB' THEN  'SUB' --uncertain about this one KMH - 06/23/10
                         WHEN en.agency_c = 'STFC' THEN 'TFC'
                         WHEN en.agency_c = 'MTFC' THEN 'TFC' 
                         WHEN en.agency_c = 'TFC' THEN 'TFC'
                         WHEN en.agency_c = 'TL' THEN 'TL'
                         WHEN en.agency_c = 'TLT' THEN 'TL'
                         ELSE en.agency_c
                         END) AS program_code,
-------------------------------------------------------------------------------------------------------------------------------------------                      
   --2nd Program_code entry handles program_modifier_code.
   --The codes need to be grouped and cased out to match the Evolv codes. --This was fixed.
   -- NOTE!!!! The codes below will need to be replaced with the finance modifiers just for TN. --per deneen and diane.
   UPPER(CONVERT(nvarchar(20), CASE --PROGRAM MODIFIERS --This is pulled into the program_code in the 2nd run. These should exclude non-modifiers.
                         WHEN en.agency_c in ('ADO','ADOT','IHST','TLT') THEN 'TRANS'
                         WHEN en.agency_c in ('CHOM','CGMT','COM','CSUP','EDU','IHS','LHS','MRTC','MST','MTFC','RTC', --use this to exclude recs
                                              'SCCR','SCRP','SUB','TFC','TL','ZADMIN','ZAWOL','ZDET','ZHOSP') THEN en.agency_c
                         WHEN en.agency_c = 'FPP' THEN 'FPP' 
                         WHEN en.agency_c = 'RFC' THEN 'RFC'
                         WHEN en.agency_c = 'TFC' THEN 'TFC'                    
                         WHEN en.agency_c = 'MRDD/IHS' THEN 'MRIHS'
                         WHEN en.agency_c = 'MRDD/PSH' THEN 'MRPSH'
                         WHEN en.agency_c = 'MRDD/REP' THEN 'MRREP'
                         WHEN en.agency_c = 'MRDD/RTC' THEN 'MRRTC'
                         WHEN en.agency_c = 'PD40' THEN 'PD40'
                         WHEN en.agency_c = 'PDET' THEN 'PDET'
                         WHEN en.agency_c = 'PINT' THEN 'PINT'
                         WHEN en.agency_c = 'PLV4' THEN 'PLV4'
                         WHEN en.agency_c = 'PWIL' THEN 'PWIL'
                         WHEN en.agency_c = 'PYDC' THEN 'PYDC'
                         WHEN en.agency_c = 'SCMN' THEN 'SCMN'
                         WHEN en.agency_c = 'STFC' THEN 'STFC'
                         ELSE en.agency_c                      
                         END)) AS program_modifier_code,
 -------------------------------------------------------------------------------------------------------------------------------------------  
 /*
 Group Homes and Inner Harbour locations were added on 7/26/10 - KMH
 */                   
   CONVERT(nvarchar(20),
   CASE
       WHEN en.location_c = 'ANNI' THEN 'AL-ANNI' 
       WHEN en.location_c = 'ASHE' THEN 'NC-ASHE'
       WHEN en.location_c = 'ATL' THEN 'GA-ATL'
       WHEN en.location_c = 'RMBT' THEN 'RTC-TN-BC'       
       WHEN en.location_c = 'BIL' THEN 'MS-BIL'
       WHEN en.location_c = 'BIRM' THEN 'AL-BIRM'
       WHEN en.location_c = 'BOST' THEN 'MA-BOST'
       WHEN en.location_c = 'CIRT' THEN 'RTC-TN-CIRT'
       WHEN en.location_c = 'CHAR' THEN 'NC-CHAR'
       WHEN en.location_c = 'CHAT' THEN 'TN-CHAT'
       WHEN en.location_c = 'CLAR' THEN 'TN-CHAR'
       WHEN en.location_c = 'COL' THEN 'TN-COL'
       WHEN en.location_c = 'CMS' THEN 'MS-COL'
       WHEN en.location_c = 'CCRD' THEN 'NC-CCRD'
       WHEN en.location_c = 'COOK' THEN 'TN-COOK'
       WHEN en.location_c = 'DAL' THEN 'TX-DAL'
       WHEN en.location_c = 'RDV' THEN 'RTC-TN-DV'
       WHEN en.location_c = 'DKSN' THEN 'TN-DKSN'
       WHEN en.location_c = 'RDW' THEN 'RTC-TN-DW'
       WHEN en.location_c = 'DOTH' THEN 'AL-DOTH'
       WHEN en.location_c = 'DUR' THEN 'NC-DURH'
       WHEN en.location_c = 'DYER' THEN 'TN-DYER'
       WHEN en.location_c = 'FAYE' THEN 'NC-FAYE'
       WHEN en.location_c = 'GCRT' THEN 'RTC-TN-GCRT'
       WHEN en.location_c = 'GRNB' THEN 'NC-GRNB'
       WHEN en.location_c = 'GRNV' THEN 'NC-GRNV'
       WHEN en.location_c = 'HMS' THEN 'MS-HMS'
       WHEN en.location_c = 'DMS' THEN 'MD-DMS'
       WHEN en.location_c = 'HICK' THEN 'NC-HICK'
       WHEN en.location_c = 'HILL' THEN 'NC-HILL'
       WHEN en.location_c = 'HUNT' THEN 'AL-HUNT'
       WHEN en.location_c = 'INNH' THEN 'RTC-GA-INNH'
       WHEN en.location_c = 'JMS' THEN 'MS-JMS'
       WHEN en.location_c = 'JTN' THEN 'TN-JTN'
       WHEN en.location_c = 'JCTN' THEN 'TN-JCTN'
       WHEN en.location_c = 'KNOX' THEN 'TN-KNOX'
       WHEN en.location_c = 'LAKE' THEN 'FL-LAKE'
       WHEN en.location_c = 'LAWR' THEN 'MA-LAWR'
       WHEN en.location_c = 'MANC' THEN 'NH-MANC'
       WHEN en.location_c = 'MCB' THEN 'MS-MCC'
       WHEN en.location_c = 'MEM' THEN 'TN-MEM'
       WHEN en.location_c = 'MMS' THEN 'MS-MMS'
       WHEN en.location_c = 'MIAM' THEN 'FL-MIAM'
       WHEN en.location_c = 'MIDM' THEN 'TN-MIDM'
       WHEN en.location_c = 'MOBI' THEN 'AL-MOBI'
       WHEN en.location_c = 'MONT' THEN 'AL-MONT'
       WHEN en.location_c = 'MRSN' THEN 'TN-MRSN'
       WHEN en.location_c = 'NASH' THEN 'TN-NASH'
       WHEN en.location_c = 'OCAL' THEN 'FL-OCAL'
       WHEN en.location_c = 'PAR' THEN 'TN-PAR'
       WHEN en.location_c = 'PINE' THEN 'NC-PINE'
       WHEN en.location_c = 'ROAN' THEN 'VA-ROAN'
       WHEN en.location_c = 'SPRG' THEN 'MA-SPRI/HOLY'
       WHEN en.location_c = 'PETE' THEN 'FL-STPET'
       WHEN en.location_c = 'TAMP' THEN 'FL-TAMP'
       WHEN en.location_c = 'TUP' THEN 'MS-TUP'
       WHEN en.location_c = 'WDC' THEN 'DC-WDC'
       WHEN en.location_c = 'WILM' THEN 'NC-WILM'
       WHEN en.location_c = 'WBRN' THEN 'MA-WBRN'
       WHEN en.location_c = 'WORC' THEN 'MA-WORC'
       WHEN en.location_c = 'GM' THEN 'TN-GM'
       WHEN en.location_c = 'GN' THEN 'TN-GN'
   ELSE en.location_c
   END)
       as service_facility_code,
   en.startdate_d AS start_date,
   en.enddate_d AS end_date,
   c.refdate_d AS referral_date,
   ep.enddate_d AS overall_discharge_date, --Episode end date
   CONVERT(nvarchar(20),c.altclientcode_vc) AS org_id,-- TNKIDS#
   UPPER(CONVERT(nvarchar(50), CASE
                         WHEN en.enddate_d = ep.enddate_d THEN ep.accountnumber_vc
                         WHEN en.enddate_d < ep.enddate_d THEN 'TWA'
                         END)) AS discharged_to_type,
   UPPER(CONVERT (nvarchar(20), CASE
                         WHEN ep.accountnumber_vc in ('DORM','INDEP/SUP','INDEP/SELF','INDEP/NR','INDEP/FR') THEN 07
                         WHEN ep.accountnumber_vc in ('JAIL','DET') THEN 01
                         WHEN ep.accountnumber_vc in ('BIOL') THEN 02
                         WHEN ep.accountnumber_vc in ('ADOPT/DCS','ADOPT/PAR','ADOPT/YV') THEN 06
                         WHEN ep.accountnumber_vc in ('REL') THEN 03
                         WHEN ep.accountnumber_vc in ('PSYCH','EMER','RTC') THEN 04
                         ELSE 99
                         END)) AS discharged_to_type_code,
   CONVERT(nvarchar(300),'cd.enrollments') AS original_table_name,
   CONVERT(nvarchar (400), en.alerts_vc) AS remarks,
   CONVERT(varchar(50),  CASE 
                         WHEN en.disreason_c = 'ADMI' THEN 'Administrative'
                         WHEN en.disreason_c = 'AMA' THEN 'Against Medical Advice'
                         WHEN en.disreason_c = 'AWOL' THEN 'Absent Without Leave'
                         WHEN en.disreason_c = 'DCSD' THEN 'Deceased'
                         WHEN en.disreason_c = 'JC' THEN 'Juvenille Court'
                         WHEN en.disreason_c = 'NP' THEN 'No Progress'
                         WHEN en.disreason_c = 'TMED' THEN 'Transfer to Medical Treatment Facility'
                         WHEN en.disreason_c = 'TPSY' THEN 'Transfer to Inpatient Psychiatric Facility'
                         WHEN en.disreason_c = 'TW' THEN 'Transfer within Agency'
                         WHEN en.disreason_c = 'WMA' THEN 'With Medical Advice'
                         ELSE 'Other'
                         END)AS outcome,
   CONVERT(varchar(5),   CASE
                         WHEN en.disreason_c in ('ADMI','AMA','AWOL','NP') THEN 'CBT'
                         WHEN en.disreason_c in ('DCSD','WMA') THEN 'DLR'
                         WHEN en.disreason_c in ('JC') THEN 'RSF'
                         WHEN en.disreason_c in ('TMED','TPSY') THEN 'DMR'
                         WHEN en.disreason_c in ('TW') THEN 'RPA'
                         ELSE 'CBT' 
                         END) AS outcome_code,
--Populate service_facility_unit table and add case statement for loading CDT program_c into client_enrollment room_number 7/27/10 KMH
   UPPER(CONVERT(varchar(10),  CASE
                         WHEN en.program_c = 'BT1L' THEN 'BC1L'
                         WHEN en.program_c = 'BT1R' THEN 'BC1R'
                         WHEN en.program_c = 'BT2L' THEN 'BC2L'
                         WHEN en.program_c = 'BT2R' THEN 'BC2R'
                         WHEN en.program_c = 'BT3' THEN 'BC3'
                         WHEN en.program_c = 'BT3L' THEN 'BC3L'
                         WHEN en.program_c = 'BT3R' THEN 'BC3R'
                         WHEN en.program_c = 'BT4L' THEN 'BC4L'
                         WHEN en.program_c = 'BT4R' THEN 'BC4R'
                         WHEN en.program_c = 'BT5' THEN 'BC5'
                         WHEN en.program_c = 'BT6' THEN 'BC6'
                         WHEN en.program_c = 'CRT1' and en.location_c = 'CIRT' THEN 'BCRT1'
                         WHEN en.program_c = 'CRT2' and en.location_c = 'CIRT' THEN 'BCRT2'
                         WHEN en.program_c = 'CRT3' and en.location_c = 'CIRT' THEN 'BCRT3'
                         WHEN en.program_c = 'CRT4' and en.location_c = 'CIRT' THEN 'BCRT4'
                         WHEN en.program_c = 'CRT1' and en.location_c = 'GCRT' THEN 'GCRT1'
                         WHEN en.program_c = 'CRT2' and en.location_c = 'GCRT' THEN 'GCRT2'
                         WHEN en.program_c = 'CRT3' and en.location_c = 'GCRT' THEN 'GCRT3'
                         WHEN en.program_c = 'CRT4' and en.location_c = 'GCRT' THEN 'GCRT4'
                         WHEN en.program_c = 'DVC' THEN 'DV1'
                         WHEN en.program_c = 'DVM' THEN 'DV2'
                         WHEN en.program_c = 'DVN' THEN 'DV3'
                         WHEN en.program_c = 'DVP' THEN 'DV4'
                         WHEN en.program_c in ('DW1','DW2','DW3','DW4','DW5','DW6','DW7','DW8') and en.location_c = 'RDW' THEN en.program_c
                         WHEN en.program_c in ('IH01','IH02','IH03','IH04','IH05','IH06','IH07') and en.location_c = 'INNH' THEN 'IH3'
                         WHEN en.program_c = 'IH08' and en.location_c = 'INNH' THEN 'IH1'
                         WHEN en.program_c = 'IH09' and en.location_c = 'INNH' THEN 'IH2'
                         ELSE 'NA'                       
                         END)) as room_number

FROM
   ar.client c 
   INNER JOIN cd.enrollments en ON (c.uniqueid_c = en.clientid_c)
   INNER JOIN cd.episode ep ON (ep.uniqueid_c = en.episodeid_c and ep.clientid_c = c.uniqueid_c)

WHERE
   (ep.enddate_d is NULL OR ep.enddate_d >= getdate()-730) and
   en.location_c in (select code from dbo.yv_LKUP_OfficeLocation where state in ('TX', 'FL'))
order by 2

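1 answer:

Answer 0 (score: 2)

Generic SSIS advice while waiting for clarification of the requirements.

When should I use SSIS tools versus T-SQL operators?

People are often tempted to use the out-of-the-box transformations because it seems like the right thing to do: pick a table from the drop-down list, add a sort, add another data source, sort that too, merge join, maybe an aggregate.

When the problem domain is small, say tens to a few hundred thousand rows, the difference in processing is negligible. If a package runs in 2 minutes instead of 1, or consumes 80% of the server's memory during processing instead of 40%, people probably won't notice.

When data volumes reach a tipping point, poor package design decisions will eat your lunch.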

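Sorting

When your source RDBMS gets a request for sorted data, there may be a clustered index or something similar in the database that saves the time of actually sorting the data. When SSIS gets a request to sort data, you pay for that operation several times over.

A sort in SSIS is a fully blocking, asynchronous operation. That means all the data flowing through that point must arrive at the transformation and be operated on before any of it can be sent downstream. With a large number of rows or a very slow source, you will really notice it when the flow hits one of these operations. Maybe you say, "I can wait, because I really need the data sorted," but time is not the only resource you are spending. Because an asynchronous transformation has to copy data from one buffer to another, your memory requirement also doubles.

Maybe you still accept the time and memory cost of using the out-of-the-box components, but you may not be done paying. Say your server has 32 GB of memory and SSIS can use all of it. Each row costs a kilobyte, and your data flow carries 16 million rows. The data hits the sort and starts piling up. By the time the last row arrives, you have consumed 16 GB of memory for the original data. The sort begins, copies that 16 GB into another 16 GB of memory and, oops, SSIS is out of memory. You are now paying a third price: temporary file storage. When the execution engine is under memory pressure, it eventually starts paging to disk.

Once that happens the game is over if you care about performance, but your pain may not be. If you have not set the BlobTempStoragePath value on each data flow, the file is written to the default temporary storage location, which is probably C:\something or other. Your system administrator carved out a very lean C: partition because only the OS lives there, so a 16 GB swap file suddenly written to that drive eats all the free space, the OS becomes unhappy, the package fails, and the finger-pointing begins. Not that I've ever been there.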

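Moral of the story

Do everything you can in the source system whenever possible. The scenario above is about sorts, but the lesson applies to all "blocking" operators.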
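
As a rough sketch of what "do it in the source system" can look like, the query below lets the database engine perform both the join and the sort, so the data flow needs neither a Sort nor a Merge Join transformation. The table and column names are taken from the question; the column list is shortened, and ordering by client code is only an illustrative choice.

```sql
-- Sketch only: join and sort in the source database, not in the SSIS data flow.
-- Table and column names come from the question's query.
-- The ORDER BY column is an illustrative assumption.
SELECT
    en.uniqueid_c  AS enrollment_id,
    c.clientcode_c AS client_id,
    en.startdate_d AS start_date,
    en.enddate_d   AS end_date
FROM
    ar.client AS c
    INNER JOIN cd.enrollments AS en
        ON c.uniqueid_c = en.clientid_c
ORDER BY
    c.clientcode_c;
```

If a downstream component does need to know the rows arrive in order, the source output's IsSorted property and the SortKeyPosition of the sort columns can be set in the Advanced Editor. SSIS trusts those settings without verifying them, so they must match the ORDER BY clause.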

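Any recommendations on a more efficient solution?

As for how to clean up the query, those mappings would drive me crazy. Do you have the opportunity to create N lookup tables (or inline table-valued functions) to provide the mapping between stored and display values? You could then abstract all of that CASE logic away.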
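
A minimal sketch of the lookup-table idea, applied to the first CASE expression in the question. The table name dbo.yv_LKUP_ProgramCode, its column names, and the data types are assumptions made for illustration; only a handful of the mappings are shown.

```sql
-- Hypothetical lookup table: the name, columns and data types are assumptions,
-- not objects that exist in the question's database.
CREATE TABLE dbo.yv_LKUP_ProgramCode
(
    agency_code  nvarchar(20) NOT NULL PRIMARY KEY, -- value stored in cd.enrollments.agency_c
    program_code nvarchar(20) NOT NULL              -- value the target system expects
);

-- A few of the mappings from the question's first CASE expression.
-- The multi-row VALUES syntax requires SQL Server 2008 or later.
INSERT INTO dbo.yv_LKUP_ProgramCode (agency_code, program_code)
VALUES
    ('ADO',  'ADO'),
    ('ADOT', 'ADO'),
    ('FPP',  'CM'),
    ('CMGT', 'CM'),
    ('ORTC', 'RESA'),
    ('MRTC', 'RESA'),
    ('RTC',  'RESA');

-- The CASE expression becomes a join. LEFT JOIN plus COALESCE reproduces the
-- original "ELSE en.agency_c" fallback for codes with no row in the lookup table.
SELECT
    en.uniqueid_c AS enrollment_id,
    CONVERT(nvarchar(20), COALESCE(pc.program_code, en.agency_c)) AS program_code
FROM
    cd.enrollments AS en
    LEFT JOIN dbo.yv_LKUP_ProgramCode AS pc
        ON pc.agency_code = en.agency_c;
```

With this shape, a new or changed mapping is a row edit rather than a change to the package's query. The same pattern would repeat for the location, discharge-reason and room-number mappings, each with its own table.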

Finally, the numbers in this post are wildly dependent on hardware and workload.