Oracle scalar function in SELECT vs. table-valued function join

Asked: 2011-07-20 20:51:22

Tags: oracle plsql performance

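I have a query that performs poorly. One aspect of the query is a cross join on a table-valued function. Honestly, I'm mimicking the T-SQL habit of using CROSS APPLY on functions in order to avoid scalar function calls. Is this bad practice in Oracle?

The main problem I'm having is that the Oracle Tuning Advisor won't parse my query, so I haven't been able to look into index optimization yet. Normally I wouldn't post this much code, but I suspect it is my query, not the table optimization, that is causing the problem.

The statistics table is really the only table with any volume: over 4,000,000 records. Can anyone suggest changes that would remove obviously bad Oracle practice? Or, if everything looks fine, a good tool for index-tuning advice, given that Oracle Enterprise Manager won't parse this query to offer any recommendations?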

Additional performance information captured from a trace and formatted via TKPROF:

Parse:   count(1) | cpu(0.04) | elapsed(0.04)  | disk(0)     | query(852)   | current(0) | rows(0)
Execute: count(1) | cpu(0.00) | elapsed(0.00)  | disk(0)     | query(0)     | current(0) | rows(0)
Fetch:   count(1) | cpu(9.64) | elapsed(14.50) | disk(34578) | query(35610) | current(4) | rows(4)

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 1165

 Rows  Row Source Operation
    4  HASH JOIN OUTER (cr=38069 pr=34578 pw=0 time=19208475 us)
    2   COLLECTION ITERATOR PICKLER FETCH REPORT_INTERVAL_SEQUENCE_UDF (cr=97 pr=0 pw=0 time=13766 us)
    4   VIEW  (cr=37972 pr=34578 pw=0 time=19194353 us)
    4    HASH GROUP BY (cr=37972 pr=34578 pw=0 time=19194329 us)
60650     FILTER (cr=37972 pr=34578 pw=0 time=19673947 us)
60650      NESTED LOOPS (cr=37972 pr=34578 pw=0 time=19431329 us)
60650       HASH JOIN (cr=37941 pr=34578 pw=0 time=5294908 us)
    4        COLLECTION ITERATOR PICKLER FETCH REPORT_MACHINEINFO_GETT_UDF (cr=2331 pr=0 pw=0 time=212033 us)
60650        TABLE ACCESS FULL ELS_STATISTIC_ENTRY (cr=35610 pr=34578 pw=0 time=4416705 us)
60650       COLLECTION ITERATOR PICKLER FETCH REPORT_INTERVAL_GETT_UDF (cr=31 pr=0 pw=0 time=13372794 us)

SELECT
         TimeInterval,
         stats.During,
         stats.Name,
         stats.cnt
    FROM
        TABLE (GET_INTERVAL_SEQUENCE_UDF(
                                         TO_TIMESTAMP ('07/15/2011','mm/dd/yyyy')
                                        ,TO_TIMESTAMP ('07/20/2011','mm/dd/yyyy')
                                        ,2)) dtRange
    LEFT JOIN
    (
         SELECT
              i.During
              , mi.Name
              , SUM (CAST (VALUE_NUMERIC AS INT)) cnt

         FROM
              statistics se
         JOIN TABLE (Get_Context_Info_udf ()) mi 
              ON (se.Context_ID = mi.Context_ID)
         CROSS JOIN TABLE (Interval_GetT (se.EntryDate, 2)) i
         WHERE
              StatisticTypeID = HEXTORAW ('6CF933B091AE46FEA7F56BE96308190F') 
              AND EntryDate < TO_TIMESTAMP ('07/20/2011','mm/dd/yyyy') 
              AND EntryDate > TO_TIMESTAMP ('07/15/2011', 'mm/dd/yyyy')
         GROUP BY
             i.During
             , mi.Name
    ) stats ON dtRange.TimeInterval = stats.TimeInterval


The functions referenced in the query above are listed below.


CREATE OR REPLACE FUNCTION Interval_GetT(datestamp IN timestamp,  timeInterval IN int) 
RETURN TReportIntervalList AS vResult TReportIntervalList;
BEGIN
     SELECT TReportInterval(
                            CASE timeInterval 
                            WHEN 1 THEN TO_CHAR(datestamp, 'YYYY-MM-DD HH24') 
                            WHEN 2 THEN TO_CHAR(datestamp, 'YYYY-MM-DD')
                            WHEN 3 THEN TO_CHAR(datestamp, 'YYYY-WW')
                            END
                           ) 
     BULK COLLECT INTO vResult                                       
     FROM Dual WHERE ROWNUM = 1;

     RETURN vResult;
END;



CREATE OR REPLACE FUNCTION GET_INTERVAL_SEQUENCE_UDF(
      startTime IN timestamp,
      endTime IN timestamp,
      inputInterval IN int)
      RETURN t_interval_list_table   AS  intervalList t_interval_list_table := t_interval_list_table();
    BEGIN

    SELECT 
         CASE inputInterval
         WHEN 1 THEN (t_interval(REPORT_Interval_Get_udf((startTime + ((ROWNUM-1) * 1/24)), inputInterval))) --Hour
         WHEN 2 THEN (t_interval(REPORT_Interval_Get_udf((startTime + (ROWNUM-1)), inputInterval))) --Day
         WHEN 3 THEN (t_interval(REPORT_Interval_Get_udf((startTime + ((ROWNUM-1)*7)), inputInterval))) --Week
            END 
          BULK COLLECT INTO intervalList
          FROM dual CONNECT BY LEVEL <= (CASE inputInterval 
                                          WHEN 1 THEN CAST(CEIL(((TRUNC(endTime, 'HH') - TRUNC(startTime, 'HH')) * 24)) AS INT)
                                          WHEN 2 THEN CAST(TRUNC(endTime, 'DD') - TRUNC(startTime, 'DD') AS INT)
                                          WHEN 3 THEN CAST(CEIL(((TRUNC(endTime, 'DD') - TRUNC(startTime, 'DD')) )/7) AS INT)
                                       END);
      RETURN intervalList; 

    END GET_INTERVAL_SEQUENCE_UDF;


CREATE OR REPLACE FUNCTION      Get_Context_Info_udf
    RETURN TTRFRMENGMACHINEINFOLIST AS vResult TTRFRMENGMACHINEINFOLIST;
    BEGIN    
        SELECT TTrfrmEngMachineInfo(ch.Context_ID, mac.Name)
        BULK COLLECT INTO vResult   
        FROM
            a ch  
        INNER JOIN
            b cxm  ON ch.CONTX_MACHINE_ID = cxm.CONTX_MACHINE_ID   
        INNER JOIN
            c mac ON cxm.MACHINE_ID = mac.MACHINE_ID   
        INNER JOIN
            d ic  ON mac.MACHINE_ID = ic.MACHINE_ID  
        WHERE 
            ic.ONFIGURABLE_ENTITY_ID =  HEXTORAW(Format_Guid_udf('11111111-FAE9-47A1-91A9-60A53E9660FE'))
            AND mac.IS_DELETED = 'N'
            AND ic.IS_DELETED = 'N';

        RETURN vResult; 
     END;

2 answers:

Answer 0 (score: 1)

This all looks very strange to me :)

Firstly, a SELECT FROM DUAL is unusual in PL/SQL.

CREATE OR REPLACE FUNCTION Interval_GetT(datestamp IN timestamp,  timeInterval IN int) 
RETURN TReportIntervalList AS vResult TReportIntervalList;
BEGIN
     SELECT TReportInterval(
              CASE timeInterval 
                     WHEN 1 THEN TO_CHAR(datestamp, 'YYYY-MM-DD HH24') 
                     WHEN 2 THEN TO_CHAR(datestamp, 'YYYY-MM-DD')
                     WHEN 3 THEN TO_CHAR(datestamp, 'YYYY-WW')
              END) 
     BULK COLLECT INTO vResult                                       
     FROM Dual WHERE ROWNUM = 1;

     RETURN vResult;
END;

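would be more simply done as

CREATE OR REPLACE FUNCTION Interval_GetT(datestamp IN timestamp,  timeInterval IN int) 
RETURN TReportIntervalList AS
BEGIN
  -- Build the one-element collection directly; no SELECT ... FROM DUAL needed.
  IF timeInterval  = 1 THEN
       RETURN TReportIntervalList(TReportInterval(TO_CHAR(datestamp, 'YYYY-MM-DD HH24')));
  ELSIF timeInterval  = 2 THEN
       RETURN TReportIntervalList(TReportInterval(TO_CHAR(datestamp, 'YYYY-MM-DD')));
  ELSIF timeInterval  = 3 THEN
       RETURN TReportIntervalList(TReportInterval(TO_CHAR(datestamp, 'YYYY-WW')));
  ELSE
       -- Unknown interval code: return a NULL collection.
       RETURN NULL;
  END IF;
END;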

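I can't tell what TReportInterval does, so it is hard to know what the module is for. I would look at a PIPELINED PL/SQL function to replace GET_INTERVAL_SEQUENCE_UDF. The difficulty you will face is that the optimizer has no way of knowing how many rows a table function returns, so it will generally guess wrong.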
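As a rough illustration of the pipelined idea, here is a minimal sketch covering only the daily case (interval code 2). The function name get_interval_sequence_piped is made up for this example, and it assumes t_interval has a single string attribute and that t_interval_list_table is a collection of t_interval; neither type definition is shown in the question, so adjust to the real ones.

```sql
-- Sketch only: hypothetical pipelined replacement for the daily case.
-- Assumption: t_interval(<varchar2>) is a valid constructor call.
CREATE OR REPLACE FUNCTION get_interval_sequence_piped(
      startTime     IN TIMESTAMP,
      endTime       IN TIMESTAMP,
      inputInterval IN INT)
  RETURN t_interval_list_table PIPELINED
AS
  vStart DATE := TRUNC(CAST(startTime AS DATE));
  vEnd   DATE := TRUNC(CAST(endTime   AS DATE));
BEGIN
  IF inputInterval = 2 THEN
    -- One row per day from vStart up to, but not including, vEnd,
    -- matching the row count the original CONNECT BY produces.
    FOR i IN 0 .. (vEnd - vStart) - 1 LOOP
      PIPE ROW (t_interval(TO_CHAR(vStart + i, 'YYYY-MM-DD')));
    END LOOP;
  END IF;
  RETURN;  -- a pipelined function returns no value; rows go out via PIPE ROW
END;
/
```

Pipelining streams rows to the caller instead of building the whole collection first, but it does not by itself tell the optimizer how many rows to expect, so the cardinality guess mentioned above still applies.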

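Get_Context_Info_udf has a similar problem. There is no obvious indication of whether it will return 1 row or 10,000. And again, TTrfrmEngMachineInfo is completely opaque.

Bluntly, everything here seems designed to keep the optimizer from working out how best to handle the query.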

If the statistics table is the main table, I think you are filtering it on
 WHERE
      StatisticTypeID = HEXTORAW ('6CF933B091AE46FEA7F56BE96308190F') 
      AND EntryDate < TO_TIMESTAMP ('07/20/2011','mm/dd/yyyy') 
      AND EntryDate > TO_TIMESTAMP ('07/15/2011', 'mm/dd/yyyy')

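and summing value_numeric by Context_ID.

Probably summarised by some date dimension (daily/weekly/monthly totals, perhaps?)

I would try to get rid of as much of the PL/SQL as possible. Start with a simple query on the statistics table and describe what you want to do at each stage.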
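To make that concrete, here is one way the daily report might look in plain SQL, with the body of Get_Context_Info_udf inlined as ordinary joins and the interval formatting done with TO_CHAR. It is a sketch, not a tested rewrite: the table names a, b, c, d and all column names are copied from the question as posted, the unqualified columns are assumed to belong to statistics, and the outer join is assumed to match on the formatted day string.

```sql
-- Sketch only: daily totals with no PL/SQL table functions in the FROM clause.
WITH days AS (
  -- Row generator: one row per day in the reporting window.
  SELECT TO_CHAR(DATE '2011-07-15' + LEVEL - 1, 'YYYY-MM-DD') AS TimeInterval
  FROM   dual
  CONNECT BY LEVEL <= DATE '2011-07-20' - DATE '2011-07-15'
),
stats AS (
  SELECT TO_CHAR(se.EntryDate, 'YYYY-MM-DD')  AS During,  -- same format Interval_GetT uses for code 2
         mac.Name,
         SUM(CAST(se.VALUE_NUMERIC AS INT))   AS cnt
  FROM   statistics se
  JOIN   a ch  ON ch.Context_ID        = se.Context_ID
  JOIN   b cxm ON cxm.CONTX_MACHINE_ID = ch.CONTX_MACHINE_ID
  JOIN   c mac ON mac.MACHINE_ID       = cxm.MACHINE_ID
  JOIN   d ic  ON ic.MACHINE_ID        = mac.MACHINE_ID
  WHERE  se.StatisticTypeID = HEXTORAW('6CF933B091AE46FEA7F56BE96308190F')
  AND    se.EntryDate > TO_TIMESTAMP('07/15/2011', 'mm/dd/yyyy')
  AND    se.EntryDate < TO_TIMESTAMP('07/20/2011', 'mm/dd/yyyy')
  -- Column name kept exactly as it appears in the question.
  AND    ic.ONFIGURABLE_ENTITY_ID =
           HEXTORAW(Format_Guid_udf('11111111-FAE9-47A1-91A9-60A53E9660FE'))
  AND    mac.IS_DELETED = 'N'
  AND    ic.IS_DELETED  = 'N'
  GROUP BY TO_CHAR(se.EntryDate, 'YYYY-MM-DD'), mac.Name
)
SELECT dr.TimeInterval, s.Name, s.cnt
FROM   days dr
LEFT JOIN stats s ON s.During = dr.TimeInterval
ORDER BY dr.TimeInterval, s.Name;
```

Written this way, the optimizer sees real tables with real statistics, and tools that choke on the table functions have a better chance of parsing the statement.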

Answer 1 (score: 1)

You can investigate where Oracle is spending its time and which execution plan it chose, as suggested in this OTN thread.

Regards,
Rob.
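For readers without the linked thread, a common way to look at plans uses the standard DBMS_XPLAN package. This is a generic sketch, not necessarily what the thread recommends; the statement being explained is just the table-function call from the question, standing in for the full query.

```sql
-- Estimated plan, without running the statement.
EXPLAIN PLAN FOR
SELECT *
FROM   TABLE(GET_INTERVAL_SEQUENCE_UDF(
               TO_TIMESTAMP('07/15/2011', 'mm/dd/yyyy'),
               TO_TIMESTAMP('07/20/2011', 'mm/dd/yyyy'),
               2));

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Actual plan with estimated vs. actual row counts per step.
-- Run the statement with the hint first, then query the cursor.
SELECT /*+ GATHER_PLAN_STATISTICS */ *
FROM   TABLE(GET_INTERVAL_SEQUENCE_UDF(
               TO_TIMESTAMP('07/15/2011', 'mm/dd/yyyy'),
               TO_TIMESTAMP('07/20/2011', 'mm/dd/yyyy'),
               2));

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

In SQL*Plus, DISPLAY_CURSOR with a NULL sql_id reports the last statement the session executed, so it usually helps to run SET SERVEROUTPUT OFF first. Large gaps between the E-Rows and A-Rows columns on the COLLECTION ITERATOR steps would point at the cardinality guesses on the table functions.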