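How to optimize the performance of this SQL query

Time: 2015-07-06 08:47:28

Tags: sql oracle

I need to find the age on each day, for all previous dates, in a single query. So I used the following query: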

select trunc(sysdate) - level + 1 as "DATE",
       trunc(sysdate) - level + 1 - created_date as age
from items
connect by trunc(sysdate) - level + 1 - created_date > 0

I am getting the following output (for DATE and AGE), which is correct:

   DATE             AGE
   --------- ----------
   6-JUL-15          22
   5-JUL-15          21
   4-JUL-15          20
   3-JUL-15          19
   2-JUL-15          18
   1-JUL-15          17
   30-JUN-15         16
   29-JUN-15         15
   28-JUN-15         14
   27-JUN-15         13
   26-JUN-15         12
   25-JUN-15         11
   24-JUN-15         10    

Now I need to calculate the average age for each day, so I added avg to the query as follows:

select trunc(sysdate) - level + 1 as "DATE",
       avg(trunc(sysdate) - level + 1 - created_date) as average_age
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
group by trunc(sysdate) - level + 1

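Is this query correct? When I add the aggregate function (avg) to the query, it takes 1 hour to retrieve the data. When I remove the avg function, it returns the result within 2 seconds. What is a possible way to calculate the average without hurting performance?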

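2 answers:

Answer 0: (score: 0)

Sorry, I have never used Oracle, so even though I tried to read the syntax details in the documentation, there may be some mistakes:

You said this query does the job in 2 seconds: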

select trunc(sysdate) - level + 1 as "DATE",
       trunc(sysdate) - level + 1 - created_date as age
from items
connect by trunc(sysdate) - level + 1 - created_date > 0

So we will keep it and create a view from it:

CREATE OR REPLACE VIEW my_view AS
(select 
    trunc(sysdate) - level + 1 AS date_col,
    trunc(sysdate) - level + 1 - created_date AS age_col
from items
connect by trunc(sysdate) - level + 1 - created_date > 0);

But we can probably avoid some redundant computation with the following:

CREATE OR REPLACE VIEW distinct_dates AS 
(
SELECT DISTINCT trunc(sysdate) - level + 1 AS date_distinct
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
);

-- created_date is not a column of distinct_dates, so join back to items
CREATE OR REPLACE VIEW my_view AS
(select 
    d.date_distinct AS date_col,
    d.date_distinct - i.created_date AS age_col
from distinct_dates d
join items i on d.date_distinct - i.created_date > 0);

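Why do I do this? Because the problem seems to come from the aggregation, and I am worried that the view is actually being computed several times. The next step is to aggregate over the view: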

select 
    date_col ,
    AVG(age_col)
from my_view
group by date_col;

To conclude, the final code is:

CREATE OR REPLACE VIEW distinct_dates AS 
(
SELECT DISTINCT trunc(sysdate) - level + 1 AS date_distinct
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
);

-- created_date is not a column of distinct_dates, so join back to items
CREATE OR REPLACE VIEW my_view AS
(select 
    d.date_distinct AS date_col,
    d.date_distinct - i.created_date AS age_col
from distinct_dates d
join items i on d.date_distinct - i.created_date > 0);

select 
    date_col ,
    AVG(age_col)
from my_view
group by date_col;

Or, if that does not work:

CREATE OR REPLACE VIEW my_view AS
(select 
    trunc(sysdate) - level + 1 AS date_col,
    trunc(sysdate) - level + 1 - created_date AS age_col
from items
connect by trunc(sysdate) - level + 1 - created_date > 0);


select 
    date_col ,
    AVG(age_col)
from my_view
group by date_col;

Answer 1: (score: 0)

Modified query:

select tdate, avg(trunc(tdate)-created_date) AVG_AGE
  from (
    select trunc(sysdate) - level + 1 tdate
      from (select min(created_date) dt from items)
      connect by trunc(sysdate) - level + 1 - dt > 0 ) dates
  join items on dates.tdate > items.created_date
  group by tdate order by tdate

SQLFiddle demo

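Suppose you have only two rows, with the dates '2015-06-01' and '2015-06-20'. By my calculation, your hierarchical query generates 1376254 rows for them, which is probably not what you want; it should generate 51 rows (35 + 16). The connect by clause has no prior condition, so at each level every row that still qualifies is attached to every row of the previous level: that gives 2 + 4 + ... + 2^16 rows for levels 1 to 16, and then 2^16 rows for each of levels 17 to 35. That is why it takes so long: the output grows exponentially with the number of rows in the items table.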
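The blow-up described above can be reproduced with a self-contained query. This is a minimal sketch and not part of the original answer: the two sample rows are inlined with a WITH clause, and the run date is assumed to be 2015-07-06 (the date of the question), hard-coded in place of trunc(sysdate) so that the result does not depend on when the query is run.

```sql
-- Two sample rows standing in for the items table (illustrative data only).
with items as (
  select date '2015-06-01' as created_date from dual
  union all
  select date '2015-06-20' as created_date from dual
)
-- Same CONNECT BY condition as in the question, with the assumed run date
-- 2015-07-06 in place of trunc(sysdate). There is no PRIOR condition, so
-- every row that still qualifies at a level is attached to every row of
-- the previous level.
select count(*) as generated_rows
from items
connect by date '2015-07-06' - level + 1 - created_date > 0;
```

If those assumptions hold, the count should match the figure quoted above. Adding a third row to the inline data is a quick way to see how fast the count grows.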

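You could also modify your query by adding some kind of counter (generated by rownum or row_number) and then adding and prior rn = rn to the connect by clause, but the query shown above is simpler. I added a second query to the SQLFiddle to compare the results; both produce the same output.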
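The counter variant mentioned above could look roughly like the following. This is a sketch under assumptions, not the query from the SQLFiddle: the alias rn and the inline view are illustrative, and the extra condition prior sys_guid() is not null is an addition that the answer does not mention. It is a common Oracle idiom that keeps the self-reference on rn from being reported as a CONNECT BY loop.

```sql
select trunc(sysdate) - level + 1 as tdate,
       avg(trunc(sysdate) - level + 1 - created_date) as avg_age
from (
    -- rn is a hypothetical per-row counter; rownum numbers the rows of items
    select created_date, rownum as rn
    from items
)
connect by trunc(sysdate) - level + 1 - created_date > 0
       and prior rn = rn                  -- each row only continues its own chain
       and prior sys_guid() is not null   -- assumed addition: avoids the loop error
group by trunc(sysdate) - level + 1
order by tdate;
```

With the prior rn = rn condition each row of items produces one output row per day of its age, so the two-row example would yield 35 + 16 = 51 rows before grouping instead of an exponentially growing set.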