Question

I need to find the age of each item on every day, and I need all previous dates in a single query. So I used the following query:

 -- DATE is a reserved word in Oracle, so the alias is quoted
 select trunc(sysdate) - level + 1 "DATE"
 ,trunc(sysdate) - level + 1 - created_date AGE   from items
connect by trunc(sysdate) - level + 1 - created_date > 0

I get the following output (for DATE & AGE), which is correct:

 DATE               AGE
   --------- ----------
   6-JUL-15          22
   5-JUL-15          21
   4-JUL-15          20
   3-JUL-15          19
   2-JUL-15          18
   1-JUL-15          17
   30-JUN-15         16
   29-JUN-15         15
   28-JUN-15         14
   27-JUN-15         13
   26-JUN-15         12
   25-JUN-15         11
   24-JUN-15         10

Now I need to calculate the average age for each day, so I added avg to the query:

  select trunc(sysdate) - level + 1 "DATE" ,
    avg(trunc(sysdate) - level + 1 - created_date) AVERAGE_AGE
    from items
    connect by trunc(sysdate) - level + 1 - created_date > 0
    group by trunc(sysdate) - level + 1

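Is this query correct? When I add the aggregate function (avg) to the query, it takes 1 hour to retrieve the data; when I remove avg, the result comes back in 2 seconds. What is a possible way to calculate the average without hurting performance?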

Answer 1

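Sorry, I have never used Oracle, so even though I tried to read the syntax details in the documentation, there may be some mistakes:

You said this query does the job in 2 seconds: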

select trunc(sysdate) - level + 1 "DATE"
 ,trunc(sysdate) - level + 1 - created_date AGE   from items
connect by trunc(sysdate) - level + 1 - created_date > 0

So we keep it and create a view from it:

-- Aliases are left unquoted so that date_col and age_col can be referenced without quotes later
CREATE OR REPLACE VIEW my_view AS
(select 
    trunc(sysdate) - level + 1 AS date_col,
    trunc(sysdate) - level + 1 - created_date AS age_col
from items
connect by trunc(sysdate) - level + 1 - created_date > 0);

But perhaps we can avoid some redundant computation as follows:

CREATE OR REPLACE VIEW distinct_dates AS 
(
SELECT DISTINCT trunc(sysdate) - level + 1 AS date_distinct
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
);

-- distinct_dates has no created_date column, so join back to items
CREATE OR REPLACE VIEW my_view AS
(select 
    date_distinct AS date_col,
    date_distinct - created_date AS age_col
from distinct_dates
join items on date_distinct - created_date > 0);

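Why do I do this? Because the problem seems to come from the aggregation, and I am worried that the view is actually being computed several times. The next step is to aggregate over the view: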

select 
    date_col ,
    AVG(age_col)
from my_view
group by date_col;

To conclude, the final code is:

CREATE OR REPLACE VIEW distinct_dates AS 
(
SELECT DISTINCT trunc(sysdate) - level + 1 AS date_distinct
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
);

-- distinct_dates has no created_date column, so join back to items
CREATE OR REPLACE VIEW my_view AS
(select 
    date_distinct AS date_col,
    date_distinct - created_date AS age_col
from distinct_dates
join items on date_distinct - created_date > 0);

select 
    date_col ,
    AVG(age_col)
from my_view
group by date_col;

Or, if that does not work:

CREATE OR REPLACE VIEW my_view AS
(select 
    trunc(sysdate) - level + 1 AS date_col,
    trunc(sysdate) - level + 1 - created_date AS age_col
from items
connect by trunc(sysdate) - level + 1 - created_date > 0);


select 
    date_col ,
    AVG(age_col)
from my_view
group by date_col;

Answer 2

Modified query:

select tdate, avg(trunc(tdate)-created_date) AVG_AGE
  from (
    select trunc(sysdate) - level + 1 tdate
      from (select min(created_date) dt from items)
      connect by trunc(sysdate) - level + 1 - dt > 0 ) dates
  join items on dates.tdate > items.created_date
  group by tdate order by tdate

SQLFiddle demo

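Suppose you have only two rows, with dates '2015-06-01' and '2015-06-20'. By my calculation, your hierarchical query generates 1376254 rows for them, which is probably not what you want; it should generate 51 rows (35 + 16). That is why it takes so long: the output grows exponentially as the number of rows in the items table increases.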
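The row count quoted above can be reproduced with a small self-contained sketch. It assumes the query runs on 2015-07-06 (a date inferred from the 35 + 16 figure and fixed here in place of sysdate) and uses an inline two-row stand-in for the items table. Because the CONNECT BY condition contains no PRIOR, every row at one level becomes the parent of every row that still satisfies the condition at the next level, so the row count doubles at each of the first 16 levels and then stays at 65536 per level up to level 35.

```sql
-- Hypothetical two-row stand-in for the items table.
with items as (
  select date '2015-06-01' as created_date from dual union all
  select date '2015-06-20' as created_date from dual
)
select count(*) as generated_rows
  from items
connect by date '2015-07-06' - level + 1 - created_date > 0;
-- Levels 1..16:  2 + 4 + ... + 65536 = 131070 rows
-- Levels 17..35: 19 * 65536          = 1245184 rows
-- Sum: 1376254, the figure quoted above
```

Wrapping a slow query in count(*) like this is also a cheap way to check how many rows it really produces before adding an aggregate on top.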

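You could fix your query by adding some kind of counter (generated with rownum or row_number) and then adding and prior rn = rn to the connect by clause, but the query shown above is simpler. I added a second query to the SQLFiddle to compare the results; both produce the same output.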
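The counter-based variant might look like the following sketch. It is not copied from the SQLFiddle: the rn column name is illustrative, and the prior sys_guid() is not null predicate is an assumption added here, since it is the usual Oracle idiom for keeping a row chained to itself without triggering the CONNECT BY loop error (ORA-01436).

```sql
-- Sketch: one generated row per (item, day), then the average per day.
select tdate, avg(age) as avg_age
  from (
    select trunc(sysdate) - level + 1                as tdate,
           trunc(sysdate) - level + 1 - created_date as age
      from (select created_date,
                   -- rn: hypothetical per-row counter; a primary key would work too
                   row_number() over (order by created_date) as rn
              from items)
    connect by trunc(sysdate) - level + 1 - created_date > 0
           and prior rn = rn                  -- each row continues only its own chain
           and prior sys_guid() is not null   -- assumption: sidesteps loop detection
  )
 group by tdate
 order by tdate;
```

With the two example rows this produces 35 + 16 = 51 rows before aggregation. Note that level 1 is generated for every row regardless of the CONNECT BY condition, so items created today or later would still contribute one row with a non-positive age unless they are filtered out.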
