Oracle: Max, Partition by, or even rownum?

时间:2017-05-16 09:19:54

标签: oracle subquery query-optimization rownum

I am from SQL Server background, so my skill in Oracle is minimum. It looks like Partition by is better than max in terms of performance. Or do I use rownum to archive my result table?

I have the following table - TableW.

| P_TYPE      | TRX_DATE       | PROGRAM_NO | REF_NO    | SEQ_ID |  Select 
|-------------|----------------|------------|-----------|--------|
| 'Local'     | 2016/9/5 14:37 | C1         | null      | E1     |  Yes (latest in Sept 5)
| 'Local'     | 2016/9/5 14:36 | C1         | null      | E1     |
| 'Local'     | 2016/9/5 11:08 | C1         | null      | E1     |
|-------------|----------------|------------|-----------|--------|
| 'Local'     | 2016/9/2 15:16 | C1         | null      | E1     |  Yes (latest in Sept 2)
|-------------|----------------|------------|-----------|--------|
| 'Local'     | 2016/9/1 15:20 | C1         | null      | E1     |  Yes (latest in Sept 1)
| 'Local'     | 2016/9/1 14:33 | C1         | null      | E1     |
|-------------|----------------|------------|-----------|--------|
| '3rd Party' | 2016/9/4 18:00 | null       | D1        | E2     |  Yes
| '3rd Party' | 2016/9/4 17:55 | null       | D1        | E2     |

Here is what I want to get:

For column P_TYPE, if it is value of 'Local', use column PROGRAM_NO and SEQ_ID. Otherwise, use REF_NO and SEQ_ID. If value in column P_TYPE is same, check TRX_DATE. If column TRX_DATE indicates the same date, pick the one with latest time stamp. Another day? Another entry with latest time stamp.

| P_TYPE      | TRX_DATE       | PROGRAM_NO | REF_NO    | SEQ_ID |
|-------------|----------------|------------|-----------|--------|
| 'Local'     | 2016/9/5 14:37 | C1         | null      | E1     |  
| 'Local'     | 2016/9/2 15:16 | C1         | null      | E1     |  
| 'Local'     | 2016/9/1 15:20 | C1         | null      | E1     |  
| '3rd Party' | 2016/9/4 18:00 | null       | D1        | E2     |  

A script that I receive is to use SELECT MAX in the WHERE clause:

SELECT *
FROM TableW a
WHERE TRX_DATE = 
    CASE P_TYPE
        WHEN 'Local' THEN
            (SELECT MAX(TRX_DATE) FROM TableW
                WHERE PROGRAM_NO = a.PROGRAM_NO AND SEQ_ID = a.SEQ_ID)
        ELSE
            (SELECT MAX(TRX_DATE) FROM TableW
                WHERE REF_NO = a.REF_NO AND SEQ_ID = a.SEQ_ID)
    END
ORDER BY TRX_DATE desc, REF_NO ASC, SEQ_ID;

It does the job. However with some research, it seems like partition by is not that costly. Refer: Tune SQL statement with max subquery

I try to rewrite the query as:

SELECT *
FROM (
SELECT *,
    CASE P_TYPE 
        WHEN 'Local' THEN 
            MAX(TRX_DATE) OVER (PARTITION BY PROGRAM_NO, SEQ_ID)
        ELSE
            MAX(TRX_DATE) OVER (PARTITION BY REF_NO, SEQ_ID)
    END AS MAX_TRX_DATE
FROM TableW
WHERE P_TYPE = 'Local'
)
WHERE TRX_DATE = MAX_TRX_DATE

However, I get only this:

| P_TYPE      | TRX_DATE       | PROGRAM_NO | REF_NO    | SEQ_ID |
|-------------|----------------|------------|-----------|--------|
| 'Local'     | 2016/9/5 14:37 | C1         | null      | E1     |  

Any guideline please. If possible, please illustrate your suggestion with statistics. Thanks.

EDIT: It looks like using row_number and partition by will greatly reduce the execution plan and even the time?

| CASE             | OPERATION        | CARDINALITY | COST | LAST CR     | LAST ELAPSED  |
|                  |                  |             |      | BUFFER GETS | TIME          |
|------------------|------------------|-------------|------|-------------|---------------|
| 1 - max() in     | SELECT STATEMENT |             |  76  |             |               |
|     where clause | SORT (ORDER BY)  |      1      |  76  |     477     |      3602     |
|------------------|------------------|-------------|------|-------------|---------------|
| 2 - row_number   | SELECT STATEMENT |             |  18  |             |               |
|                  | SORT (ORDER BY)  |      8      |  18  |      53     |       607     |
|------------------|------------------|-------------|------|-------------|---------------|

1 个答案:

答案 0 :(得分:2)

对于Local行,您需要包含定义窗口分区的日期,因为PROGRAM_NO, REF_NO的所有值对于这些行都是相同的:

select *
from (
  SELECT *,
         CASE P_TYPE
           when 'Local' then 
              row_number() over (partition by program_no, seq_id, trunc(trx_date) order by trx_date desc)
           else 
              row_number() over (partition by ref_no, seq_id order by trx_date desc)
         end as rn
  FROM TableW a
) t
where rn = 1;

在线示例:http://rextester.com/CZTY80559

(该示例使用Postgres,但除了“忽略”时间戳的时间部分的不同方式之外,它在Oracle中将是相同的)