Optimizing Oracle queries by removing EXISTS and NOT EXISTS

Time: 2013-04-25 15:10:52

Tags: sql oracle

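I recently moved a piece of code to production on an Oracle database, and one of the more experienced developers mentioned that I had too many EXISTS and NOT EXISTS statements and that there should be a way to remove them. It had been too long since he last had to do it, though, and he did not remember how it worked. I am now going back to make the code more maintainable, since it will probably change many times over the next few years as business logic and requirements change, and I would like to keep optimizing it while making it easier to maintain.

I have tried looking it up, but the only advice I could find was to replace NOT IN with NOT EXISTS and to avoid returning actual results from the subquery.

So I would like to know what can be done to optimize EXISTS / NOT EXISTS, or whether there is a way to write EXISTS / NOT EXISTS so that Oracle optimizes it internally (probably better than I can).

For example, how could the following be optimized?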

UPDATE
    SCOTT.TABLE_N N
SET
    N.VALUE_1 = 'Data!'
WHERE
    N.VALUE_2 = 'Y'
    AND
    EXISTS
    (
        SELECT
            1
        FROM
            SCOTT.TABLE_Q Q
        WHERE
            N.ID = Q.N_ID
    )
    AND
    NOT EXISTS
    (
        SELECT
            1
        FROM
            SCOTT.TABLE_W W
        WHERE
            N.ID = W.N_ID
    )

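3 answers:

Answer 0: (score: 8)

Your statement looks fine to me.

In any optimization task, do not think in patterns. Do not think, "(NOT) EXISTS is bad and slow, (NOT) IN is cool and fast".

Think instead: how much work does the database do at each step, and how can you measure it?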
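One way to look at that work for the UPDATE in the question, without actually running it, is to ask Oracle for the execution plan. The following is a minimal sketch using the standard EXPLAIN PLAN statement and DBMS_XPLAN.DISPLAY; it assumes the SCOTT tables from the question exist and that a plan table is available to the session.

```sql
-- Ask the optimizer for the plan of the original UPDATE (nothing is modified)
EXPLAIN PLAN FOR
UPDATE SCOTT.TABLE_N N
SET N.VALUE_1 = 'Data!'
WHERE N.VALUE_2 = 'Y'
  AND EXISTS (SELECT 1 FROM SCOTT.TABLE_Q Q WHERE N.ID = Q.N_ID)
  AND NOT EXISTS (SELECT 1 FROM SCOTT.TABLE_W W WHERE N.ID = W.N_ID);

-- Display the plan that was just stored in the plan table
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the output contains semi-join or anti-join steps (operation names ending in SEMI or ANTI), the optimizer has already turned the subqueries into joins on its own.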

A simple example:

- NOT IN:

23:59:41 HR@sandbox> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.03
23:59:43 HR@sandbox> set autotrace traceonly explain statistics
23:59:49 HR@sandbox> select country_id from countries where country_id not in (select country_id from locations);

11 rows selected.

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 1748518851

------------------------------------------------------------------------------------------
| Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                 |     1 |     6 |     4   (0)| 00:00:01 |
|*  1 |  FILTER                |                 |       |       |            |          |
|   2 |   NESTED LOOPS ANTI SNA|                 |    11 |    66 |     4  (75)| 00:00:01 |
|   3 |    INDEX FULL SCAN     | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN    | LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS FULL    | LOCATIONS       |     1 |     3 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LOCATIONS" "LOCATIONS" WHERE
              "COUNTRY_ID" IS NULL))
   4 - access("COUNTRY_ID"="COUNTRY_ID")
   5 - filter("COUNTRY_ID" IS NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         11  consistent gets
          8  physical reads
          0  redo size
        446  bytes sent via SQL*Net to client
        363  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

- NOT EXISTS:

23:59:57 HR@sandbox> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.17
00:00:02 HR@sandbox> select country_id from countries c where not exists (select 1 from locations l where l.country_id = c.country_id );

11 rows selected.

Elapsed: 00:00:00.30

Execution Plan
----------------------------------------------------------
Plan hash value: 840074837

-------------------------------------------------------------------------------------
| Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                 |    11 |    66 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI|                 |    11 |    66 |     1   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN| LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("L"."COUNTRY_ID"="C"."COUNTRY_ID")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          2  physical reads
          0  redo size
        446  bytes sent via SQL*Net to client
        363  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

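In this example NOT IN reads roughly twice as many database blocks (11 consistent gets versus 5) and performs more complicated filtering. Ask yourself: why would you choose it over NOT EXISTS?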
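The extra filter steps in the NOT IN plan (the "COUNTRY_ID" IS NULL predicates) come from NULL handling: if the subquery of a NOT IN returns even one NULL, the predicate can never be true. A small self-contained sketch of that difference, using only DUAL and inline values rather than the HR tables:

```sql
-- NOT IN against a set containing NULL: 1 <> NULL is unknown,
-- so the predicate is never true and no row is returned
SELECT 'kept' AS result
FROM dual
WHERE 1 NOT IN (SELECT 2 FROM dual
                UNION ALL
                SELECT CAST(NULL AS NUMBER) FROM dual);

-- NOT EXISTS against the same set: no row satisfies s.n = 1,
-- so the row is returned
SELECT 'kept' AS result
FROM dual
WHERE NOT EXISTS (SELECT 1
                  FROM (SELECT 2 AS n FROM dual
                        UNION ALL
                        SELECT CAST(NULL AS NUMBER) FROM dual) s
                  WHERE s.n = 1);
```

Because the two forms are only equivalent when the subquery column cannot be NULL, Oracle has to do the additional NULL check for NOT IN on a nullable column.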

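Answer 1: (score: 2)

There is no reason to avoid EXISTS or NOT EXISTS when that is what you need. In the example you gave, it is probably exactly what you want to use.

The typical dilemma is whether to use IN / NOT IN or EXISTS / NOT EXISTS. They are evaluated quite differently, and depending on your specific situation one may be faster or slower than the other.

For details, see here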
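For comparison, this is a sketch of how the UPDATE from the question might look with IN / NOT IN instead. It assumes the same SCOTT tables; the IS NOT NULL filter is an addition that keeps the NOT IN from rejecting every row if W.N_ID is nullable. Whether it performs better or worse is something to measure, not assume.

```sql
UPDATE SCOTT.TABLE_N N
SET N.VALUE_1 = 'Data!'
WHERE N.VALUE_2 = 'Y'
  -- replaces EXISTS
  AND N.ID IN (SELECT Q.N_ID FROM SCOTT.TABLE_Q Q)
  -- replaces NOT EXISTS; NULLs are excluded so the predicate can be true
  AND N.ID NOT IN (SELECT W.N_ID
                   FROM SCOTT.TABLE_W W
                   WHERE W.N_ID IS NOT NULL);
```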

Answer 2: (score: 1)

I don't know whether it is any faster, but here is a way to write it without EXISTS / NOT EXISTS:

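MERGE INTO SCOTT.TABLE_N T
USING (
  -- DISTINCT keeps one source row per ID even when TABLE_Q has several
  -- rows for the same N_ID; duplicate source rows can make MERGE fail
  -- with ORA-30926
  SELECT DISTINCT N.ID, 'Data!' AS NEW_VALUE_1
  FROM SCOTT.TABLE_N N
  INNER JOIN SCOTT.TABLE_Q Q      -- replaces EXISTS
      ON Q.N_ID = N.ID
  LEFT JOIN SCOTT.TABLE_W W       -- replaces NOT EXISTS (anti-join)
      ON W.N_ID = N.ID
  WHERE N.VALUE_2 = 'Y'
  AND W.N_ID IS NULL              -- no matching row in TABLE_W
) X
ON ( T.ID = X.ID )
WHEN MATCHED THEN UPDATE
    SET T.VALUE_1 = X.NEW_VALUE_1;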
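The join form depends on how many TABLE_Q rows exist per N_ID: an inner join returns one row per match, whereas EXISTS only tests whether a match is present. A quick check, assuming the tables from the question, shows whether de-duplicating the MERGE source matters for your data:

```sql
-- Lists every N_ID that appears more than once in TABLE_Q;
-- an empty result means the inner join cannot multiply TABLE_N rows
SELECT Q.N_ID, COUNT(*) AS ROWS_PER_N_ID
FROM SCOTT.TABLE_Q Q
GROUP BY Q.N_ID
HAVING COUNT(*) > 1;
```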