Question

I am using 3 tables in a subquery, as shown below.

INSERT INTO TABLE_A (SELECT * FROM TABLE_B WHERE (X,Y) NOT IN (SELECT X,Y FROM TABLE_A));

But the query takes a long time to run, because the tables have 400,000 to 500,000 rows.

However, when I run the following query, it does not take much time.

INSERT INTO TABLE_A (SELECT * FROM TABLE_B WHERE X||Y NOT IN (SELECT X||Y FROM TABLE_A));

After seeing the execution times, I doubt whether the two queries are really the same.

Why is one slower than the other?
Are these queries equivalent?

Answer 1

The key here is the "INTERNAL_FUNCTION" in the plan, which probably means you are comparing columns with different data types (which is always a bad idea).

For example


SQL> create table t1 ( x int, y int );

Table created.

SQL> create table t2 ( x varchar2(10), y varchar2(10));

Table created.

SQL>
SQL> insert into t1 select rownum,rownum from dual
  2  connect by level <= 1000;

1000 rows created.

SQL>
SQL> insert into t2 select rownum,rownum from dual
  2  connect by level <= 1000;

1000 rows created.

SQL> set autotrace traceonly explain
SQL> select *
  2  from t1
  3  where (x,y) not in (select x,y from t2);

Execution Plan
----------------------------------------------------------
Plan hash value: 2177415756

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |  1000 | 16000 |     8  (25)| 00:00:01 |
|   1 |  MERGE JOIN ANTI NA |      |  1000 | 16000 |     8  (25)| 00:00:01 |
|   2 |   SORT JOIN         |      |  1000 |  8000 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   |  1000 |  8000 |     3   (0)| 00:00:01 |
|*  4 |   SORT UNIQUE       |      |  1000 |  8000 |     4  (25)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| T2   |  1000 |  8000 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access(INTERNAL_FUNCTION("X")=TO_NUMBER("X") AND
              INTERNAL_FUNCTION("Y")=TO_NUMBER("Y"))
       filter(INTERNAL_FUNCTION("Y")=TO_NUMBER("Y") AND
              INTERNAL_FUNCTION("X")=TO_NUMBER("X"))

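We have to reconcile the data types before we can do the join properly, so a hash join is not used here (the plan shows MERGE JOIN ANTI NA instead).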

When you change it to a concatenation, the || operator turns everything into a string, so a hash join can be used.

SQL> select *
  2  from t1
  3  where (x||y) not in (select x||y from t2);

Execution Plan
----------------------------------------------------------
Plan hash value: 1275484728

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   990 | 15840 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN ANTI NA |      |   990 | 15840 |     6   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |  1000 |  8000 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T2   |  1000 |  8000 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(TO_CHAR("X")||TO_CHAR("Y")="X"||"Y")

SQL>

But as William pointed out, you run the risk of getting incorrect results.
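
To make that risk concrete, here is a minimal sketch of how the two predicates can disagree. It assumes string columns and runs against an in-memory SQLite database through Python's sqlite3 module rather than Oracle, purely so that it is self-contained; the helper name missing_rows and the sample values are made up for illustration. The pair ('12', '3') is not in TABLE_A, but its concatenation '123' collides with that of ('1', '23').

```python
import sqlite3


def missing_rows(rows_a, rows_b):
    """Return the rows of table_b that each predicate reports as absent from table_a.

    Illustrative helper only (not from the original thread). Uses SQLite,
    which supports row-value comparisons from version 3.15 onwards.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (x TEXT, y TEXT)")
    conn.execute("CREATE TABLE table_b (x TEXT, y TEXT)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?)", rows_a)
    conn.executemany("INSERT INTO table_b VALUES (?, ?)", rows_b)

    # Column-by-column comparison, as in the first query of the question.
    by_tuple = conn.execute(
        "SELECT x, y FROM table_b "
        "WHERE (x, y) NOT IN (SELECT x, y FROM table_a)"
    ).fetchall()

    # Comparison of concatenated strings, as in the second query.
    by_concat = conn.execute(
        "SELECT x, y FROM table_b "
        "WHERE x || y NOT IN (SELECT x || y FROM table_a)"
    ).fetchall()

    conn.close()
    return by_tuple, by_concat


by_tuple, by_concat = missing_rows([("1", "23")], [("12", "3")])
print(by_tuple)   # [('12', '3')]  the row is correctly reported as missing
print(by_concat)  # []             '12'||'3' and '1'||'23' are both '123'
```

In the INSERT from the question, the concatenated form would silently skip the ('12', '3') row, so the two statements are not equivalent in general, even if they happen to agree on a particular data set.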
