Why doesn't the MySQL optimizer use all columns of the index?

Time: 2018-03-03 11:26:11

Tags: mysql sql database optimization percona

Percona MySQL 5.7

Table schema:

CREATE TABLE Developer.Rate (
  ID bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  TIME datetime NOT NULL,
  BASE varchar(3) NOT NULL,
  QUOTE varchar(3) NOT NULL,
  BID double NOT NULL,
  ASK double NOT NULL,
  PRIMARY KEY (ID),
  INDEX IDX_TIME (TIME),
  UNIQUE INDEX IDX_UK (BASE, QUOTE, TIME)
)
ENGINE = INNODB
ROW_FORMAT = COMPRESSED;

I am trying to fetch the latest row before a selected period. The optimizer uses only part of the unique key: 2 of its 3 columns.

If I run the query in the usual way:

EXPLAIN FORMAT=JSON
SELECT
  BID
FROM
  Rate
WHERE
  BASE = 'EUR'
  AND QUOTE = 'USD'
  AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY
  `TIME` DESC
LIMIT 1
;

EXPLAIN shows that only the first 2 columns of the index are used, BASE and QUOTE:

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "10231052.40"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "Rate",
        "access_type": "ref",
        "possible_keys": [
          "IDX_UK",
          "IDX_TIME"
        ],
        "key": "IDX_UK",
        "used_key_parts": [
          "BASE",
          "QUOTE"
        ],
        "key_length": "22",
        "ref": [
          "const",
          "const"
        ],
        "rows_examined_per_scan": 45966462,
        "rows_produced_per_join": 22983231,
        "filtered": "50.00",
        "cost_info": {
          "read_cost": "1037760.00",
          "eval_cost": "4596646.20",
          "prefix_cost": "10231052.40",
          "data_read_per_join": "1G"
        },
        "used_columns": [
          "ID",
          "TIME",
          "BASE",
          "QUOTE",
          "BID"
        ],
        "attached_condition": "((`Developer`.`Rate`.`BASE` <=> 'EUR') and (`Developer`.`Rate`.`QUOTE` <=> 'USD') and (`Developer`.`Rate`.`TIME` <= <cache>((now() - interval 1 month))))"
      }
    }
  }
}

But if I force the optimizer to use IDX_UK, MySQL uses all 3 columns of the index:

EXPLAIN FORMAT=JSON
SELECT
  BID
FROM
  Rate FORCE INDEX(IDX_UK)
WHERE
  BASE = 'EUR'
  AND QUOTE = 'USD'
  AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY
  `TIME` DESC
LIMIT 1

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "10231052.40"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "Rate",
        "access_type": "range",
        "possible_keys": [
          "IDX_UK"
        ],
        "key": "IDX_UK",
        "used_key_parts": [
          "BASE",
          "QUOTE",
          "TIME"
        ],
        "key_length": "27",
        "rows_examined_per_scan": 45966462,
        "rows_produced_per_join": 15320621,
        "filtered": "100.00",
        "index_condition": "((`Developer`.`Rate`.`BASE` = 'EUR') and (`Developer`.`Rate`.`QUOTE` = 'USD') and (`Developer`.`Rate`.`TIME` <= <cache>((now() - interval 1 month))))",
        "cost_info": {
          "read_cost": "1037760.00",
          "eval_cost": "3064124.31",
          "prefix_cost": "10231052.40",
          "data_read_per_join": "818M"
        },
        "used_columns": [
          "ID",
          "TIME",
          "BASE",
          "QUOTE",
          "BID"
        ]
      }
    }
  }
}

Why does the optimizer not use all 3 columns unless the index is named explicitly?

Added:

Did I understand correctly that I should use a request like this?

Request example:

EXPLAIN FORMAT=JSON SELECT BID FROM Rate WHERE BASE = 'EUR' AND QUOTE = 'USD' AND `TIME` <= (NOW() - INTERVAL 1 MONTH) ORDER BY BASE DESC, QUOTE DESC, TIME DESC LIMIT 1

If I understood correctly, the EXPLAIN output is no better. Still only 2 columns are used, without TIME.

Explain output:

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "10384642.20"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "Rate",
        "access_type": "ref",
        "possible_keys": [
          "IDX_UK",
          "IDX_TIME"
        ],
        "key": "IDX_UK",
        "used_key_parts": [
          "BASE",
          "QUOTE"
        ],
        "key_length": "22",
        "ref": [
          "const",
          "const"
        ],
        "rows_examined_per_scan": 46734411,
        "rows_produced_per_join": 23367205,
        "filtered": "50.00",
        "index_condition": "((Developer.Rate.BASE <=> 'EUR') and (Developer.Rate.QUOTE <=> 'USD') and (Developer.Rate.TIME <= ((now() - interval 1 month))))",
        "cost_info": {
          "read_cost": "1037760.00",
          "eval_cost": "4673441.10",
          "prefix_cost": "10384642.20",
          "data_read_per_join": "1G"
        },
        "used_columns": [
          "ID",
          "TIME",
          "BASE",
          "QUOTE",
          "BID"
        ]
      }
    }
  }
}

Added 2:

I ran these 4 requests:

- 1 -


- 2 -

- 3 -

- 4 -

The session_status output is identical for all requests except request 3. For request 3: Handler_read_prev = 486474; for all other requests: Handler_read_prev = 0.

  

Added 3:

I copied the table, removed the ID field, and promoted the UNIQUE key to PRIMARY.

Now the request really works, and EXPLAIN shows that all 3 columns are used. This variant works.

2 answers:

Answer 0: (score: 1)

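Get rid of ID; it is useless. Promote your UNIQUE key to PRIMARY. Now, magically, the query will be faster and the question you asked will be moot. (You may also need the DESC trick that Lorraine suggested.)

Here is another technique for comparing performance: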

FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';

I would be interested to see the SHOW output with and without the DESC trick, and with and without the FORCE INDEX you mentioned.
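As a sketch of how that comparison might be run against the table from the question, the two variants below differ only in the FORCE INDEX hint. The Handler counters are per session, so both runs need to happen on the same connection. Comparing Handler_read_key, Handler_read_next and Handler_read_prev between the runs gives a rough picture of how many index entries each plan touched.

```sql
-- Variant 1: let the optimizer choose the access path.
FLUSH STATUS;                              -- reset the session counters
SELECT BID
FROM Rate
WHERE BASE = 'EUR'
  AND QUOTE = 'USD'
  AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY `TIME` DESC
LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler_read%';  -- counters for variant 1

-- Variant 2: the same query with IDX_UK forced.
FLUSH STATUS;
SELECT BID
FROM Rate FORCE INDEX (IDX_UK)
WHERE BASE = 'EUR'
  AND QUOTE = 'USD'
  AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY `TIME` DESC
LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler_read%';  -- counters for variant 2
```

The same pattern applies to the variants that use ORDER BY BASE DESC, QUOTE DESC, TIME DESC.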

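Why is it faster? Your query uses a secondary index, but it needs BID, and the index does not "cover" that column. To get BID, the engine has to drill down through the PRIMARY KEY into the "data". By changing the table so that the PK is used, this extra drill-down is avoided.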
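A minimal sketch of that restructuring, assuming a copy of the table is acceptable. The name Rate_pk is hypothetical, and keeping IDX_TIME is an assumption carried over from the original schema:

```sql
-- Hypothetical copy of Developer.Rate without the surrogate ID.
-- The former unique key (BASE, QUOTE, TIME) becomes the clustered primary key,
-- so BID is stored in the same B-tree entry that the range scan reads.
CREATE TABLE Developer.Rate_pk (
  TIME datetime NOT NULL,
  BASE varchar(3) NOT NULL,
  QUOTE varchar(3) NOT NULL,
  BID double NOT NULL,
  ASK double NOT NULL,
  PRIMARY KEY (BASE, QUOTE, TIME),
  INDEX IDX_TIME (TIME)
)
ENGINE = INNODB
ROW_FORMAT = COMPRESSED;

-- Copy the existing rows; IDX_UK already guarantees that
-- (BASE, QUOTE, TIME) is unique and NOT NULL in the source table.
INSERT INTO Developer.Rate_pk (TIME, BASE, QUOTE, BID, ASK)
SELECT TIME, BASE, QUOTE, BID, ASK
FROM Developer.Rate;
```

On a table with tens of millions of rows the copy can take a while, so it is probably worth trying on a test instance first.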

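Answer 1: (score: 1)

The behavior you describe (ref access instead of range access on more columns) reminds me of Bug#81341 and Bug#87613. These bugs were fixed in MySQL 5.7.17 and 5.7.21, respectively. Which version are you using?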
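To answer the version question, the exact server version can be read from any session. A minimal sketch:

```sql
-- Full version string of the running server.
SELECT VERSION();

-- Related variables, including version_comment, which names the distribution.
SHOW VARIABLES LIKE 'version%';
```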