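I have a huge problem with one part of my application. I'm using SQLAlchemy with MySQL, and most things work fine, but there is one sore spot: loading the customer list takes forever, sometimes as long as 5-6 minutes. The table has about 3000 rows, which should be small by database standards, and I have a simple join against a slightly larger table (25k rows).
The query in SQLAlchemy is as follows: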
last_inv = db.session.query(Sales.id).order_by(Sales.invoice_date.desc()).filter(Customer.email == Sales.email).limit(1).correlate(Customer)
results = db.session.query(Customer, last_inv.as_scalar()).filter_by(archive=0)
The raw SQL looks like this:
SELECT customer.id AS customer_id
, customer.first_name AS customer_first_name
, customer.middle_name AS customer_middle_name
, customer.last_name AS customer_last_name
, customer.email AS customer_email
, customer.password AS customer_password
, customer.address1 AS customer_address1
, customer.address2 AS customer_address2
, customer.city AS customer_city
, customer.state AS customer_state
, customer.zip AS customer_zip
, customer.country AS customer_country
, customer.phone AS customer_phone
, customer.cell_phone AS customer_cell_phone
, customer.current_plan AS customer_current_plan
, customer.minutes_current_plan AS customer_minutes_current_plan
, customer.orig_sales_id AS customer_orig_sales_id
, customer.sales_id AS customer_sales_id
, customer.team_id AS customer_team_id
, customer.refill_date AS customer_refill_date
, customer.minutes_refill_date AS customer_minutes_refill_date
, customer.active AS customer_active
, customer.archive AS customer_archive
, customer.imported AS customer_imported
, customer.ipaddress AS customer_ipaddress
, customer.auto_renewal AS customer_auto_renewal
, customer.signup_date AS customer_signup_date
, customer.esn AS customer_esn
, customer.last_update_date AS customer_last_update_date
, customer.last_update_by AS customer_last_update_by
, customer.notes AS customer_notes
, customer.current_pin AS customer_current_pin
, customer.minutes_current_pin AS customer_minutes_current_pin
, customer.security_pin AS customer_security_pin
, (SELECT sales.id
FROM sales
WHERE customer.email = sales.email
ORDER
BY sales.invoice_date DESC LIMIT 1) AS anon_1
FROM customer
WHERE customer.team_id = 1
AND customer.archive = 0
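I have tried a lot of things, and this is really starting to make me desperate. Everything runs on Amazon, and htop shows 100% mysql usage while the query is running. The query profilers in phpMyAdmin and HeidiSQL show that the query completes in under two seconds (when it is not served from the cache), so as far as I understand it, the query itself is not what causes this.
Here is what EXPLAIN shows: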
id  select_type         table     type  possible_keys  key   key_len  ref   rows   Extra
1   PRIMARY             customer  ALL   NULL           NULL  NULL     NULL  3621   Using where
2   DEPENDENT SUBQUERY  sales     ALL   NULL           NULL  NULL     NULL  22619  Using where; Using filesort
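The profiler output from phpMyAdmin is here, and a visual representation is here.
I'm running an m1.small instance on EC2 with 1650 MB of memory.
I also ran a mysqlprofiler; here are the results before and after the optimizations I made. My my.cnf file is here.
I tried running OPTIMIZE on the tables, but for some reason the number of unoptimized tables is always 98, so I guess I'm doing something wrong. I used this script, as well as raw SQL in phpMyAdmin, without success.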
Answer 0 (score: 2)
Try creating this multi-column index; it may speed up the query:
CREATE INDEX sales_eml_invdat ON sales( email, invoice_date );
or even a three-column one:
CREATE INDEX sales_eml_invdat_id ON sales( email, invoice_date, id );
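but only if id is not the primary key column.
If id is the primary key, the previous index is enough (InnoDB, for instance, stores the primary key in every secondary index).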
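To see what the composite index changes, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. This is only an illustration under stated assumptions: SQLite's planner and its EXPLAIN QUERY PLAN output differ from MySQL's EXPLAIN, the sample rows are invented, and only the table, column, and index names come from the question and the statements above.

```python
import sqlite3

# In-memory stand-in for the sales table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, email TEXT, invoice_date TEXT)"
)
rows = [
    ("a@example.com", "2013-01-05"),
    ("a@example.com", "2013-03-01"),
    ("b@example.com", "2013-02-10"),
]
conn.executemany("INSERT INTO sales (email, invoice_date) VALUES (?, ?)", rows)

# Same shape as the subquery: latest sale for one email address.
QUERY = (
    "SELECT id FROM sales WHERE email = ? "
    "ORDER BY invoice_date DESC LIMIT 1"
)


def plan(connection):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    cur = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("a@example.com",))
    return " | ".join(row[3] for row in cur.fetchall())


plan_before = plan(conn)  # no index yet: table scan plus a sort step
conn.execute("CREATE INDEX sales_eml_invdat ON sales (email, invoice_date)")
plan_after = plan(conn)   # the planner can now search the composite index

latest_id = conn.execute(QUERY, ("a@example.com",)).fetchone()[0]
print("before:", plan_before)
print("after: ", plan_after)
print("latest sale id for a@example.com:", latest_id)
```

On MySQL the equivalent check would be to rerun EXPLAIN after creating the index and see whether the sales row still reports type ALL and Using filesort.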
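---- EDIT ----
Sorry, I forgot that MySQL is not as smart as other DBMSs.
It cannot detect this case on its own and has to be told explicitly how to handle it.
Please rewrite the subquery as: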
SELECT sales.id
FROM sales
WHERE customer.email = sales.email
ORDER BY sales.email DESC, sales.invoice_date DESC
LIMIT 1
This change lets MySQL use the ( email, invoice_date ) index and skip the filesort. Please give it a try.
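Adding sales.email to the ORDER BY should not change which row comes back, because every row the subquery considers for a given customer has the same email. A minimal sketch that checks this, again using Python's built-in sqlite3 module as a stand-in for MySQL (the sample data is invented and the customer table is reduced to three columns; it verifies that the results match, not which plan MySQL picks):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT, archive INTEGER);
CREATE TABLE sales (id INTEGER PRIMARY KEY, email TEXT, invoice_date TEXT);
CREATE INDEX sales_eml_invdat ON sales (email, invoice_date);
INSERT INTO customer (email, archive) VALUES
    ('a@example.com', 0), ('b@example.com', 0), ('c@example.com', 1);
INSERT INTO sales (email, invoice_date) VALUES
    ('a@example.com', '2013-01-05'),
    ('b@example.com', '2013-02-10'),
    ('a@example.com', '2013-03-01'),
    ('c@example.com', '2013-04-01');
""")

# Correlated subquery from the question, with a pluggable ORDER BY clause.
TEMPLATE = """
SELECT customer.id,
       (SELECT sales.id FROM sales
         WHERE customer.email = sales.email
         ORDER BY {order_by}
         LIMIT 1) AS last_sale_id
  FROM customer
 WHERE customer.archive = 0
 ORDER BY customer.id
"""

# Original ordering versus the rewritten ordering suggested above.
original = conn.execute(
    TEMPLATE.format(order_by="sales.invoice_date DESC")
).fetchall()
rewritten = conn.execute(
    TEMPLATE.format(order_by="sales.email DESC, sales.invoice_date DESC")
).fetchall()

print(original)
print(rewritten)
```

Both queries return the same (customer id, latest sale id) pairs, so the rewrite only affects how the database can satisfy the ordering, not the answer.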