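After migrating from a self-hosted PostgreSQL instance to Amazon RDS, we ran into some strange problems when querying a table with 12M rows.
Queries that used to run fine are now blocked with wait_event_type=LWLockTranche and wait_event=buffer_io (even the simplest queries, with no JOINs). All indexes and execution plans look fine. Apart from the query execution time, EXPLAIN ANALYZE shows nothing unusual.
Example query: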
explain (verbose, buffers, analyze) SELECT * FROM "products_product" WHERE ("products_product"."category_id" = 43);
Result from the self-hosted PostgreSQL:
Index Scan using products_product_b583a629 on public.products_product (cost=0.43..5256.40 rows=5667 width=1758) (actual time=24.372..298.822 rows=29342 loops=1)
Output: id, title, description, image_path, image_source_url, website_source, date_created, date_updated, afi_url, afi_price_currency, afi_recognize_id, afi_price, afi_old_price_currency, meta_link, afi_old_price, meta_published, meta_admin_note, afi_id, brand_id, category_id, retailer_id, afi_promotion, afi_stock, search_vector, original_category_id, search_vector_pl, title_pl, description_pl, owner_id
Index Cond: (products_product.category_id = 43)
Buffers: shared hit=71 read=22261
I/O Timings: read=233.266
Planning time: 0.271 ms
Execution time: 310.205 ms
And the result of the same query on Amazon RDS:
Index Scan using products_product_b583a629 on public.products_product (cost=0.43..27905.30 rows=30563 width=1753) (actual time=26.084..179652.029 rows=29342 loops=1)
Output: id, title, description, image_path, image_source_url, website_source, date_created, date_updated, afi_url, afi_price_currency, afi_recognize_id, afi_price, afi_old_price_currency, meta_link, afi_old_price, meta_published, meta_admin_note, afi_id, brand_id, category_id, retailer_id, afi_promotion, afi_stock, search_vector, original_category_id, search_vector_pl, title_pl, description_pl, owner_id
Index Cond: (products_product.category_id = 43)
Buffers: shared hit=2532 read=19856
Planning time: 0.093 ms
Execution time: 179665.121 ms
RDS: CPU usage is steady at 20-30%, DB connections 2-40, freeable memory 50% (3 GB), write IOPS 1-10, read IOPS 650-750, free storage 100 GB.
What could cause this difference? What else can we check?
Answer 0 (score: 1)
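See https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html
AWS caps IOPS depending on your storage type. With gp2, you get 3 IOPS per GB of storage. If the index is on an int column with 12M records, the index could be around 150 MB. At 700 IOPS, reading that takes a while even when no other session is running. If other sessions are also consuming IOPS, queries end up waiting on buffer_io.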
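To put rough numbers on this, here is a small back-of-the-envelope sketch in Python. It takes the `read=19856` page count and the execution time from the RDS EXPLAIN output above, and assumes the default 8 kB PostgreSQL page size and one I/O operation per page read. The gp2 floor of 100 IOPS and ceiling of 16000 IOPS are as I understand the AWS page linked above, so check them against current limits. The helper names and the 100 GiB volume size are made up for illustration; the question only states the free storage, not the allocated size.

```python
PAGE_SIZE_BYTES = 8192  # PostgreSQL default block size (assumed unchanged)


def gp2_baseline_iops(volume_gib: float) -> int:
    """Baseline gp2 IOPS: 3 per GiB, floored at 100 and capped at 16000.

    Burst credits are ignored here; this is only the sustained baseline.
    """
    return int(min(max(100, 3 * volume_gib), 16000))


def seconds_to_read(pages: int, iops: float) -> float:
    """Lower bound on read time if every page miss costs one I/O operation."""
    return pages / iops


pages_read = 19856               # "read=" from the RDS EXPLAIN output
elapsed_s = 179665.121 / 1000.0  # RDS execution time, ms -> s

print(f"data read from disk: {pages_read * PAGE_SIZE_BYTES / 1e6:.0f} MB")
print(f"time at 700 IOPS:    {seconds_to_read(pages_read, 700):.1f} s")
print(f"observed rate:       {pages_read / elapsed_s:.1f} pages/s")
# Hypothetical volume size, only to show how the baseline scales.
print(f"gp2 baseline, 100 GiB volume: {gp2_baseline_iops(100)} IOPS")
```

Under these assumptions the query needs roughly 28 seconds even with all 700 IOPS to itself, while the observed run works out to only about 110 pages per second. That would be consistent with other sessions sharing the IOPS budget, or with the volume falling back to its baseline rate, but the numbers alone cannot tell which.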