Deleting specific rows from a large table

Asked: 2016-11-23 09:26:58

Tags: postgresql

I need help deleting a large number of specific rows (around 10k user IDs) from a large partitioned table.

I ran the following query to analyze the execution plan with just 2 IDs:

explain analyze delete from scheme.users_daily
where user_id in (  -- user_id is not PK, can be duplicated
5791001,
7779001 
)

In reality I need to delete around 10k IDs, which I estimate is roughly 1-3% of all the data in the table.
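For that many IDs, one common pattern (a sketch, not from the original post; the table name `ids_to_delete` is hypothetical) is to stage the IDs in a temporary table and delete with a join rather than a 10k-element `IN` list:

```sql
-- Hypothetical sketch: stage the ~10k IDs in a temp table, then delete via a join.
CREATE TEMP TABLE ids_to_delete (user_id bigint PRIMARY KEY);

-- Populate it, e.g. with COPY or a multi-row INSERT:
INSERT INTO ids_to_delete (user_id) VALUES (5791001), (7779001); -- ... ~10k rows

DELETE FROM scheme.users_daily u
USING ids_to_delete d
WHERE u.user_id = d.user_id;

-- Refresh planner statistics on the parent and partitions afterwards.
ANALYZE scheme.users_daily;
```

With inheritance-based partitioning like this, the DELETE still has to probe each child partition's index, so splitting the IDs into smaller batches (e.g. 500-1000 per transaction) can keep locks and WAL bursts more manageable.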

I got the following result:

"Delete on users_daily  (cost=0.00..8573700.00 rows=3627 width=6) (actual time=10271.578..10271.578 rows=0 loops=1)"
"  ->  Seq Scan on users_daily  (cost=0.00..0.00 rows=1 width=6) (actual time=0.001..0.001 rows=0 loops=1)"
"        Filter: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
"  ->  Index Scan using users_daily_y2013_unq on users_daily_y2013  (cost=0.57..2351216.55 rows=1556 width=6) (actual time=2792.469..2792.469 rows=0 loops=1)"
"        Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
"  ->  Index Scan using users_daily_y2014_unq on users_daily_y2014  (cost=0.56..1745517.74 rows=694 width=6) (actual time=2158.849..2158.849 rows=0 loops=1)"
"        Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
"  ->  Index Scan using users_daily_y2015hy1_unq on users_daily_y2015hy1  (cost=0.56..1039034.28 rows=349 width=6) (actual time=1224.623..1224.623 rows=0 loops=1)"
"        Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
"  ->  Index Scan using users_daily_y2015hy2_unq on users_daily_y2015hy2  (cost=0.56..1159715.43 rows=375 width=6) (actual time=1380.513..1380.513 rows=0 loops=1)"
"        Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
"  ->  Index Scan using users_daily_y2016_unq on users_daily_y2016  (cost=0.57..2278216.00 rows=652 width=6) (actual time=2715.106..2715.106 rows=0 loops=1)"
"        Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
"Planning time: 0.364 ms"

Then I checked the number of rows in the table and its partitions:

explain analyze select count(id)
from scheme.users_daily

Response:

"Aggregate  (cost=9735664.99..9735665.00 rows=1 width=8) (actual time=1005691.857..1005691.858 rows=1 loops=1)"
"  ->  Append  (cost=0.00..9032279.19 rows=281354320 width=8) (actual time=1.967..797587.379 rows=268084881 loops=1)"
"        ->  Seq Scan on users_daily  (cost=0.00..0.00 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)"
"        ->  Seq Scan on users_daily_y2013  (cost=0.00..2489084.32 rows=77247432 width=8) (actual time=1.957..134821.451 rows=71962946 loops=1)"
"        ->  Seq Scan on users_daily_y2014  (cost=0.00..1848731.32 rows=57374432 width=8) (actual time=1.256..72915.919 rows=54835860 loops=1)"
"        ->  Seq Scan on users_daily_y2015hy1  (cost=0.00..1094813.48 rows=34156048 width=8) (actual time=0.735..40167.938 rows=32796038 loops=1)"
"        ->  Seq Scan on users_daily_y2015hy2  (cost=0.00..1219911.76 rows=38124076 width=8) (actual time=0.801..43994.913 rows=36669043 loops=1)"
"        ->  Seq Scan on users_daily_y2016  (cost=0.00..2379738.31 rows=74452331 width=8) (actual time=0.009..109012.004 rows=71820994 loops=1)"
"Planning time: 0.688 ms"
"Execution time: 1005691.977 ms"
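Counting all ~268 M rows took almost 17 minutes here. When an exact figure is not required, the planner's catalog statistics give a near-instant estimate instead (a sketch; accuracy depends on how recently ANALYZE or autovacuum last ran):

```sql
-- Approximate per-partition row counts from catalog statistics (no table scan).
SELECT c.relname, c.reltuples::bigint AS approx_rows
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'scheme'
  AND c.relname LIKE 'users_daily%'
ORDER BY c.relname;
```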

How can I solve this performance problem when deleting a large number of rows from a big table?

0 Answers