Why is sum(bigint) significantly faster than sum(integer) in PostgreSQL v10?

Asked: 2018-11-12 14:17:17

Tags: postgresql

create table t_num_type(
   n_bigint bigint,
   n_numeric numeric,
   n_int int
);

insert into t_num_type
select generate_series(1,10000000),
       generate_series(1,10000000),
       generate_series(1,10000000);

1) n_bigint

explain (analyze,buffers,format text)
select sum(n_bigint) from t_num_type;
Finalize Aggregate  (cost=116778.56..116778.57 rows=1 width=32) (actual time=1221.663..1221.664 rows=1 loops=1)
  Buffers: shared hit=23090
  ->  Gather  (cost=116778.34..116778.55 rows=2 width=32) (actual time=1221.592..1221.643 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=23090
        ->  Partial Aggregate  (cost=115778.34..115778.35 rows=1 width=32) (actual time=1217.558..1217.559 rows=1 loops=3)
              Buffers: shared hit=63695
              ->  Parallel Seq Scan on t_num_type  (cost=0.00..105361.67 rows=4166667 width=8) (actual time=0.021..747.748 rows=3333333 loops=3)
                    Buffers: shared hit=63695
Planning time: 0.265 ms
Execution time: 1237.360 ms

2) n_numeric

explain (analyze,buffers,format text)
select sum(n_numeric) from t_num_type;
Finalize Aggregate  (cost=116778.56..116778.57 rows=1 width=32) (actual time=1576.562..1576.562 rows=1 loops=1)
  Buffers: shared hit=22108
  ->  Gather  (cost=116778.34..116778.55 rows=2 width=32) (actual time=1576.502..1576.544 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=22108
        ->  Partial Aggregate  (cost=115778.34..115778.35 rows=1 width=32) (actual time=1572.446..1572.446 rows=1 loops=3)
              Buffers: shared hit=63695
              ->  Parallel Seq Scan on t_num_type  (cost=0.00..105361.67 rows=4166667 width=6) (actual time=0.028..781.808 rows=3333333 loops=3)
                    Buffers: shared hit=63695
Planning time: 0.157 ms
Execution time: 1592.559 ms

3) n_int

explain (analyze,buffers,format text)
select sum(n_int) from t_num_type;
Finalize Aggregate  (cost=116778.55..116778.56 rows=1 width=8) (actual time=1247.065..1247.065 rows=1 loops=1)
  Buffers: shared hit=23367
  ->  Gather  (cost=116778.33..116778.54 rows=2 width=8) (actual time=1247.006..1247.055 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=23367
        ->  Partial Aggregate  (cost=115778.33..115778.34 rows=1 width=8) (actual time=1242.524..1242.524 rows=1 loops=3)
              Buffers: shared hit=63695
              ->  Parallel Seq Scan on t_num_type  (cost=0.00..105361.67 rows=4166667 width=4) (actual time=0.028..786.940 rows=3333333 loops=3)
                    Buffers: shared hit=63695
Planning time: 0.196 ms
Execution time: 1263.352 ms

pg9.6:

abase=# \timing
Timing is on.
abase=# select sum(n_bigint) from t_num_type;
      sum       
----------------
 50000005000000
(1 row)

Time: 2042.587 ms
abase=# select sum(n_numeric) from t_num_type;
      sum       
----------------
 50000005000000
(1 row)

Time: 1874.880 ms
abase=# select sum(n_int) from t_num_type;
      sum       
----------------
 50000005000000
(1 row)

Time: 1073.567 ms

pg10.4:

postgres=# select sum(n_bigint) from t_num_type;
      sum       
----------------
 50000005000000
(1 row)

Time: 871.811 ms
postgres=# select sum(n_numeric) from t_num_type;
      sum       
----------------
 50000005000000
(1 row)

Time: 1168.779 ms (00:01.169)
postgres=# select sum(n_int) from t_num_type;
      sum       
----------------
 50000005000000
(1 row)

Time: 923.551 ms

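After repeated testing, sum is noticeably faster in pg10.4 than in pg9.6. Ranked from fastest to slowest: in pg9.6, sum(int) > sum(numeric) > sum(bigint); in pg10.4, sum(bigint) > sum(int) > sum(numeric).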

Why is sum(bigint) faster than sum(int) in pg10, even after repeated tests? Does this mean the bigint type is the recommended choice?

1 Answer:

Answer 0 (score: 0)

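First, you should repeat the experiment several times to see whether the difference stays the same. Caching and other effects cause a certain amount of fluctuation in query times.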
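One way to carry out the repetition, as a minimal psql sketch: turn on \timing, run each query once as a warm-up, then alternate the two queries a few times so that caching affects both about equally. The number of repetitions here is an arbitrary choice, not something taken from the question.

```sql
\timing on

-- Warm-up runs: leave these out of the comparison.
select sum(n_int) from t_num_type;
select sum(n_bigint) from t_num_type;

-- Alternate the two queries and compare the reported times.
select sum(n_int) from t_num_type;
select sum(n_bigint) from t_num_type;
select sum(n_int) from t_num_type;
select sum(n_bigint) from t_num_type;
```

If the ordering of the two timings flips between rounds, the gap is probably within the noise.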

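In the long run, I would expect the difference between integer and bigint to be negligible. Both kinds of addition should be implemented in hardware.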

numeric should be considerably slower, because operations on these binary-coded decimals are implemented in C in the database engine.

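If the bigint sum remains faster even in repeated experiments, my only explanation is tuple deforming: to get to the third column, PostgreSQL has to process the first two columns.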
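The tuple-deforming explanation can be checked with a rough experiment: copy the data into a table where n_int is the first column and compare the timings again. This is only a sketch; the table name t_num_type_reordered is hypothetical and does not come from the question.

```sql
-- Hypothetical copy of t_num_type with n_int moved to the first position.
create table t_num_type_reordered (
   n_int int,
   n_bigint bigint,
   n_numeric numeric
);

insert into t_num_type_reordered (n_int, n_bigint, n_numeric)
select n_int, n_bigint, n_numeric
from t_num_type;

-- Refresh statistics and the visibility map before timing anything.
vacuum analyze t_num_type_reordered;

explain (analyze, buffers)
select sum(n_int) from t_num_type_reordered;

explain (analyze, buffers)
select sum(n_bigint) from t_num_type_reordered;
```

If column position is what matters, sum(n_int) on the reordered table should be at least as fast as sum(n_bigint). If bigint is still faster, the cause presumably lies elsewhere.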