我有一个大表(6+百万行),我想添加一个自动递增的整数列sid
,其中sid
是基于{{1 }}。换句话说,基于ORDER BY inserted_at ASC
的最旧记录将被设置为inserted_at
,而最新记录将是总记录数。关于如何处理此问题的任何提示?
答案 0 :(得分:2)
添加一个sid
列和UPDATE SET ... FROM ... WHERE
:
UPDATE test
SET sid = t.rownum
FROM (SELECT id, row_number() OVER (ORDER BY inserted_at ASC) as rownum
FROM test) t
WHERE test.id = t.id
请注意,这依赖于主键id
。
(如果您的表还没有主键,则必须先创建一个。)
例如,
-- create test table
DROP TABLE IF EXISTS test;
CREATE TABLE test (
id int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
, foo text
, inserted_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO test (foo, inserted_at) VALUES
('XYZ', '2019-02-14 00:00:00-00')
, ('DEF', '2010-02-14 00:00:00-00')
, ('ABC', '2000-02-14 00:00:00-00');
-- +----+-----+------------------------+
-- | id | foo | inserted_at |
-- +----+-----+------------------------+
-- | 1 | XYZ | 2019-02-13 19:00:00-05 |
-- | 2 | DEF | 2010-02-13 19:00:00-05 |
-- | 3 | ABC | 2000-02-13 19:00:00-05 |
-- +----+-----+------------------------+
ALTER TABLE test ADD COLUMN sid INT;
UPDATE test
SET sid = t.rownum
FROM (SELECT id, row_number() OVER (ORDER BY inserted_at ASC) as rownum
FROM test) t
WHERE test.id = t.id
收益
+----+-----+------------------------+-----+
| id | foo | inserted_at | sid |
+----+-----+------------------------+-----+
| 3 | ABC | 2000-02-13 19:00:00-05 | 1 |
| 2 | DEF | 2010-02-13 19:00:00-05 | 2 |
| 1 | XYZ | 2019-02-13 19:00:00-05 | 3 |
+----+-----+------------------------+-----+
最后,将sid
序列为(or, better, an IDENTITY column):
ALTER TABLE test ALTER COLUMN sid SET NOT NULL;
-- IDENTITY fixes certain issue which may arise with SERIAL
ALTER TABLE test ALTER COLUMN sid ADD GENERATED BY DEFAULT AS IDENTITY;
-- ALTER TABLE test ALTER COLUMN sid SERIAL;