Using the result of one query to drive another query in Postgres

Date: 2018-11-24 19:32:41

Tags: postgresql

I'm trying to get the total size of each column in a table. This is basically a combination of these two queries:

SELECT pg_size_pretty(sum(pg_column_size(COLUMN_NAME))) FROM TABLE_NAME;

SELECT column_name FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'TABLE_NAME';

My first attempts were the following two queries:

 => SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME)  FROM information_schema.columns WHERE table_schema = 'public' AND table_name   = 'TABLE_NAME';
ERROR:  column "columns.column_name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_siz...
               ^
 => SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME)  FROM information_schema.columns WHERE table_schema = 'public' AND table_name   = 'TABLE_NAME' GROUP BY column_name;
ERROR:  more than one row returned by a subquery used as an expression

I also tried the following:

SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME)  FROM information_schema.columns WHERE table_schema = 'public' AND table_name   = 'TABLE_NAME' GROUP BY 1;

Which returns:

ERROR:  more than one row returned by a subquery used as an expression

When I add LIMIT 1, the results are incorrect:

SELECT column_name, 
   (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM main_apirequest LIMIT 1)
   FROM information_schema.columns 
   WHERE table_schema = 'public' AND table_name   = 'main_apirequest'
   GROUP BY 1;

The output looks like this:

   column_name    | pg_size_pretty
------------------+----------------
 api_key_id       | 11 bytes
 id               | 3 bytes
...

when it should look like this (which doesn't happen because of the LIMIT 1):

=> SELECT pg_size_pretty(sum(pg_column_size(id))) FROM main_apirequest;
 pg_size_pretty
----------------
 19 MB

1 answer:

Answer 0 (score: 1)

Since you don't know the column names in advance but want to use them in the query, you have to use dynamic SQL. Here's a simple example:

CREATE TABLE t1 (id INTEGER, txt TEXT);

INSERT INTO t1
SELECT g, random()::TEXT
FROM generate_series(1, 10) g;

Then the SQL to generate the query is:

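DO $$
DECLARE
        query TEXT;
BEGIN
        -- Build one pg_size_pretty(sum(pg_column_size(col))) expression per column.
        -- %I quotes the column name as an identifier in both places it is used.
        SELECT 'SELECT ' || STRING_AGG(FORMAT('pg_size_pretty(sum(pg_column_size(%1$I))) AS %1$I', column_name), ', ' ORDER BY ordinal_position) || ' FROM t1'
            INTO query
        FROM information_schema.columns
        WHERE table_schema = 'public'
        AND table_name = 't1';

        -- Only print the generated statement; nothing is executed yet.
        RAISE NOTICE '%', query;
END $$;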

The query it creates is SELECT pg_size_pretty(sum(pg_column_size(id))) AS id, pg_size_pretty(sum(pg_column_size(txt))) AS txt FROM t1

It works the same way if you have hundreds of columns.
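
A small sketch of why the names have to be spliced into the query text, using the t1 table from this answer. When column_name comes out of information_schema.columns it is a value, so pg_column_size measures the name string itself rather than the column's data. The 11 bytes and 3 bytes shown in the question look consistent with that (api_key_id has 10 characters, id has 2), though that reading is my assumption and not something the question confirms.

```sql
-- column_name is a value read from the catalog here,
-- so this reports the size of each name string, not of the data in t1.
SELECT column_name, pg_column_size(column_name) AS size_of_name
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 't1';

-- id is an identifier here, so this reports the size of the stored values.
SELECT pg_size_pretty(sum(pg_column_size(id))) AS id
FROM t1;
```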

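Getting it to generate the query, run it, and return the results really depends on what you need. If you're happy with just printing the result to the screen, you can format it like this instead: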

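DO $$
DECLARE
        query TEXT;
        result TEXT;
BEGIN
        -- %L embeds the "name: " label as a quoted string literal;
        -- %I quotes the column name as an identifier.
        SELECT 'SELECT CONCAT_WS(E''\n'', ' || STRING_AGG(FORMAT('%L || pg_size_pretty(sum(pg_column_size(%I)))', column_name::text || ': ', column_name), ', ' ORDER BY ordinal_position) || ') FROM t1'
            INTO query
        FROM information_schema.columns
        WHERE table_schema = 'public'
        AND table_name = 't1';

        -- Run the generated statement and capture its single text result.
        EXECUTE query
        INTO result;

        RAISE NOTICE '%', result;
END $$;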

Which prints:

id: 40 bytes
txt: 181 bytes
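
If you are working in psql specifically, another option is to let psql run the generated statement for you. The sketch below assumes psql 9.6 or later, where the \gexec meta-command sends the query and then executes each returned value as a new statement. It is a client-side feature, so it won't work from application code or other clients.

```sql
-- Build the statement as a single text value, then have psql execute it.
-- 'public' and 't1' are the example schema and table from this answer.
SELECT format('SELECT %s FROM %I.%I',
              string_agg(format('pg_size_pretty(sum(pg_column_size(%1$I))) AS %1$I',
                                column_name::text),
                         ', ' ORDER BY ordinal_position),
              'public', 't1')
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 't1'
\gexec
```

The result comes back as an ordinary row with one column per table column, since psql only learns the column list after the first query has run.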

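If you instead want to return a record with multiple columns, I'm not quite sure how you'd go about it, because the number of columns and their names are unknown. The best I can think of is to return it as JSON: then you return only one thing, and it has a variable number of fields named after whatever the columns are: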

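CREATE OR REPLACE FUNCTION test1(_schema_name TEXT, _table_name TEXT)
        RETURNS JSON AS
$$
DECLARE
        query TEXT;
        result JSON;
BEGIN
        -- Query the table named by the arguments instead of a hard-coded t1;
        -- %I quotes the schema, table, and column names as identifiers.
        SELECT 'SELECT ROW_TO_JSON(cols) FROM (SELECT ' || STRING_AGG(FORMAT('pg_size_pretty(sum(pg_column_size(%1$I))) AS %1$I', column_name), ', ' ORDER BY ordinal_position) || FORMAT(' FROM %I.%I) AS cols', _schema_name, _table_name)
            INTO query
        FROM information_schema.columns
        WHERE table_schema = _schema_name
        AND table_name = _table_name;

        -- Run the generated statement and capture the single JSON row.
        EXECUTE query
        INTO result;

        RETURN result;
END
$$
        LANGUAGE plpgsql;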

Run it: SELECT test1('public', 't1');

Returns: {"id":"40 bytes","txt":"181 bytes"}
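
If one row per column is closer to what the question was after, the JSON can be unpacked again. This is a minimal sketch that assumes the test1 function above has been created. json_each_text is a built-in that expands the top-level keys of a JSON object into (key, value) rows; the aliases column_name and total_size are just names picked for illustration.

```sql
-- Turn the single JSON object into one row per column of t1.
SELECT key AS column_name, value AS total_size
FROM json_each_text(test1('public', 't1'));
```

This keeps the result shape fixed at two columns no matter how many columns the table has.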