Is there a more efficient way to write this query?

Asked: 2015-04-09 14:31:58

Tags: sql database performance postgresql database-design

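To cut down on the number of queries I run against Postgres, I'd like to return as much data as possible in a single query. Here is a simple case showing my naive solution: I want to return all assets together with all of the attribute IDs belonging to each asset. An asset has many attributes, tied to it through attributes.asset_id, as in this query:
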
SELECT assets.id, ARRAY(
    SELECT attributes.id
    FROM attributes
    WHERE attributes.asset_id = assets.id
) as attributes
FROM assets
LIMIT 100;

Running it in Postgres returns a data set that looks like this:

id      attributes       
3017    "{8948,9386}"

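An ORM, by contrast, would have to run that inner query separately, which to my mind is less efficient because it hands the same work off to a different part of the application. Nested queries are bad, but at least this way I'm not making multiple, potentially expensive, database calls.

There is at least one big problem with this approach, though: returning a second column from the assets table roughly doubles the time. Returning only the id column takes 4893 ms, returning id and name takes 8819 ms, and returning three columns takes 10744 ms. It only gets worse with every column added to the query, so this is not a scalable solution. On top of that, I'd like to retrieve multiple relations per row, so there may be several of these subqueries per row, which will surely get even more expensive.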

Edit: As requested, I ran EXPLAIN ANALYZE against this query.

Including assets.id:

"Seq Scan on assets  (cost=0.00..1804514.69 rows=142204 width=4) (actual time=0.059..2306.204 rows=142178 loops=1)"
"  SubPlan 1"
"    ->  Bitmap Heap Scan on attributes  (cost=4.18..12.64 rows=4 width=4) (actual time=0.009..0.009 rows=0 loops=142178)"
"          Recheck Cond: (asset_id = assets.id)"
"          ->  Bitmap Index Scan on attributes_asset_id_idx  (cost=0.00..4.18 rows=4 width=0) (actual time=0.004..0.004 rows=0 loops=142178)"
"                Index Cond: (asset_id = assets.id)"
"Total runtime: 2674.115 ms"

Including assets.id and assets.name:

"Seq Scan on assets  (cost=0.00..1804514.69 rows=142204 width=20) (actual time=0.058..2330.947 rows=142178 loops=1)"
"  SubPlan 1"
"    ->  Bitmap Heap Scan on attributes  (cost=4.18..12.64 rows=4 width=4) (actual time=0.009..0.009 rows=0 loops=142178)"
"          Recheck Cond: (asset_id = assets.id)"
"          ->  Bitmap Index Scan on attributes_asset_id_idx  (cost=0.00..4.18 rows=4 width=0) (actual time=0.004..0.004 rows=0 loops=142178)"
"                Index Cond: (asset_id = assets.id)"
"Total runtime: 2693.455 ms"

Edit 2: As requested, here is a (lightly obfuscated) version of the relevant table schemas:

-- Table: ta_main.assets

-- DROP TABLE ta_main.assets;

CREATE TABLE ta_main.assets
(
  id serial NOT NULL,
  name text NOT NULL,
  field1 double precision NOT NULL DEFAULT 0,
  field2 double precision NOT NULL DEFAULT 0,
  field3 smallint NOT NULL DEFAULT 3,
  field4 date,
  field5 json,
  field6 text,
  field7 text,
  field8 text,
  field9 date,
  field10 date,
  field11 date,
  field12 text,
  field13 boolean NOT NULL DEFAULT false,
  field14 boolean NOT NULL DEFAULT false,
  field15 boolean NOT NULL DEFAULT false,
  field16 boolean NOT NULL DEFAULT false,
  field17 boolean NOT NULL DEFAULT false,
  field18 boolean NOT NULL DEFAULT false,
  field19 double precision NOT NULL DEFAULT 0,
  field20 date,
  field21 date,
  field22 date,
  field23 text,
  field24 double precision NOT NULL DEFAULT 0,
  field25 double precision NOT NULL DEFAULT 0,
  field26 integer NOT NULL DEFAULT 0,
  field27 integer NOT NULL DEFAULT 0,
  field28 double precision NOT NULL DEFAULT 0,
  field29 boolean NOT NULL DEFAULT false,
  field30 integer NOT NULL DEFAULT 0,
  field31 double precision NOT NULL DEFAULT 0,
  field32 double precision NOT NULL DEFAULT 0,
  field33 double precision NOT NULL DEFAULT 0,
  field34 double precision NOT NULL DEFAULT 0,
  field35 date,
  field36 integer NOT NULL DEFAULT 0,
  field37 integer NOT NULL DEFAULT 0,
  field38 integer NOT NULL DEFAULT 0,
  field39 integer NOT NULL DEFAULT 0,
  field40 json,
  field41 double precision NOT NULL DEFAULT 0,
  field42 boolean NOT NULL DEFAULT false,
  field43 double precision NOT NULL DEFAULT 0,
  field44 double precision NOT NULL DEFAULT 0,
  field46 double precision NOT NULL DEFAULT 0,
  field47 double precision NOT NULL DEFAULT 0,
  field48 text,
  field49 date,
  field50 text,
  field51 text,
  field52 date,
  field53 double precision NOT NULL DEFAULT 0,
  field54 integer,
  field55 boolean NOT NULL DEFAULT false,
  field56 boolean NOT NULL DEFAULT true,
  created timestamp with time zone NOT NULL DEFAULT now(),
  updated timestamp with time zone,
  _deleted boolean NOT NULL DEFAULT false,
  CONSTRAINT asset_pkey PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE ta_main.assets
  OWNER TO postgres;

-- Index: ta_main.assets_deleted_idx

-- DROP INDEX ta_main.assets_deleted_idx;

CREATE INDEX assets_deleted_idx
  ON ta_main.assets
  USING btree
  (_deleted);


-- Trigger: update_timestamp_assets on ta_main.assets

-- DROP TRIGGER update_timestamp_assets ON ta_main.assets;

CREATE TRIGGER update_timestamp_assets
  BEFORE UPDATE
  ON ta_main.assets
  FOR EACH ROW
  EXECUTE PROCEDURE global.update_timestamp();

The assets table is above; the attributes table follows:

  -- Table: ta_main.attributes

  -- DROP TABLE ta_main.attributes;

  CREATE TABLE ta_main.attributes
  (
    id serial NOT NULL,
    asset_id integer NOT NULL,
    related_id integer,
    type smallint NOT NULL,
    description text,
    flag smallint NOT NULL DEFAULT 0,
    value double precision NOT NULL DEFAULT 0,
    nbv double precision NOT NULL DEFAULT 0,
    acc double precision NOT NULL DEFAULT 0,
    eul integer NOT NULL DEFAULT 0,
    quantity double precision NOT NULL DEFAULT 0,
    quantity_extra double precision NOT NULL DEFAULT 0,
    added_by text,
    is_import boolean NOT NULL DEFAULT false,
    is_donated boolean NOT NULL DEFAULT false,
    created timestamp with time zone NOT NULL DEFAULT now(),
    updated timestamp with time zone,
    _deleted boolean NOT NULL DEFAULT false,
    CONSTRAINT adjustment_pkey PRIMARY KEY (id),
    CONSTRAINT asset_fkey FOREIGN KEY (asset_id)
        REFERENCES ta_main.assets (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE CASCADE,
    CONSTRAINT related_fkey FOREIGN KEY (related_id)
        REFERENCES ta_main.attributes (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
  )
  WITH (
    OIDS=FALSE
  );
  ALTER TABLE ta_main.attributes
    OWNER TO postgres;

  -- Index: ta_main.attributes_deleted_idx

  -- DROP INDEX ta_main.attributes_deleted_idx;

  CREATE INDEX attributes_deleted_idx
    ON ta_main.attributes
    USING btree
    (_deleted);

  -- Index: ta_main.fki_asset_fkey

  -- DROP INDEX ta_main.fki_asset_fkey;

  CREATE INDEX fki_asset_fkey
    ON ta_main.attributes
    USING btree
    (asset_id);

  -- Index: ta_main.fki_related_fkey

  -- DROP INDEX ta_main.fki_related_fkey;

  CREATE INDEX fki_related_fkey
    ON ta_main.attributes
    USING btree
    (related_id);


  -- Trigger: update_timestamp_attributes on ta_main.attributes

  -- DROP TRIGGER update_timestamp_attributes ON ta_main.attributes;

  CREATE TRIGGER update_timestamp_attributes
    BEFORE UPDATE
    ON ta_main.attributes
    FOR EACH ROW
    EXECUTE PROCEDURE global.update_timestamp();

1 answer:

Answer 0 (score: 0)

Could you give this a try? It appears to eliminate the inner query.

SELECT assets.id, ARRAY_AGG(attributes.id)
FROM assets
LEFT OUTER JOIN attributes ON assets.id = attributes.asset_id
GROUP BY assets.id;
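
One caveat with a plain LEFT JOIN plus ARRAY_AGG is that an asset with no attributes comes back as {NULL} rather than an empty array, because the aggregate collects the NULL row produced by the outer join. A possible variation, offered only as a sketch, aggregates the attribute ids in a derived table first and falls back to an empty array with COALESCE. The table and column names are taken from the question; the aliases a, agg, and attr_ids are made up for illustration, and whether this is actually faster on the real data would need to be checked with EXPLAIN ANALYZE.

```sql
-- Sketch only: aggregate attribute ids once per asset, then join the result.
-- Table and column names come from the question; the aliases a, agg and
-- attr_ids are hypothetical.
SELECT a.id,
       a.name,
       -- Assets without attributes get an empty array instead of NULL.
       COALESCE(agg.attr_ids, '{}'::integer[]) AS attributes
FROM assets AS a
LEFT JOIN (
    -- One row per asset_id, holding all of its attribute ids.
    SELECT asset_id, ARRAY_AGG(id) AS attr_ids
    FROM attributes
    GROUP BY asset_id
) AS agg ON agg.asset_id = a.id
ORDER BY a.id
LIMIT 100;
```

Further assets columns can be added to the outer select list without a GROUP BY, since the grouping happens inside the derived table. Note that the derived table is likely to aggregate the whole attributes table before the LIMIT is applied, which may or may not suit a 100-row page.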