Question

Given these tables:

CREATE TABLE Stores (
 store_id INT,
 store_name VARCHAR
 -- etc.
);

CREATE TABLE Employees (
 employee_id INT,
 store_id INT,
 employee_name VARCHAR,
 currently_employed BOOLEAN
 -- etc.
);

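I want to list the 15 longest-employed employees for each store (say, the 15 with the lowest employee_id), or all employees of a store if fewer than 15 have currently_employed='t'. I want to do it with a join clause.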

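I have found many examples of people doing this for only 1 row, typically a min or max (the single longest-employed employee), but I basically want to combine a LIMIT and an ORDER BY inside the join. Some of those examples can be found here: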

I have also found decent examples of doing this store by store (which I can't do, as I have about 5000 stores):

Get top n records for each group of grouped results

I have also seen that some databases let you use TOP instead of ORDER BY and LIMIT, but not PostgreSQL.

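I reckon a join clause between the two tables is not the only way to do this (or even necessarily the best one). If it is possible to work per distinct store_id inside the employees table alone, I am open to other approaches. I can always join afterwards.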

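As I am very new to SQL, I would appreciate any theoretical background or additional explanation that helps me understand the principles at work.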

Answer 1

`row_number()`

The general solution for getting the top n rows per group is the window function row_number():

SELECT *
FROM  (
   SELECT *, row_number() OVER (PARTITION BY store_id ORDER BY employee_id) AS rn
   FROM   employees
   WHERE  currently_employed
   ) e
JOIN   stores s USING (store_id)
WHERE  rn <= 15
ORDER  BY store_id, e.rn;

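PARTITION BY should use store_id, which is guaranteed to be unique (as opposed to store_name).
Identify the rows in employees first, then join to stores; that is cheaper.
To get at most 15 rows per store, use row_number(), not rank() (which would be the wrong tool for this purpose). As long as employee_id is unique, the difference does not show.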
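
The difference between the two functions only shows once the sort key has ties. A minimal sketch, using made-up inline sample data and a hypothetical hired_on column (neither is part of the question's schema):

```sql
-- Hypothetical sample data: Ann and Bob share the same sort key (hired_on).
SELECT store_id, employee_name, hired_on
     , row_number() OVER w AS rn   -- consecutive numbers 1, 2, 3, ... within each store
     , rank()       OVER w AS rk   -- tied rows share a rank; the following rank is skipped
FROM  (
   VALUES
     (1, 'Ann', date '2010-01-01')
   , (1, 'Bob', date '2010-01-01')  -- tie with Ann
   , (1, 'Cid', date '2011-06-01')
   ) t(store_id, employee_name, hired_on)
WINDOW w AS (PARTITION BY store_id ORDER BY hired_on);
```

Here the two tied rows get rn 1 and 2 (in arbitrary order) but both get rk 1, so a filter like rk <= 1 would return two rows while rn <= 1 returns exactly one. With a unique sort key such as employee_id, both functions number the rows identically.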

`LATERAL`

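An alternative for Postgres 9.3+ that typically performs better in combination with a matching index, especially when retrieving a small selection from a big table. See: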

What is the difference between LATERAL and a subquery in PostgreSQL?

SELECT s.store_name, e.*
FROM   stores s
, LATERAL (
   SELECT *  -- or just needed columns
   FROM   employees
   WHERE  store_id = s.store_id
   AND    currently_employed
   ORDER  BY employee_id
   LIMIT  15
   ) e
-- WHERE ... possibly select only a few stores
ORDER  BY s.store_name, e.store_id, e.employee_id;

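The perfect index would be a partial multicolumn index like this:

CREATE INDEX ON employees (store_id, employee_id) WHERE  currently_employed;

The specifics depend on details missing from the question. A related example: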

Create unique constraint with null columns
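
Whether the planner actually uses an index like the one above can be checked by inspecting the query plan. A rough sketch, assuming the stores and employees tables from the question exist and hold a realistic amount of data:

```sql
-- Assumes the stores / employees tables from the question exist and are populated.
-- EXPLAIN ANALYZE really executes the query, so try it on test data first.
EXPLAIN (ANALYZE, BUFFERS)
SELECT s.store_name, e.employee_id, e.employee_name
FROM   stores s
CROSS  JOIN LATERAL (
   SELECT employee_id, employee_name
   FROM   employees
   WHERE  store_id = s.store_id
   AND    currently_employed
   ORDER  BY employee_id
   LIMIT  15
   ) e;
```

If the index is a good match, the plan would typically show an index scan on employees inside the nested loop instead of a sort over the whole table. On very small tables the planner may still prefer a sequential scan, so the result says little unless the data volume resembles production.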

Both versions exclude stores without any current employees. There are ways around this if you need it ...
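
One possible way to keep such stores in the result is an outer join. The following is a sketch under the assumption that the tables and columns are exactly those shown in the question; it is not necessarily what the answer had in mind:

```sql
-- LATERAL variant: LEFT JOIN ... ON true keeps stores whose subquery returns no rows.
SELECT s.store_id, s.store_name, e.employee_id, e.employee_name
FROM   stores s
LEFT   JOIN LATERAL (
   SELECT employee_id, employee_name
   FROM   employees
   WHERE  store_id = s.store_id
   AND    currently_employed
   ORDER  BY employee_id
   LIMIT  15
   ) e ON true
ORDER  BY s.store_id, e.employee_id;

-- row_number() variant: start from stores and put the rn filter into the join
-- condition, so it cannot discard the NULL-extended rows of empty stores.
SELECT s.store_id, s.store_name, e.employee_id, e.employee_name
FROM   stores s
LEFT   JOIN (
   SELECT *, row_number() OVER (PARTITION BY store_id ORDER BY employee_id) AS rn
   FROM   employees
   WHERE  currently_employed
   ) e ON e.store_id = s.store_id AND e.rn <= 15
ORDER  BY s.store_id, e.rn;
```

In both queries a store without current employees appears once, with NULL in the employee columns. Note that in the second query the rn condition has to stay in the ON clause: moved to WHERE, it would filter those NULL rows out again.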

Answer 2

The classic way of doing this is with a window function, such as rank:

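SELECT employee_name, store_name
FROM   (SELECT e.employee_name, s.store_name,
        -- rank within each store; partition by the unique store_id rather than store_name
        RANK() OVER (PARTITION BY s.store_id ORDER BY e.employee_id ASC) AS rk
        FROM   employees e
        JOIN   stores s ON e.store_id = s.store_id
        WHERE  e.currently_employed) t  -- only current employees, as the question asks
WHERE  rk <= 15;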

Limit number of rows per group from join (not to 1 row)
