Select the first X records of each group, or a default

Time: 2018-10-29 13:41:53

Tags: mysql optimization greatest-n-per-group

I have the following schema:

users:

id email
1  'user.one@test.com'
2  'user.two@test.com'

video_groups:

id title
1  'Group 1'
2  'Group 2'

videos:

id group_id rank title
1  1        1    'Group 1 - Video 1'
2  1        2    'Group 1 - Video 2'
3  2        1    'Group 2 - Video 1'

user_video_play_times:

video_id user_id time last_update
2        1       12   01-02-2018
1        1       120  01-01-2018
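
For anyone who wants to reproduce the problem locally, the tables above could be set up roughly as follows. This is only a sketch: the column types, keys and NOT NULL constraints are assumptions (the question does not show the DDL), the groups table is named video_groups as in the queries, and the dates are read as MM-DD-YYYY (reading them as DD-MM-YYYY would still leave video 2 as the most recently played one).

```sql
-- Hypothetical DDL: types, keys and constraints are assumptions, not taken from the question.
CREATE TABLE users (
  id    INT PRIMARY KEY,
  email VARCHAR(255) NOT NULL
);

CREATE TABLE video_groups (
  id    INT PRIMARY KEY,
  title VARCHAR(255) NOT NULL
);

CREATE TABLE videos (
  id       INT PRIMARY KEY,
  group_id INT NOT NULL,
  `rank`   INT NOT NULL,            -- quoted because RANK is reserved in newer MySQL versions
  title    VARCHAR(255) NOT NULL
);

CREATE TABLE user_video_play_times (
  video_id    INT NOT NULL,
  user_id     INT NOT NULL,
  `time`      INT NOT NULL,
  last_update DATE NOT NULL,
  PRIMARY KEY (video_id, user_id)
);

-- Sample rows copied from the tables above.
INSERT INTO users (id, email) VALUES
  (1, 'user.one@test.com'),
  (2, 'user.two@test.com');

INSERT INTO video_groups (id, title) VALUES
  (1, 'Group 1'),
  (2, 'Group 2');

INSERT INTO videos (id, group_id, `rank`, title) VALUES
  (1, 1, 1, 'Group 1 - Video 1'),
  (2, 1, 2, 'Group 1 - Video 2'),
  (3, 2, 1, 'Group 2 - Video 1');

-- Dates assumed to be MM-DD-YYYY in the question.
INSERT INTO user_video_play_times (video_id, user_id, `time`, last_update) VALUES
  (2, 1, 12,  '2018-01-02'),
  (1, 1, 120, '2018-01-01');
```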

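I need to get the time, user_id, video_id and group_id of the last video a user played in specific groups, but if there is no record in user_video_play_times (for a group), the lowest-ranked video should be returned. For example: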

user_id group_id video_id time
1       1        2        12    -- user.one + group 1
1       2        3        0     -- user one + group 2

Here is my query so far:

SELECT
   pt.user_id user_id,
   v.id       video_id,
   g.id       group_id,
   pt.time    time
FROM
   videos v
   INNER JOIN video_groups g ON g.id = v.group_id
   LEFT JOIN user_video_play_times pt ON 
      pt.video_id = v.id AND 
      pt.user_id = 1
   LEFT JOIN (
      SELECT 
         g.id AS g_id,
         MAX(pt.last_update) AS pt_last_update
      FROM
         user_video_play_times pt
         INNER JOIN videos v ON v.id = pt.video_id
         INNER JOIN video_groups g ON g.id = v.group_id
      WHERE
         pt.user_id = 1 AND
         g.id IN (1, 2)
      GROUP BY
         g.id
   ) lpt ON lpt.g_id = g.id AND lpt.pt_last_update = pt.last_update
WHERE
   g.id IN (1, 2)
GROUP BY
   g.id

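This works, but...

  1. Adding v.title to the column selection messes up the results for some reason, so that everything returns only the rank 1 video.
  2. Can this query be optimized, or is there another, slicker way to achieve the same result?

Any help with this is greatly appreciated!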

DB Fiddle here

Update 1:

This problem seems to occur only when the column is of type text.

1 answer:

Answer 0 (score: 1)

Since your db<>fiddle is for MariaDB version 10.3, I assume you have Window Functions available.

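Based on the rules you defined, we can use the ROW_NUMBER() function over a partition of group_id to get row numbers. Within each group, the video with the latest last_update gets row number 1, and so on. Unplayed videos have a NULL last_update, which sorts last in descending order, so if no video in a group was played, the tie is broken by rank and the lowest-ranked video gets row number = 1.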
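
To see how the numbering falls out before any filtering, a minimal sketch like the following can be run against the sample data. It assumes MariaDB 10.2+ or MySQL 8.0+ (for window functions) and the table and column names from the question; it is only meant for inspection, not as the final query.

```sql
-- Show every video of groups 1 and 2 with its row number for user 1.
SELECT
  v.group_id,
  v.id        AS video_id,
  pt.last_update,
  v.`rank`,
  ROW_NUMBER() OVER (PARTITION BY v.group_id
                     ORDER BY pt.last_update DESC,  -- most recently played first, NULLs last
                              v.`rank` ASC)         -- tie-breaker for unplayed videos
              AS row_num
FROM videos AS v
LEFT JOIN user_video_play_times AS pt
  ON pt.video_id = v.id
 AND pt.user_id = 1
WHERE v.group_id IN (1, 2)
ORDER BY v.group_id, row_num;
```

With the sample rows from the question, group 1 lists video 2 with row_num 1 and video 1 with row_num 2, while group 2 lists its only video (3) with row_num 1 and a NULL last_update.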

We can use this result set as a derived table and keep only the rows where the row number = 1.

SELECT 
  dt.user_id, 
  dt.group_id, 
  dt.video_id, 
  dt.video_title, 
  dt.time 
FROM 
(
  SELECT
     pt.user_id AS user_id,
     g.id       AS group_id,
     v.id       AS video_id,
     v.title    AS video_title,  
     pt.time    AS time,  
     ROW_NUMBER() OVER(PARTITION BY v.group_id 
                       ORDER BY pt.last_update DESC, 
                                v.`rank` ASC) AS row_num 
  FROM videos AS v
  INNER JOIN video_groups AS g 
    ON g.id = v.group_id AND 
       g.id IN (1,2) 
  LEFT JOIN user_video_play_times AS pt 
    ON pt.video_id = v.id AND 
       pt.user_id = 1 
) AS dt 
WHERE dt.row_num = 1

View on DB Fiddle

Result:

| user_id | group_id | video_id | video_title       | time |
| ------- | -------- | -------- | ----------------- | ---- |
| 1       | 1        | 2        | Group 1 - Video 2 | 12   |
|         | 2        | 3        | Group 2 - Video 1 |      |
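
Note that the row for group 2 has an empty user_id and time, whereas the expected output in the question shows user_id 1 and time 0 there. If that exact shape is needed, one possible variation (a sketch, not part of the query above) is to select the user id as a literal and wrap the time in COALESCE(). The literal 1 is the same user id already used in the join condition.

```sql
SELECT
  1                    AS user_id,      -- the user being queried, same literal as in the join
  dt.group_id,
  dt.video_id,
  dt.video_title,
  COALESCE(dt.`time`, 0) AS `time`      -- 0 when the user has no play record in the group
FROM
(
  SELECT
     v.group_id AS group_id,
     v.id       AS video_id,
     v.title    AS video_title,
     pt.`time`  AS `time`,
     ROW_NUMBER() OVER (PARTITION BY v.group_id
                        ORDER BY pt.last_update DESC,
                                 v.`rank` ASC) AS row_num
  FROM videos AS v
  LEFT JOIN user_video_play_times AS pt
    ON pt.video_id = v.id AND
       pt.user_id = 1
  WHERE v.group_id IN (1, 2)
) AS dt
WHERE dt.row_num = 1;
```

With the sample data this should return video 2 with time 12 for group 1 and video 3 with time 0 for group 2, both with user_id 1.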

PS: Note that Rank is a Reserved Keyword (in MySQL 8.0.2 and later), so you should avoid using it as a column/table name.
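
As a rough illustration of the two usual ways to deal with this: either keep the name and quote it with backticks everywhere, or rename the column. In the sketch below, sort_order is a hypothetical replacement name and INT NOT NULL is an assumed column type, since CHANGE COLUMN requires the full column definition to be restated.

```sql
-- Option 1: keep the column name and always quote it.
SELECT id, `rank`
FROM videos
ORDER BY `rank`;

-- Option 2: rename the column (sort_order and INT NOT NULL are assumptions).
ALTER TABLE videos
  CHANGE COLUMN `rank` sort_order INT NOT NULL;
```

After renaming, every query that refers to the old column name has to be updated accordingly.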