SQL: union or self join

Time: 2017-03-25 19:07:35

Tags: mysql sql

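I have a simple table: user(id, date, task)

The task column contains either "download" or "upload".

I want to find out how many users perform each action per day.

Output: date, number of users who downloaded, number of users who uploaded

I first ran into trouble using a subquery inside the aggregate count function in the select, so I figured I should use a self join here to break out the data in the "task" column.

I thought I could build a table for each case and then combine and count them, but I couldn't get it to work:

SELECT id, date, task AS task_download FROM user WHERE task = 'download'

SELECT id, date, task AS task_upload FROM user WHERE task = 'upload'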

6 Answers:

Answer 0: (score: 4)

select  `date`, 
COUNT( distinct CASE WHEN task = 'download' then id end ) 'download', 
COUNT( distinct CASE WHEN task = 'upload' then id end ) 'upload'
from user
group by  `date`

Answer 1: (score: 2)

I'd say neither. A query like this is all it takes:

select `date`, 
    count(distinct case when task = 'download' then id else null end) as downloads, 
    count(distinct case when task = 'upload' then id else null end) as uploads
from user
where  task in ('download', 'upload')
group by `date`

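This assumes that date is a column holding only the date part, not a full timestamp, and that id is the user ID. You can use the distinct keyword inside an aggregate function, which is what I did here.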
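As a quick sanity check of the DISTINCT-inside-COUNT behaviour, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The sample rows are made up, and SQLite stands in for MySQL only because it needs no server; this particular query uses no MySQL-specific syntax.

```python
import sqlite3

# Hypothetical sample data: (id, date, task).
# User 1 downloads twice on the same day and should be counted once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, date TEXT, task TEXT)")
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, "2017-03-24", "download"),
        (1, "2017-03-24", "download"),
        (2, "2017-03-24", "download"),
        (2, "2017-03-24", "upload"),
        (3, "2017-03-25", "upload"),
    ],
)

query = """
SELECT date,
       COUNT(DISTINCT CASE WHEN task = 'download' THEN id END) AS downloads,
       COUNT(DISTINCT CASE WHEN task = 'upload'   THEN id END) AS uploads
FROM user
WHERE task IN ('download', 'upload')
GROUP BY date
ORDER BY date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

User 1 is counted once for 2017-03-24 despite two downloads, and a day with no downloads gets 0, because COUNT ignores the NULLs that CASE produces for non-matching rows.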

To make this query run fast, I'd suggest an index on (task, date).
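For reference, a composite index on those two columns could be created roughly like this. The index name idx_user_task_date is made up, and the sketch runs against SQLite from Python only to show the statement is valid; the CREATE INDEX statement has the same form in MySQL. Whether it actually speeds the query up depends on the data and the optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, date TEXT, task TEXT)")

# Hypothetical index name; the column order (task, date) follows the suggestion above.
conn.execute("CREATE INDEX idx_user_task_date ON user (task, date)")

# List the indexes that now exist on the table.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'user'"
    )
]
print(index_names)
```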

However, if date holds a full timestamp (i.e. it includes a time part), you'll want to group differently:

-- select the same day expression that is grouped on, not the raw timestamp
select date(`date`) as `day`, 
    count(distinct case when task = 'download' then id else null end) as downloads, 
    count(distinct case when task = 'upload' then id else null end) as uploads
from user
where  task in ('download', 'upload')
group by date(`date`)
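DATE() is also available in SQLite, so the timestamp case can be tried out the same way from Python. The sample rows are invented. Note that this sketch selects DATE(date) as well as grouping on it, so each output row is labelled with the day rather than with some arbitrary timestamp from the group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, date TEXT, task TEXT)")
# Hypothetical rows with full timestamps; user 1 downloads twice on the same day.
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, "2017-03-24 08:15:00", "download"),
        (1, "2017-03-24 21:40:00", "download"),
        (2, "2017-03-24 09:00:00", "upload"),
        (2, "2017-03-25 10:30:00", "download"),
    ],
)

rows = conn.execute(
    """
    SELECT DATE(date) AS day,
           COUNT(DISTINCT CASE WHEN task = 'download' THEN id END) AS downloads,
           COUNT(DISTINCT CASE WHEN task = 'upload'   THEN id END) AS uploads
    FROM user
    WHERE task IN ('download', 'upload')
    GROUP BY DATE(date)
    ORDER BY day
    """
).fetchall()
for row in rows:
    print(row)
```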

Answer 2: (score: 1)

You can do this with subqueries, for example:

SELECT `date` AS `day`,
(SELECT COUNT(*) FROM activity WHERE date = day AND activity = 'upload') AS upload_count,
(SELECT COUNT(*) FROM activity WHERE date = day AND activity = 'download') AS download_count
FROM activity
GROUP BY date;

Here is the SQL Fiddle

Answer 3: (score: 1)

First count the distinct users per date and task, then sum those per-task counts for each date.

select date,
       sum(case when task = 'upload' then num_users else 0 end) as "upload",
       sum(case when task = 'download' then num_users else 0 end) as "download"
from  (       
       select   date, task, count(distinct id) num_users
       from     usert
       group by date, task
      ) x
group by date
;

See it here: http://rextester.com/ZACFB64945
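The two-step shape (aggregate per date and task first, then pivot the tasks into columns) can be sketched against SQLite from Python as follows. The sample rows are made up, and the table is called user as in the question rather than usert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, date TEXT, task TEXT)")
# Hypothetical rows; user 1 downloads twice on 2017-03-24.
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, "2017-03-24", "download"),
        (1, "2017-03-24", "download"),
        (2, "2017-03-24", "download"),
        (2, "2017-03-24", "upload"),
        (3, "2017-03-25", "upload"),
    ],
)

rows = conn.execute(
    """
    SELECT date,
           SUM(CASE WHEN task = 'upload'   THEN num_users ELSE 0 END) AS upload,
           SUM(CASE WHEN task = 'download' THEN num_users ELSE 0 END) AS download
    FROM (
        -- step 1: one row per (date, task) with its distinct user count
        SELECT date, task, COUNT(DISTINCT id) AS num_users
        FROM user
        GROUP BY date, task
    ) x
    -- step 2: collapse the per-task rows into one row per date
    GROUP BY date
    ORDER BY date
    """
).fetchall()
for row in rows:
    print(row)
```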

Answer 4: (score: 1)

If you want distinct users, then count(distinct) is the way to go:

SELECT date, 
       COUNT(DISTINCT CASE WHEN task = 'upload' THEN userid END) as uploads,
       COUNT(DISTINCT CASE WHEN task = 'download' THEN userid END) as downloads
FROM user
GROUP BY date
ORDER BY date;

If you want to count the actions themselves rather than distinct users, you can do the following:

SELECT date, 
       SUM( (task = 'upload')::int ) as uploads,
       SUM( (task = 'download')::int) as downloads
FROM user
GROUP BY date
ORDER BY date;

This uses a convenient Postgres shorthand for counting boolean expressions.
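The ::int cast is Postgres syntax. In MySQL (the tag on this question), and in the SQLite used for this sketch, a comparison already evaluates to 1 or 0, so it can be summed directly without a cast. The example below, with invented rows and column aliases, puts the distinct-user count next to the per-action totals to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, date TEXT, task TEXT)")
# Hypothetical rows; user 1 uploads twice on 2017-03-24.
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, "2017-03-24", "upload"),
        (1, "2017-03-24", "upload"),
        (2, "2017-03-24", "download"),
        (3, "2017-03-25", "upload"),
    ],
)

rows = conn.execute(
    """
    SELECT date,
           COUNT(DISTINCT CASE WHEN task = 'upload' THEN id END) AS upload_users,
           SUM(task = 'upload')   AS uploads,    -- comparison yields 1 or 0
           SUM(task = 'download') AS downloads
    FROM user
    GROUP BY date
    ORDER BY date
    """
).fetchall()
for row in rows:
    print(row)
```

On 2017-03-24 there is one uploading user but two uploads, which is exactly the difference between the two queries above.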

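Answer 5: (score: 0)

I'd use conditional aggregation.

To get the number of users who performed at least one upload on a given date (incrementing the count only once per user for that date, even if that user performed more than one upload), we can use a COUNT(DISTINCT user) expression.

To get the total number of uploads, we can use COUNT or SUM.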

SELECT DATE(t.date) AS `date`
     , COUNT(DISTINCT IF(t.task='upload'  ,t.user,NULL)) AS cnt_users_who_uploaded
     , COUNT(DISTINCT IF(t.task='download',t.user,NULL)) AS cnt_users_who_downloaded
     , SUM(IF(t.task='upload'  ,1,0))                    AS cnt_uploads
     , SUM(IF(t.task='download',1,0))                    AS cnt_downloads
  FROM user t
 GROUP BY DATE(t.date)
 ORDER BY DATE(t.date)

Note: this will not return zero counts for dates that do not appear in any row of the table.
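If zero rows for the missing dates are needed, one common workaround is to left join from a calendar of days. The sketch below is only an illustration under several assumptions: it runs on SQLite from Python, the calendar is built with a recursive CTE over a hard-coded, made-up date range, the DATE(day, '+1 day') modifier is SQLite-specific (MySQL would need something like DATE_ADD, and a version that supports recursive CTEs), and it uses the question's id column rather than user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, date TEXT, task TEXT)")
# Hypothetical rows; nothing happens on 2017-03-25.
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [
        (1, "2017-03-24", "upload"),
        (2, "2017-03-26", "download"),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE calendar(day) AS (
        SELECT '2017-03-24'                      -- assumed start of the range
        UNION ALL
        SELECT DATE(day, '+1 day') FROM calendar
        WHERE day < '2017-03-26'                 -- assumed end of the range
    )
    SELECT c.day,
           COUNT(DISTINCT CASE WHEN u.task = 'upload'   THEN u.id END) AS uploads,
           COUNT(DISTINCT CASE WHEN u.task = 'download' THEN u.id END) AS downloads
    FROM calendar c
    LEFT JOIN user u ON DATE(u.date) = c.day     -- days without rows still survive
    GROUP BY c.day
    ORDER BY c.day
    """
).fetchall()
for row in rows:
    print(row)
```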