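TIMESTAMP / DATETIME: counting rows grouped by month

Time: 2019-04-26 12:16:17

Tags: sql google-bigquery

I am working with a dataset that contains data about taxi trips.

These are the JSON files with the data I used to create the tables: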

  1. 2009

  2. 2010

  3. 2011

  4. 2012

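They are also available on this Google Drive.

You can ignore the vendor lookup dataset, since it is only a lookup of the different names used for the same data.

My intent is to produce a result like this: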

| Line  | frequency     | month     | year  |
|------ |-----------    |-------    |------ |
| 1     |    20         | 1         | 2009  |
| 2     |    35         | 2         | 2009  |
| 3     |    90         | 3         | 2009  |
| 4     |    24         | 4         | 2009  |
| 5     |    12         | 5         | 2009  |

The query I tried looks like this:

SELECT COUNT(payment_type) as frequency,
       month,
       year
FROM
(
SELECT NYCTaxiTrips.pickup_datetime as pickup_datetime,
       paymentlookup.string_field_0 as payment_type , 
       EXTRACT( MONTH FROM NYCTaxiTrips.pickup_datetime) as month,
       EXTRACT( YEAR FROM NYCTaxiTrips.pickup_datetime) as year
FROM `datasprintsteste.datasets.PaymentLookup` as paymentlookup
INNER JOIN  
(
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2009` 
UNION ALL
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2010` 
UNION ALL
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2011` 
UNION ALL
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2012` 
) AS NYCTaxiTrips
ON paymentlookup.string_field_0 = NYCTaxiTrips.payment_type
)
WHERE payment_type = 'Cash'
GROUP BY month, year

But this is the result it gives:

| Line  | frequency     | month     | year  |
|------ |-----------    |-------    |------ |
| 1     |  1389172      | 1         | 2009  |

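I tried not grouping by year, but that produces an error, which I'm fairly sure is a syntax error.

How can I write a query that returns the result I want?

1 Answer:

Answer 0: (score: 0)

Here is your SQL code as an example, with a small data sample for you to play with and test: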

WITH `datasprintsteste.datasets.NYCTaxiTrips2009` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2009-01-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2009-02-01 02:18:18.000') as pickup_datetime
), 
`datasprintsteste.datasets.NYCTaxiTrips2010` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2010-03-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2010-04-01 02:18:18.000') as pickup_datetime
), 
`datasprintsteste.datasets.NYCTaxiTrips2011` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2011-03-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2011-01-01 02:18:18.000') as pickup_datetime
),
`datasprintsteste.datasets.NYCTaxiTrips2012` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2012-05-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2012-01-01 02:18:18.000') as pickup_datetime
),
`all` AS (
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2009` UNION ALL
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2010` UNION ALL
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2011` UNION ALL
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2012` 
)

SELECT COUNT(payment_type) as frequency,
       EXTRACT( MONTH FROM pickup_datetime) as month,
       EXTRACT( YEAR FROM pickup_datetime) as year
FROM `all`
GROUP BY month, year
ORDER BY year DESC, month DESC

This produces the expected result, broken down by month (note that the data sample contains more than just January data).
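If only cash trips should be counted, as in the question's query, the same grouping can be combined with a filter. The following is a minimal sketch rather than a tested solution against the real tables: the `trips` CTE and its four rows are hypothetical inline sample data standing in for the UNION ALL of the yearly tables, and the case-insensitive comparison is an assumption about how the payment labels might be stored.

```sql
-- Hypothetical inline sample; replace `trips` with the UNION ALL of the yearly tables.
WITH trips AS (
  SELECT 'Cash' AS payment_type, TIMESTAMP('2009-01-01 02:18:18') AS pickup_datetime UNION ALL
  SELECT 'Cash', TIMESTAMP('2009-01-15 10:00:00') UNION ALL
  SELECT 'Credit', TIMESTAMP('2009-02-01 02:18:18') UNION ALL
  SELECT 'Cash', TIMESTAMP('2010-03-01 02:18:18')
)
SELECT
  COUNT(*) AS frequency,
  EXTRACT(MONTH FROM pickup_datetime) AS month,
  EXTRACT(YEAR FROM pickup_datetime) AS year
FROM trips
-- Assumption: labels may differ in case, so compare in lower case.
WHERE LOWER(payment_type) = 'cash'
-- Every non-aggregated column in the SELECT list must appear in GROUP BY.
GROUP BY year, month
ORDER BY year, month
```

On this sample the query returns one row for January 2009 (frequency 2) and one for March 2010 (frequency 1). To group by month alone, `year` would have to be removed from the SELECT list as well as from GROUP BY; leaving it in the SELECT list is a likely cause of the error mentioned in the question.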

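As an alternative to extracting month and year separately, the timestamp can be truncated to the start of its month, which gives a single grouping key per calendar month. This is a sketch under assumptions: the `trips` CTE is hypothetical sample data, and the column is assumed to be a TIMESTAMP (for a DATETIME column, `DATETIME_TRUNC` would be the corresponding function).

```sql
-- Hypothetical inline sample with two trips in January 2009 and one in January 2010.
WITH trips AS (
  SELECT TIMESTAMP('2009-01-01 02:18:18') AS pickup_datetime UNION ALL
  SELECT TIMESTAMP('2009-01-20 08:00:00') UNION ALL
  SELECT TIMESTAMP('2010-01-05 12:30:00')
)
SELECT
  -- Truncates to the first instant of the month (UTC by default).
  TIMESTAMP_TRUNC(pickup_datetime, MONTH) AS month_start,
  COUNT(*) AS frequency
FROM trips
GROUP BY month_start
ORDER BY month_start
```

Because the truncated value keeps the year, January 2009 and January 2010 end up in separate groups without a second grouping column.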