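Finding the maximum distance traveled

Time: 2017-03-13 14:19:02

Tags: sql google-bigquery

I have a set of GPS points recorded for each vehicle's trips. I am trying to retrieve the maximum distance the vehicle traveled on each trip.

Data: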

    VehicleId       TripId          Latitude            Longitude
    121             131             33.645              -84.424
    121             131             33.452              -84.409
    121             131             33.635              -84.424
    121             131             35.717              -85.121
    121             131             35.111              -85.111

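From the dataset above, my result set should look like the following: StartLat and StartLong stay the same for each combination of VehicleId and TripId, while EndLat and EndLong keep changing, so that I can work out the maximum distance each vehicle traveled from its starting point.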

    VehicleId       TripId          StartLat            StartLong       EndLat          EndLong
    121             131             33.645              -84.424         33.645          -84.424
    121             131             33.645              -84.424         33.452          -84.409
    121             131             33.645              -84.424         33.635          -84.424
    121             131             33.645              -84.424         35.717          -85.121
    121             131             33.645              -84.424         35.111          -85.111

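I tried the query below, which works for one specific VehicleId and TripId, but I have not been able to generalize it to all combinations. When I try, I get the error "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN". Any help would be appreciated.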

    SELECT
      a.VehicleId,
      a.Tripid,
      a.Latitude AS StartLat,
      a.Longitude AS StartLong,
      b.Latitude AS EndLat,
      b.Longitude AS EndLong,
      a.DateTime
    FROM
      `Vehicles` AS a
    JOIN
      `Vehicles` AS b
    ON
      a.VehicleId = b.VehicleId
      AND a.Tripid = b.Tripid
    WHERE
      a.VehicleId = 550340912
      AND a.Tripid = 18006167 AND
      a.DateTime IN (
      SELECT
        MIN(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = 550340912
        AND Tripid = 18006167)

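1 Answer:

Answer 0: (score: 1)

What "distance traveled" means for latitude/longitude pairs is somewhat ambiguous, but I will assume Haversine distance for this solution. Here is the full query, including setup, building on the answer to a previous SO post about Haversine distance.

The idea is to associate each trip's start and end points with the vehicle ID (creating an array of all entries), and then use a subquery over the array to select the entry with the greatest distance. If you want a different metric, you can substitute it for the HAVERSINE function that I used.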
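For reference, the expression evaluated inside the HAVERSINE function in the query below is the spherical law of cosines form of the great-circle distance rather than the haversine formula proper. Written out, with latitudes φ and longitudes λ in radians (the symbols are introduced here for illustration; R is the radius implied by the 111.045 km-per-degree constant in RADIANS_TO_KM):

```latex
d = R \arccos\bigl(\sin\varphi_1 \sin\varphi_2
      + \cos\varphi_1 \cos\varphi_2 \cos(\lambda_1 - \lambda_2)\bigr),
\qquad
R = 111.045 \cdot \frac{180}{\pi}\ \text{km} \approx 6362.4\ \text{km}
```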

    #standardSQL
    CREATE TEMP FUNCTION RADIANS(x FLOAT64) AS (
      ACOS(-1) * x / 180
    );
    CREATE TEMP FUNCTION RADIANS_TO_KM(x FLOAT64) AS (
      111.045 * 180 * x / ACOS(-1)
    );
    CREATE TEMP FUNCTION HAVERSINE(lat1 FLOAT64, long1 FLOAT64,
                                   lat2 FLOAT64, long2 FLOAT64) AS (
      RADIANS_TO_KM(
        ACOS(COS(RADIANS(lat1)) * COS(RADIANS(lat2)) *
             COS(RADIANS(long1) - RADIANS(long2)) +
             SIN(RADIANS(lat1)) * SIN(RADIANS(lat2))))
    );

    WITH Vehicles AS (
     SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
     SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
     SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
     SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
     SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
    )
    SELECT
      (SELECT vehicle_and_distance
       FROM UNNEST(vehicles_and_distances) AS vehicle_and_distance
       ORDER BY vehicle_and_distance.distance DESC LIMIT 1).*
    FROM (
      SELECT
        ARRAY_AGG(
          STRUCT(VehicleId,
                 HAVERSINE(start_location.Latitude, start_location.Longitude,
                           end_location.Latitude, end_location.Longitude) AS distance)
        ) AS vehicles_and_distances
      FROM (
        SELECT
          VehicleId,
          TripId,
          ARRAY_AGG(STRUCT(Latitude, Longitude)
                    ORDER BY DateTime ASC LIMIT 1)[OFFSET(0)] AS start_location,
          ARRAY_AGG(STRUCT(Latitude, Longitude)
                    ORDER BY DateTime DESC LIMIT 1)[OFFSET(0)] AS end_location
        FROM Vehicles
        GROUP BY
          VehicleId,
          TripId
      )
      GROUP BY TripId
    );
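
The question itself asks for something slightly different: the greatest distance reached from the trip's starting point, rather than the start-to-end distance. A minimal sketch of that follows. It assumes BigQuery's built-in geography functions ST_GEOGPOINT and ST_DISTANCE are acceptable in place of the HAVERSINE function above (ST_DISTANCE returns meters, so the figures may differ slightly from the 111.045 km-per-degree approximation). The names PointsWithStart and max_distance_km are made up for illustration.

```sql
#standardSQL
-- Sketch only: pair every GPS point with the first point of its trip,
-- then keep the largest start-to-point distance per (VehicleId, TripId).
WITH Vehicles AS (
  SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude,
         DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
  SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
  SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
  SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
  SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
),
-- Hypothetical intermediate step: the start point repeated on every row of a trip.
PointsWithStart AS (
  SELECT
    VehicleId,
    TripId,
    FIRST_VALUE(Latitude) OVER (PARTITION BY VehicleId, TripId
                                ORDER BY DateTime ASC) AS StartLat,
    FIRST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId
                                 ORDER BY DateTime ASC) AS StartLong,
    Latitude AS EndLat,
    Longitude AS EndLong
  FROM Vehicles
)
SELECT
  VehicleId,
  TripId,
  -- ST_GEOGPOINT takes (longitude, latitude); ST_DISTANCE returns meters.
  MAX(ST_DISTANCE(ST_GEOGPOINT(StartLong, StartLat),
                  ST_GEOGPOINT(EndLong, EndLat))) / 1000 AS max_distance_km
FROM PointsWithStart
GROUP BY
  VehicleId,
  TripId;
```

The PointsWithStart step reproduces the result set sketched in the question, with StartLat and StartLong repeated for every point of a trip. Because it uses window functions instead of a subquery filtered on one VehicleId and TripId, it covers all combinations at once.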

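Edit: for completeness, it is interesting to consider the total distance traveled along the route rather than just the straight-line distance between the start and end points. Here is another query that computes the sum of the Haversine distances along the route by looking at consecutive pairs of points: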

    #standardSQL
    CREATE TEMP FUNCTION RADIANS(x FLOAT64) AS (
      ACOS(-1) * x / 180
    );
    CREATE TEMP FUNCTION RADIANS_TO_KM(x FLOAT64) AS (
      111.045 * 180 * x / ACOS(-1)
    );
    CREATE TEMP FUNCTION HAVERSINE(lat1 FLOAT64, long1 FLOAT64,
                                   lat2 FLOAT64, long2 FLOAT64) AS (
      RADIANS_TO_KM(
        ACOS(COS(RADIANS(lat1)) * COS(RADIANS(lat2)) *
             COS(RADIANS(long1) - RADIANS(long2)) +
             SIN(RADIANS(lat1)) * SIN(RADIANS(lat2))))
    );

    WITH Vehicles AS (
     SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
     SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
     SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
     SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
     SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
    )
    SELECT
      TripId,
      vehicle_and_distance.*
    FROM (
      SELECT
        TripId,
        ARRAY_AGG(STRUCT(VehicleId, total_distance)
                  ORDER BY total_distance DESC)[OFFSET(0)] AS vehicle_and_distance
      FROM (
        SELECT
          VehicleId,
          TripId,
          (SELECT
             SUM(HAVERSINE(
                   Latitude, Longitude,
                   vehicle_locations[OFFSET(off - 1)].Latitude,
                   vehicle_locations[OFFSET(off - 1)].Longitude))
           FROM UNNEST(vehicle_locations) WITH OFFSET off
           WHERE off > 0) AS total_distance
        FROM (
          SELECT
            VehicleId,
            TripId,
            ARRAY_AGG(STRUCT(Latitude, Longitude)
                      ORDER BY DateTime ASC) AS vehicle_locations
          FROM Vehicles
          GROUP BY
            VehicleId,
            TripId
        )
      )
      GROUP BY TripId
    );
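
The same per-route total can also be sketched with the LAG window function instead of array offsets. This is an alternative formulation, not the query above: it assumes BigQuery's ST_GEOGPOINT and ST_DISTANCE functions are acceptable in place of HAVERSINE, and the names PointsWithPrevious, PrevLatitude, PrevLongitude, and total_distance_km are hypothetical.

```sql
#standardSQL
-- Sketch only: sum the leg distances between consecutive points of each trip.
WITH Vehicles AS (
  SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude,
         DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
  SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
  SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
  SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
  SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
),
-- Attach the previous point of the same trip to every row.
-- The first row of each trip has no predecessor, so LAG yields NULL there.
PointsWithPrevious AS (
  SELECT
    VehicleId,
    TripId,
    Latitude,
    Longitude,
    LAG(Latitude) OVER (PARTITION BY VehicleId, TripId
                        ORDER BY DateTime ASC) AS PrevLatitude,
    LAG(Longitude) OVER (PARTITION BY VehicleId, TripId
                         ORDER BY DateTime ASC) AS PrevLongitude
  FROM Vehicles
)
SELECT
  VehicleId,
  TripId,
  -- NULL legs (the first point of each trip) are ignored by SUM.
  -- ST_DISTANCE returns meters, hence the division by 1000.
  SUM(ST_DISTANCE(ST_GEOGPOINT(Longitude, Latitude),
                  ST_GEOGPOINT(PrevLongitude, PrevLatitude))) / 1000 AS total_distance_km
FROM PointsWithPrevious
GROUP BY
  VehicleId,
  TripId;
```

Unlike the array-based query, this sketch returns one row per vehicle and trip rather than only the vehicle with the largest total per TripId.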