Nearest point among multiple points

Asked: 2019-03-26 11:01:16

Tags: r

I have two lists. The first contains store ids, each with a pair of coordinates:

+-------+------+------+
| store | lat  | lon  |
+-------+------+------+
|   123 | 37.2 | 13.5 |
|   456 | 39.1 |  9.1 |
|   789 | 45.4 | 11.0 |
+-------+------+------+

The second is a list of weather stations with coordinates and other data:

+----+--------+--------+---------------+----------------+
| id |  lat   |  lon   |     name      |    address     |
+----+--------+--------+---------------+----------------+
|  1 | 44.907 |  8.612 | airport_one   | bond street    |
|  2 | 39.930 |  9.720 | airport_two   | oxford street  |
|  3 | 40.020 | 15.280 | airport_three | ellesmere road |
+----+--------+--------+---------------+----------------+

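I want to add two columns to the first list (the stores): the distance to the nearest airport and that airport's name. So I need to compare each store against every airport and keep the shortest distance.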

I tried to do this with the distm function inside a for loop, but I must be missing something:

for (val in 1:length(airport_master[,1])){

  n <- distm(store_master[1,3:2], airport_master[val,6:5])
  distances <- append(distances, n)
  store_master$closest_airport <- airport_master$name[val])

}

Is there a library or a better way to achieve this result?

2 answers:

Answer 0 (score: 2)

My solution uses the function pdist from the pdist package.

library(pdist)

### Stores (pdist works on numeric matrices)
dat1 <- cbind('store' = c(123, 456, 789),
              'lat'   = c(37.2, 39.1, 45.4),
              'lon'   = c(13.5, 9.1, 11.0))

### Airports
dat2 <- cbind('id'  = 1:3,
              'lat' = c(44.907, 39.93, 40),
              'lon' = c(8.612, 9.72, 15.28))

### Euclidean distances in degrees: rows are stores, columns are airports
dist.mat <- as.matrix(pdist(dat1[, 2:3], dat2[, 2:3]))

### Row index into dat2 of the closest airport for each store
nearest <- apply(dist.mat, 1, which.min)
dat2[nearest, 1] ## Or name

### Combining the result with the first data set
data.frame(dat1,
           'ClosestID' = dat2[nearest, 1])
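
Note that a Euclidean distance on raw latitude/longitude gives a value in degrees, not a distance on the ground. The following is a minimal base R sketch of the same "distance matrix, then which.min per row" idea using the haversine formula. The helper haversine_km, the data frame names stores and airports, and the Earth radius of 6371 km are assumptions made for this illustration; the coordinates are taken from the tables in the question.

```r
# Haversine great-circle distance in kilometres (base R only).
# Inputs are decimal degrees; r is an assumed mean Earth radius in km.
haversine_km <- function(lat1, lon1, lat2, lon2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Data copied from the question's tables (names are illustrative).
stores <- data.frame(store = c(123, 456, 789),
                     lat   = c(37.2, 39.1, 45.4),
                     lon   = c(13.5, 9.1, 11.0))
airports <- data.frame(id   = 1:3,
                       lat  = c(44.907, 39.930, 40.020),
                       lon  = c(8.612, 9.720, 15.280),
                       name = c("airport_one", "airport_two", "airport_three"),
                       stringsAsFactors = FALSE)

# Distance matrix: rows are stores, columns are airports.
km <- outer(seq_len(nrow(stores)), seq_len(nrow(airports)),
            function(i, j) haversine_km(stores$lat[i], stores$lon[i],
                                        airports$lat[j], airports$lon[j]))

# For each store, pick the column (airport) with the smallest distance.
nearest <- apply(km, 1, which.min)
stores$closest_airport <- airports$name[nearest]
stores$distance_km     <- km[cbind(seq_len(nrow(stores)), nearest)]
stores
```

For these three stores the nearest airport is the same as with the Euclidean version, but the added distance_km column is in kilometres rather than degrees.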

Answer 1 (score: 2)

You can do this with the tidyverse packages as follows:

library(tidyverse)

# data

store_master <-
  tibble(
    'store' = c(123, 456, 789),
    'lat'   = c(37.2, 39.1, 45.4),
    'lon'   = c(13.5, 9.1, 11.0)
  )

airport_master <-
  tibble(
    'id' = 1:3,
    'lat' = c(44.907, 39.93, 40),
    'lon' = c(8.612, 9.72, 15.28),
    'name' = c('airport_one', 'airport_two', 'airport_three')
  )

# solution

crossing(
  store = store_master$store,
  id = airport_master$id
) %>%
  left_join(store_master, "store") %>%
  left_join(airport_master, "id", suffix = c("_store", "_airport")) %>%
  mutate(distance = sqrt((lat_store - lat_airport)^2 + (lon_store - lon_airport)^2)) %>%
  group_by(store) %>%
  filter(distance == min(distance))

Result:

  store    id lat_store lon_store lat_airport lon_airport name          distance
  <dbl> <int>     <dbl>     <dbl>       <dbl>       <dbl> <chr>            <dbl>
1   123     3      37.2      13.5        40         15.3  airport_three     3.32
2   456     2      39.1       9.1        39.9        9.72 airport_two       1.04
3   789     1      45.4      11          44.9        8.61 airport_one       2.44
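
Since the question started from distm, here is a hedged sketch of how the loop could be replaced by a single distm call from the geosphere package, which returns the full store-by-airport matrix at once. It assumes geosphere is installed and that the data frames have columns named lat, lon and name as in the question's tables (the asker's real data evidently has a different column layout). distm expects coordinates in (longitude, latitude) order and returns metres.

```r
library(geosphere)

# Data copied from the question's tables; column names are assumptions.
store_master <- data.frame(store = c(123, 456, 789),
                           lat   = c(37.2, 39.1, 45.4),
                           lon   = c(13.5, 9.1, 11.0))
airport_master <- data.frame(id   = 1:3,
                             lat  = c(44.907, 39.930, 40.020),
                             lon  = c(8.612, 9.720, 15.280),
                             name = c("airport_one", "airport_two", "airport_three"),
                             stringsAsFactors = FALSE)

# One call computes every store/airport pair.
# Rows are stores, columns are airports, values are metres.
d <- distm(as.matrix(store_master[, c("lon", "lat")]),
           as.matrix(airport_master[, c("lon", "lat")]),
           fun = distHaversine)

# Index of the nearest airport for each store, then look up name and distance.
nearest <- apply(d, 1, which.min)
store_master$closest_airport <- airport_master$name[nearest]
store_master$distance_m      <- d[cbind(seq_len(nrow(d)), nearest)]
store_master
```

Selecting the columns by name rather than by position (as in 3:2 or 6:5) makes the longitude-first order explicit, which is an easy thing to get wrong with distm.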