Find the closest pair of coordinates on the command line

Date: 2018-04-30 20:56:42

Tags: python coordinates geospatial

I am looking for a command-line solution that finds the closest pair of points in a CSV list of coordinates.

Here is an answer for Excel, but I need a different solution.

I am not looking for the nearest neighbor of each point, but for the pair of points with the smallest distance between them.

I want to match many power plants from GEO, so a (Python?) command-line tool would be great.

Here is a sample data set:

Chicoasén Dam,16.941064,-93.100828
Tuxpan Oil Power Plant,21.014891,-97.334492
Petacalco Coal Power Plant,17.983575,-102.115252
Angostura Dam,16.401226,-92.778926
Tula Oil Power Plant,20.055825,-99.276857
Carbon II Coal Power Plant,28.467176,-100.698559
Laguna Verde Nuclear Power Plant,19.719095,-96.406347
Carbón I Coal Power Plant,28.485238,-100.69096
Manzanillo I Oil Power Plant,19.027372,-104.319274
Tamazunchale Gas Power Plant,21.311282,-98.756266

The tool should print "Carbon II" and "Carbón I", because this pair has the smallest distance.

A code snippet could be:

from math import radians, cos, sin, asin, sqrt
import csv

def haversine(lon1, lat1, lon2, lat2):
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 

    km = 6371 * c
    return km 

with open('mexico-test.csv', newline='') as csvfile:
    so = csv.reader(csvfile, delimiter=',', quotechar='|')
    data = []
    for row in so:
        data.append(row)

# haversine() expects longitude first, then latitude
print(haversine(-100.698559, 28.467176, -100.69096, 28.485238))

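1 answer:

Answer 0 (score: 0)

A simple approach is to compute all pairs and then take the minimum pair, where the "size" of a pair is defined as the distance between its two points: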

from itertools import combinations

# Rows are name,latitude,longitude; haversine() takes longitude before latitude.
closest = min(combinations(data, 2),
              key=lambda p: haversine(float(p[0][2]), float(p[0][1]), float(p[1][2]), float(p[1][1])))
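
For reference, the all-pairs approach can be put together into something runnable end to end. This is only a sketch: it embeds three rows from the question's sample data instead of reading mexico-test.csv, and the helper name pair_distance is made up for illustration.

```python
import csv
import io
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

# Three rows copied from the sample data set in the question (name,latitude,longitude).
SAMPLE = """\
Carbon II Coal Power Plant,28.467176,-100.698559
Carbón I Coal Power Plant,28.485238,-100.69096
Tula Oil Power Plant,20.055825,-99.276857
"""


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))


def pair_distance(pair):
    """Distance in km between the two CSV rows of a pair (hypothetical helper)."""
    (_, lat1, lon1), (_, lat2, lon2) = pair
    return haversine(float(lon1), float(lat1), float(lon2), float(lat2))


rows = list(csv.reader(io.StringIO(SAMPLE)))

# Compare every unordered pair of rows and keep the one with the smallest distance.
closest = min(combinations(rows, 2), key=pair_distance)
print(closest[0][0], "<->", closest[1][0], f"({pair_distance(closest):.2f} km)")
```

Note that this compares all n(n-1)/2 pairs, so the running time grows quadratically with the number of rows.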

To get the five smallest, use a heap with the same key.

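import heapq

# nsmallest does the heap bookkeeping itself, so no separate heapify step is needed.
# Rows are name,latitude,longitude; haversine() takes longitude before latitude.
five_smallest = heapq.nsmallest(
    5,
    combinations(data, 2),
    key=lambda p: haversine(float(p[0][2]), float(p[0][1]), float(p[1][2]), float(p[1][1])))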
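
Since the question asks for a command-line tool, the pieces above could be wrapped roughly as follows. This is a sketch under assumptions, not a finished tool: the function names closest_pairs and main, the -k option, and the name,latitude,longitude column order (taken from the sample data) are all illustrative choices.

```python
import argparse
import csv
import heapq
import sys
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))


def closest_pairs(rows, k):
    """Return the k closest pairs as (distance_km, name_a, name_b), nearest first."""
    # Assumed column order: name, latitude, longitude.
    points = [(name, float(lat), float(lon)) for name, lat, lon in rows]
    scored = (
        (haversine(a[2], a[1], b[2], b[1]), a[0], b[0])
        for a, b in combinations(points, 2)
    )
    return heapq.nsmallest(k, scored)


def main(argv):
    parser = argparse.ArgumentParser(
        description="Print the closest pairs of points in a CSV file.")
    parser.add_argument("csvfile", help="CSV with rows of name,latitude,longitude")
    parser.add_argument("-k", type=int, default=1, help="number of pairs to print")
    args = parser.parse_args(argv)
    with open(args.csvfile, newline="", encoding="utf-8") as f:
        rows = [row for row in csv.reader(f) if row]  # skip blank lines
    for km, name_a, name_b in closest_pairs(rows, args.k):
        print(f"{km:.3f} km\t{name_a}\t{name_b}")


# Only run as a tool when arguments are given, so the functions can also be imported.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Saved under a hypothetical name such as closest.py, it would be invoked as `python closest.py mexico-test.csv -k 5`.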