Finding connectivity from a list of nodes and a list of edges

Time: 2016-06-07 22:07:55

Tags: python graph-theory hierarchical-clustering

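(TL;DR)

Given a set of nodes defined as a dictionary of points, and a set of edges defined as a dictionary of key tuples, is there an algorithm in Python that easily finds the contiguous segments?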

(Context:)

I have two files that model the segments of a road network.

Nodes.txt

1;-5226574;-3118329 # latitude and longitude as integers (multiplied by 1e5)
2;-5226702;-3118330
3;-5226750;-3118332
4;-5226793;-3118338
...

Edges.txt

1;1;2
2;3;5
3;23;345
4;23;11
...

Each edge is an (indexed) link between two nodes, which are referenced by their node indices.
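A minimal sketch of loading the two formats into the dictionaries described above, written for Python 3. The function names `parse_nodes` and `parse_edges` are hypothetical, the sample text is copied from the excerpts, and a trailing `#` comment is assumed to be an annotation rather than data.

```python
def parse_nodes(text):
    """Map node id -> (first, second) integer coordinates, one 'id;a;b' record per line."""
    nodes = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()   # drop a trailing annotation, if any
        if not line:
            continue
        node_id, a, b = (int(field) for field in line.split(';'))
        nodes[node_id] = (a, b)
    return nodes


def parse_edges(text):
    """Map edge id -> (node id, node id), one 'id;src;dst' record per line."""
    edges = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()
        if not line:
            continue
        edge_id, src, dst = (int(field) for field in line.split(';'))
        edges[edge_id] = (src, dst)
    return edges


nodes_sample = """1;-5226574;-3118329 # latitude and longitude as integers (multiplied by 1e5)
2;-5226702;-3118330
3;-5226750;-3118332
4;-5226793;-3118338"""
edges_sample = """1;1;2
2;3;5
3;23;345
4;23;11"""

nodes = parse_nodes(nodes_sample)
edges = parse_edges(edges_sample)
print(nodes[1], edges[3])
```

For the real files, the same functions could be fed `open('Nodes.txt').read()` and `open('Edges.txt').read()`.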

A subsection of the resulting network looks like this:

(image: plot of a subsection of the road network)

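As you can see, the vast majority of nodes are "simple" nodes: each one lies in the middle of a road segment and belongs to exactly two edges. On the other hand, there are "special" nodes, which represent a fork or a crossroads because they belong to three or more edges.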
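The distinction between simple and special nodes comes down to counting how many edges touch each node. A small sketch on a made-up edge list (the data and variable names are illustrative only; dead ends with a single edge are listed separately here as an assumption, since they also terminate a segment):

```python
from collections import Counter

# Hypothetical toy edge list: node 2 is a fork, nodes 1, 4 and 6 are dead ends.
edge_pairs = [(1, 2), (2, 3), (3, 4), (2, 5), (5, 6)]

# Count how many edges touch each node (its degree).
degree = Counter()
for src, dst in edge_pairs:
    degree[src] += 1
    degree[dst] += 1

simple = sorted(n for n, d in degree.items() if d == 2)
junctions = sorted(n for n, d in degree.items() if d >= 3)
dead_ends = sorted(n for n, d in degree.items() if d == 1)
print(simple, junctions, dead_ends)
```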

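Currently I have a collection of isolated road segments, but I would like to define each stretch of road between two special nodes as a sequence of nodes. That makes plotting, measuring distances and so on much faster, and it also lets me represent each node sequence as a "super edge" linking two special nodes, which simplifies the topology.

I can easily imagine some brute-force ways of doing this, but the number of nodes is relatively high, and I don't have the theoretical background to point me towards a solution.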

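UPDATE:

I have created a gist with my raw data. Each line of the file represents a road as a sequence of points (latitude, longitude), and the roads overlap a lot. My goal is to generate the dictionaries of nodes and links from this list of roads.

You can access the contents with the following Python script: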

with open('RawRoads.txt') as roadsFile:
    for line in roadsFile.readlines():
        road = [tuple(map(lambda x:int(float(x)*1e5), coord.split(','))) for coord in line.strip().split(' ')]

Or:

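import urllib.request

url = "https://gist.githubusercontent.com/heltonbiker/ca043f8ee191db5bf8349b1b7af0394c/raw/RawRoads.txt"

# Python 3: urlopen returns bytes, so decode before splitting into lines
lines = urllib.request.urlopen(url).read().decode("utf-8").splitlines()
for line in lines:
    pass  # you get the idea: parse each line as in the previous snippet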
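The stated goal of turning the list of roads into node and link dictionaries could be sketched as follows. This is only an illustration: `roads_to_graph` is a hypothetical name, and it assumes that overlapping roads share exactly identical integer points, so that equal tuples can be merged into one node.

```python
def roads_to_graph(roads):
    """Build node and link dictionaries from roads given as lists of points.

    Identical points share one node id, and each pair of consecutive points
    becomes one undirected link, stored only once even where roads overlap.
    """
    node_ids = {}   # point -> node id
    links = {}      # link id -> (node id, node id)
    seen = set()    # undirected links already stored
    for road in roads:
        ids = [node_ids.setdefault(point, len(node_ids) + 1) for point in road]
        for a, b in zip(ids, ids[1:]):
            key = (min(a, b), max(a, b))
            if a != b and key not in seen:
                seen.add(key)
                links[len(links) + 1] = key
    nodes = {node_id: point for point, node_id in node_ids.items()}
    return nodes, links


# Two made-up roads that overlap on the stretch (1, 0)-(2, 0).
roads = [[(0, 0), (1, 0), (2, 0)], [(1, 0), (2, 0), (3, 0)]]
nodes, links = roads_to_graph(roads)
print(nodes)
print(links)
```

The overlapping stretch yields a single link, so the result has four nodes and three links.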

1 answer:

Answer 0 (score: 2)

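Let's not be too brutish about it. I think we can do quite well by building a simple list of lists, such that edge[i] is a short list of the nodes to which node i is connected (two for a simple node, more at a junction). If the node numbers are dense and start at 0, you can use a list; if they are not, I would use a dictionary.

From edges.txt I build a list of the form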

edge_list = [(1,2),  (2,3),  (3,5),  (2,23), (23,345), (23,11),...]

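Now build a dictionary that references every edge in both directions.

Next, identify the special nodes, i.e. the nodes with an order (degree) other than 2: intersections and the edges of the map. Then we pick one and build a segment until we hit another special node.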

# Dictionary of edges, indexed in both directions by node number.
edge = {}

# Ingest the data and build the dictionary
with open("edges.txt") as efile:
    for line in efile:
        eid, src, dst = line.strip().split(';')
        src = int(src)
        dst = int(dst)

        for key, val in [(src, dst), (dst, src)]:
            edge.setdefault(key, []).append(val)
print("edge dictionary has entries:", len(edge))

# Identify endpoint nodes: order other than 2
end_ct = 0
print("Endpoint Nodes")
endpoint = []
for src, dst in edge.items():
    if len(dst) != 2:
        print(len(dst), src, dst)
        endpoint.append(src)
        end_ct += len(dst)
print(end_ct, "road ends")

atlas = []    # List of roads, each a list of nodes

# Build roads between the identified endpoints
# Pick the first endpoint in the remaining list.
# Move to the first-listed adjacent node.
# Keep going until we hit another node on the endpoint list.
while len(endpoint) > 0:
    here = endpoint[0]
#   print("Road starting at", here, edge[here])

    # Pick a first step and consume the edge
    # (named nxt to avoid shadowing the built-in next)
    nxt = edge[here].pop(0)
    edge[nxt].remove(here)
    road = [here, nxt]

    # If that's the last connection to the node, remove that node from the endpoints list.
    if len(edge[here]) == 0:
        del endpoint[0]
        del edge[here]
    # Special case for a one-segment road; endpoint entry of "nxt" is removed after the loop
    if len(edge[nxt]) == 0:
        del edge[nxt]

    # Consume edges until we reach another endpoint.
    while nxt not in endpoint:
        here = nxt
        nxt = edge[here].pop(0)
        edge[nxt].remove(here)
        road.append(nxt)
        if len(edge[nxt]) == 0:
            del edge[nxt]
#           print("removing node", nxt)

    if nxt not in edge:
        endpoint.remove(nxt)
#       print("removing endpoint", nxt)

    print("\nRoad from", road[0], "to", road[-1], ':\n\t', road)
    atlas.append(road)

print("\n", len(atlas), "roads built")
# print("edge dictionary still has entries:", len(edge))
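To try the idea without any files, the same edge-consuming walk can be wrapped in a function and run on a toy graph. This is a sketch rather than the answer's exact code: `build_roads` and the sample edges are made up for illustration, it assumes no self-loops or duplicate edges, and (like any walk that starts only at special nodes) it never visits an isolated closed loop that contains no special node.

```python
from collections import defaultdict


def build_roads(edge_pairs):
    """Split an undirected graph into chains that run between nodes of degree != 2."""
    adjacency = defaultdict(list)
    for src, dst in edge_pairs:
        adjacency[src].append(dst)
        adjacency[dst].append(src)

    # Special nodes: dead ends (degree 1) and junctions (degree 3 or more).
    endpoints = {node for node, nbrs in adjacency.items() if len(nbrs) != 2}
    roads = []
    for start in sorted(endpoints):
        while adjacency[start]:              # one road per unused edge at this endpoint
            here, nxt = start, adjacency[start].pop(0)
            adjacency[nxt].remove(here)      # consume the edge in both directions
            road = [here, nxt]
            while nxt not in endpoints:      # walk through simple nodes
                here, nxt = nxt, adjacency[nxt].pop(0)
                adjacency[nxt].remove(here)
                road.append(nxt)
            roads.append(road)
    return roads


# Toy network: node 2 is a fork; nodes 1, 4 and 6 are dead ends.
toy_roads = build_roads([(1, 2), (2, 3), (3, 4), (2, 5), (5, 6)])
print(toy_roads)
```

Keeping the endpoints in a set makes the `nxt not in endpoints` test constant-time, which may matter when the number of special nodes is large.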

Edit from the OP:

It works, it is fast and correct, and I thought it deserved a visualization:

import matplotlib.pyplot as plt

for road in atlas:
    path = [nodesdict[i] for i in road]
    lons, lats = zip(*path)
    plt.plot(lons, lats)

plt.grid()
plt.axis('equal')
plt.show()
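With `atlas` in hand, the "super edge" simplification mentioned in the question could be sketched as below. The helper `road_length` and the tiny `nodesdict` and `atlas` values are hypothetical stand-ins for the real data, and the length is a plain straight-line sum in the scaled coordinate units, not a geodesic distance. `math.dist` requires Python 3.8 or later.

```python
import math


def road_length(road, nodesdict):
    """Sum of straight-line distances between consecutive nodes, in coordinate units."""
    return sum(math.dist(nodesdict[a], nodesdict[b]) for a, b in zip(road, road[1:]))


# Hypothetical stand-ins for the real `nodesdict` and `atlas`.
nodesdict = {1: (0, 0), 2: (3, 4), 3: (3, 10), 4: (6, 14)}
atlas = [[1, 2], [2, 3, 4]]

# One "super edge" per road: (first node, last node) -> list of (length, road).
# A list is used because two different roads may join the same pair of special nodes.
super_edges = {}
for road in atlas:
    key = (road[0], road[-1])
    super_edges.setdefault(key, []).append((road_length(road, nodesdict), road))

print(super_edges)
```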