Nearest neighbour algorithm

Date: 2013-07-05 16:42:28

Tags: python python-2.7

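OK, so I'm very new to programming. I'm using Python 2.7, and my next goal is to implement a lightweight version of the nearest neighbour algorithm (note that I'm not talking about k-nearest neighbours). I've tried a lot of approaches and some of them come close, but I still can't seem to nail it.

First: I'm using arrays to represent my incidence matrix. Is that a good idea? I'm thinking of numpy's arrays and matrices here.

Second: I know the algorithm (pseudocode), and it's pretty simple. But I could definitely use some kind of kickstart. I'm not interested in a complete implementation, but I'm stuck right now and I'm looking for help.

Maybe the above doesn't make my problem clear, but feel free to ask.

Thanks in advance.

OK, now I've given it another try. It seems I've got it; anyway, here's what I did. I'm pretty happy with the result, since I'm new to this. But I'm sure you have some tips or improvements.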

import numpy as np
import copy

'''
                        NEAREST NEIGHBOUR ALGORITHM
                        ---------------------------


The algorithm takes two arguments. The first is an array whose elements are
the lists/column vectors of the given complete incidence matrix. The second
argument is an integer giving the starting node, where 1 is the smallest.
The program only makes sense if the triangle inequality is satisfied.
Furthermore, the diagonal elements need to be inf. The pseudocode is listed below:


1. - stand on an arbitrary vertex as current vertex.
2. - find out the shortest edge connecting current vertex and an unvisited vertex V.
3. - set current vertex to V.
4. - mark V as visited.
5. - if all the vertices in domain are visited, then terminate.
6. - Go to step 2.

The sequence of the visited vertices is the output of the algorithm

Remark - infinity is entered as np.inf
'''



def NN(A, start):

    start = start-1 #To compensate for the python index starting at 0.
    n = len(A)
    path = [start]
    costList = []
    tmp = copy.deepcopy(start)
    B = copy.deepcopy(A)

    #This block eliminates the startingnode, by setting it equal to inf.
    for h in range(n):
        B[h][start] = np.inf

    for i in range(n):

        # This block appends the visited nodes to the path, and appends
        # the cost of the path.
        for j in range(n):
            if B[tmp][j] == min(B[tmp]):
                costList.append(B[tmp][j])
                path.append(j)
                tmp = j
                break

        # This block sets the current node to inf, so it can't be visited again.
        for k in range(n):
            B[k][tmp] = np.inf

    # The last term adds the weight of the edge connecting the end node back to the start node.
    cost = sum([i for i in costList if i < np.inf]) + A[path[len(path)-2]][start]

    # The last element needs to be popped, because it is equal to inf.
    path.pop(n)

    # Because we want to return to start, we append this node as the last element.
    path.insert(n, start)

    # Prints the path with the original (1-based) indices.
    path = [i+1 for i in path]

    print "The path is: ", path
    print "The cost is: ", cost
    print
    return ""

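'''
If the desired result is the path and cost from every start node,
call the following function:
'''
def every_node(A):
    # Start nodes are 1-based, so the range must run up to and including len(A).
    for i in range(1, len(A) + 1):
        NN(A, i)  # NN prints its own result and returns "", so no extra print is needed.
    return ""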
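The docstring above asks for a complete matrix that satisfies the triangle inequality and has inf on the diagonal. One way to get such an input, sketched here under the assumption that the nodes are points in the plane, is to compute pairwise Euclidean distances and then overwrite the diagonal. The helper name `distance_matrix` and the sample points are made up for illustration.

```python
import numpy as np

def distance_matrix(points):
    """Pairwise Euclidean distances with np.inf on the diagonal.

    Hypothetical helper, not part of the code above.
    """
    pts = np.asarray(points, dtype=float)
    # Broadcasting gives every pairwise difference: shape (n, n, dims).
    diff = pts[:, None, :] - pts[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # A node must never be chosen as its own nearest neighbour.
    np.fill_diagonal(D, np.inf)
    return D

# Three made-up points forming a 3-4-5 right triangle.
D = distance_matrix([(0, 0), (3, 0), (3, 4)])
```

Euclidean distances satisfy the triangle inequality by construction, so a matrix built this way meets both requirements stated in the docstring.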

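1 answer:

Answer 0 (score: 3)

Your solution is fine, but it can be simplified and made more efficient at the same time. If you're using numpy arrays, those small inner for loops can usually be replaced with just a few lines of code. The result should be shorter and run faster, because numpy uses compiled functions to do its work. It can take a while to get used to this style of programming: instead of looping over every element of an array, you operate on the whole array at once. It helps to read examples of this process; look for questions like "how do I make XX more efficient in numpy?".

Here's an example NN implementation: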
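To make the loop-versus-whole-array point concrete, here is a small sketch comparing the two styles on the two inner steps of the question's code: finding the smallest entry in a row, and blocking a column. The matrix `B` and the variable names are made up for illustration.

```python
import numpy as np

# Made-up 3x3 distance matrix with inf on the diagonal.
B = np.array([[np.inf, 2.0, 1.0],
              [2.0, np.inf, 3.0],
              [1.0, 3.0, np.inf]])
current = 0

# Loop style: scan the row for the index of its smallest entry.
nearest_loop = 0
for j in range(len(B)):
    if B[current][j] < B[current][nearest_loop]:
        nearest_loop = j

# Whole-array style: np.argmin does the same scan in compiled code.
nearest_vec = int(np.argmin(B[current]))

# Loop style: set one column to inf so that node cannot be picked again.
B_loop = B.copy()
for k in range(len(B_loop)):
    B_loop[k][nearest_loop] = np.inf

# Whole-array style: assign to the entire column in one statement.
B_vec = B.copy()
B_vec[:, nearest_vec] = np.inf
```

Both styles give the same index and the same modified matrix; the whole-array versions are simply shorter and push the iteration into numpy.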

import numpy as np
def NN(A, start):
    """Nearest neighbor algorithm.
    A is an NxN array indicating distance between N locations
    start is the index of the starting location
    Returns the path and cost of the found solution
    """
    path = [start]
    cost = 0
    N = A.shape[0]
    mask = np.ones(N, dtype=bool)  # boolean values indicating which 
                                   # locations have not been visited
    mask[start] = False

    for i in range(N-1):
        last = path[-1]
        next_ind = np.argmin(A[last][mask]) # find minimum of remaining locations
        next_loc = np.arange(N)[mask][next_ind] # convert to original location
        path.append(next_loc)
        mask[next_loc] = False
        cost += A[last, next_loc]

    return path, cost

..and here's an example of using this function:

# Expected order is 0,2,3,1,4
A = np.array([
    [0, 2, 1, 2, 2],
    [2, 0, 2, 1, 1],
    [1, 2, 0, 1, 2],
    [2, 1, 1, 0, 2],
    [2, 1, 2, 2, 0]])
print NN(A,0)
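Note that NN above returns an open path, while the code in the question also adds the edge back to the start and tries every start node. A minimal sketch of that variant, using the same masking idea, might look like the following. The name `nn_tour` is hypothetical, and `np.flatnonzero` is used here in place of `np.arange(N)[mask]` to list the unvisited locations.

```python
import numpy as np

def nn_tour(A, start):
    """Nearest-neighbour path from `start`, closed by returning to `start`.

    Hypothetical helper; indices are 0-based as in NN above.
    """
    A = np.asarray(A)
    N = A.shape[0]
    unvisited = np.ones(N, dtype=bool)
    unvisited[start] = False
    path = [start]
    cost = 0
    for _ in range(N - 1):
        last = path[-1]
        candidates = np.flatnonzero(unvisited)  # original indices still open
        nxt = int(candidates[np.argmin(A[last, candidates])])
        cost += A[last, nxt]
        path.append(nxt)
        unvisited[nxt] = False
    # Close the tour with the edge from the last location back to the start.
    cost += A[path[-1], start]
    path.append(start)
    return path, cost

A = np.array([
    [0, 2, 1, 2, 2],
    [2, 0, 2, 1, 1],
    [1, 2, 0, 1, 2],
    [2, 1, 1, 0, 2],
    [2, 1, 2, 2, 0]])

# Start 0 gives the path [0, 2, 3, 1, 4, 0] with cost 6.
tours = [nn_tour(A, s) for s in range(len(A))]
best_path, best_cost = min(tours, key=lambda t: t[1])
```

Because candidates are restricted to unvisited locations, the zero diagonal never gets selected, so this version does not need inf on the diagonal.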