如何在线性时间内计算最小瓶颈生成树?

时间:2014-04-05 02:17:44

标签: algorithm language-agnostic graph-theory graph-algorithm minimum-spanning-tree

在最坏的情况下,我们可以通过使用Kruskal算法找到O(E log * V)中的最小瓶颈生成树。这是因为每个最小生成树都是生成树的最小瓶颈。

但我在this课程中遇到了这个面试问题。

  

即使在最坏的情况下,我们如何在线性时间内找到最小的瓶颈生成树。请注意,我们可以假设在最坏的情况下我们可以计算线性时间内n个键的中位数。

4 个答案:

答案 0 :(得分:6)

查找Minimum Bottleneck Spanning Tree(MBST)的标准算法称为Camerini’s algorithm。它以线性时间运行,如下所示:

 1. Find a median edge weight in graph and partition all edges in to two
    partitions A and B around it. Let A have all edges greater than
    pivot and B has all edges less than or equal to pivot.                
 2. Run DFS or BFS on partition B. If it connected graph then again run
    step 1 on it.        
 3. If partition B wasn't a connected graph then we must have more than
    1 connected components in it. Create a new graph by contracting each
    of these connected components as a single vertex and select edges
    from partition A that connects these components. MBST is given by
    edges in connected components + MBST of new graph.

在伪代码中:

1:  procedure MBST(V, E)
2:      if |E| = 1 then
3:          Return E 
4:      else
5:          A, B ←  Partition E in two halves around median
6:                  A is higher half, B is lower half
7:          F ← Connected components of B
8:          if |F| = 1 and spans all vertices then
9:              Return MBST(V, B)
10:         else
11:             V' ← create one vertex for each connected component in F
12:                     plus vertices missing from F
13:             B' ← Edges from A that connects components in F
14:                     and edges to vertices not in F
15:             Return F ∪ MBST(V', B') 
16:         end if
17:     end if
18: end procedure

实施说明:

  1. Median can be found in O(n)
  2. 第7行可以使用BFS或DFS生成disjoint-set数据结构。
  3. 第13行涉及滤除A中的边,其中每条边都有端点,这些端点位于F中的两个不同的连通分量中,或者一个端点的顶点不在F中,而另一个在F中,或者两者都不在F中。这个测试可以是在O(1)每个边缘的摊销时间内使用有效的不相交集数据结构完成。
  4. 更新:我现在还在此主题上创建了Wikipedia page

答案 1 :(得分:4)

  1. 获取V,| E |的权重中位数边缘。
  2. 查找所有边的值不超过V,并获取子图
    • 如果子图已连接,则V是答案的上限,并减少V,重复步骤1,2。
    • 如果子图未连接,让连接的组件成为节点,并增加V,重复步骤1,2。
  3. 然后你可以在线性时间内解决问题。

    PS:使用DFS判断子图是否已连接。复杂度为O(E / 2)+ O(E / 4)+ O(E / 8)+ ... = O(E)

答案 2 :(得分:2)

我发现其他答案缺乏细节,混乱或明显错误。例如,ShitalShah的回答指出:

  

通过收缩每个连接的组件来创建新图   作为单个顶点,并从连接的分区A中选择边   这些组件

然后在他的伪代码中:

11:             V' ← create one vertex for each connected component in F
12:                     plus vertices missing from F
13:             B' ← Edges from A that connects components in F
14:                     and edges to vertices not in F
在伪代码之前的描述中未提及

“ Fem中缺少的个顶点”或F中不存在的个顶点。如果我们放慢脚步,很快就会发现更多差异:

  

第13行涉及滤除A中每个边都有的边   位于F中两个不同连接组件中的端点   或一个端点不在F中且另一个端点不在F中或两者都不在   在F中。

什么?描述和伪代码正在谈论使用较大的边缘连接不同的连接组件,现在我们突然将它们过滤掉了。

还有更多:指向实际算法的链接返回403禁止。维基百科的文章充满了类似的差异。当每个连接的组件完全退化为一个顶点时,我们如何创建一个超级顶点?收缩后如何处理分区A的平行边,我们应该选择哪一个? $&T ^ $#)*)

我相信,在尝试提供答案时,我们应该假设读者知道我们住的地方。因此,我介绍了工作代码,可能是在网络上唯一的代码。鼓声。

public class MBST<E> {
    private static final NumberFormat formatter = NumberFormat.getNumberInstance(Locale.ENGLISH);

    static {
        formatter.setMinimumFractionDigits(2);
        formatter.setMaximumFractionDigits(2);
    }

    private final UndirectedGraph<E> mbst;

    public MBST(UndirectedGraph<E> g) {
        System.out.printf("Graph:%n%s", g);

        if (g.numEdges() <= 1) {
            mbst = g;
            return;
        }

        // can we calculate mean more efficiently?
        DoubleSummaryStatistics stats = g.edges().stream()
                .collect(summarizingDouble(e -> e.weight));
        System.out.printf("Mean: %s%n", formatter.format(stats.getAverage()));
        Map<Boolean, List<Edge<E>>> edgeMap = g.edges().stream()
                .collect(groupingBy(e -> e.weight < stats.getAverage()));

        List<Edge<E>> f = edgeMap.getOrDefault(true, emptyList());
        if (f.isEmpty()) {
            mbst = g;
            return;
        }

        UndirectedGraph<E> b = new UndirectedGraph<>(f);
        ConnectedComponents<E> cc = new ConnectedComponents<>(b);

        if (cc.size == 1 && b.numVertices() == g.numVertices()) {
            mbst = new MBST<>(b).mbst;
            return;
        }

        Map<Edge<E>, Edge<E>> bPrime = new HashMap<>();
        edgeMap.get(false)
                .forEach(e1 -> {
                    boolean vInB = b.containsVertex(e1.v);
                    boolean wInB = b.containsVertex(e1.w);
                    boolean edgeInB = vInB && wInB;
                    E v = vInB ? cc.id(e1.v) : e1.v;
                    E w = wInB ? cc.id(e1.w) : e1.w;

                    Edge<E> e2 = new Edge<>(v, w);

                    bPrime.compute(e2, (key, val) -> {
                        // same connected component
                        if (edgeInB && v.equals(w)) return null;
                        if (val == null || val.weight > e1.weight) return e1;
                        return val;
                    });
                });
        mbst = new MBST<>(new UndirectedGraph<>(bPrime.values())).mbst;
        for (Edge<E> e : f) mbst.addEdge(e);
    }

    public Iterable<Edge<E>> edges() {
        return mbst.edges();
    }
}

我根据以下图表进行了测试。第一张图片来自Wikipedia文章,第二张图片来自paper

enter image description here

Graph:
Vertices = [A, B, C, D, E, F, G]
Edges = [(A, B), (B, C), (D, E), (E, F), (F, G), (B, D), (C, E), (D, F), (E, G), (A, D), (B, E)]
Mean: 8.00
Graph:
Vertices = [A, B, C, D, E, F, G]
Edges = [(A, B), (C, E), (D, F), (E, G), (A, D), (B, E)]
Mean: 6.17
Graph:
Vertices = [A, B, E, G]
Edges = [(A, B), (E, G), (B, E)]
Mean: 7.00

enter image description here

Graph:
Vertices = [A, B, C, D, E, F, G]
Edges = [(A, B), (B, C), (C, D), (D, E), (F, G), (C, E), (D, F), (E, G), (B, G)]
Mean: 10.67
Graph:
Vertices = [E, G]
Edges = [(E, G)]

答案 3 :(得分:0)

重要的是要注意这里另一个答案中的解决方案代码不正确。它不仅无法通过使用这种时间复杂度严格要求的中值算法来获得线性时间,而且它使用平均值而不是中值并将列表划分为可能导致无限循环或 O( mn) 时间,并且两者都违反了 Camerini 原始算法的假设。非常有限的测试用例不足以证明这样的算法有效,需要大量的测试用例才能认为经验验证足够。

这是 Python 代码,它将使用具有适当时间复杂度的 Camerini 算法解决问题。中位数算法的中位数很长,但这是实现它的唯一方法。提供了 Wikipedia 示例以及使用 graphviz 的可视化功能。假设您正在根据引用 Kruskal 的 O(m log n) 问题讨论无向图。一些不必要的邻接矩阵到边缘列表的转换是 O(n^2) 在这里,但它可以很容易地优化出来。

def adj_mat_to_edge_list(adj_mat):
    #undirected graph needs reflexivity check
    return {i+1:[j+1 for j, y in enumerate(row) if not y is None or not adj_mat[j][i] is None]
            for i, row in enumerate(adj_mat)}
def adj_mat_to_costs(adj_mat):
    return {(i+1, j+1): adj_mat[i][j] for i, row in enumerate(adj_mat)
            for j, y in enumerate(row) if not y is None or not adj_mat[j][i] is None}
def partition5(l, left, right):
    i = left + 1
    while i <= right:
        j = i
        while j > left and l[j-1] > l[j]:
            l[j], l[j-1] = l[j-1], l[j]
            j -= 1
        i += 1
    return (left + right) // 2
def pivot(l, left, right):
    if right - left < 5: return partition5(l, left, right)
    for i in range(left, right+1, 5):
        subRight = i + 4
        if subRight > right: subRight = right
        median5 = partition5(l, i, subRight)
        l[median5], l[left + (i-left) // 5] = l[left + (i-left) // 5], l[median5]
    mid = (right - left) // 10 + left + 1
    return medianofmedians(l, left, left + (right - left) // 5, mid)
def partition(l, left, right, pivotIndex, n):
    pivotValue = l[pivotIndex]
    l[pivotIndex], l[right] = l[right], l[pivotIndex]
    storeIndex = left
    for i in range(left, right):
        if l[i] < pivotValue:
            l[storeIndex], l[i] = l[i], l[storeIndex]
            storeIndex += 1
    storeIndexEq = storeIndex
    for i in range(storeIndex, right):
        if l[i] == pivotValue:
            l[storeIndexEq], l[i] = l[i], l[storeIndexEq]
            storeIndexEq += 1
    l[right], l[storeIndexEq] = l[storeIndexEq], l[right]
    if n < storeIndex: return storeIndex
    if n <= storeIndexEq: return n
    return storeIndexEq
def medianofmedians(l, left, right, n):
    while True:
        if left == right: return left
        pivotIndex = pivot(l, left, right)
        pivotIndex = partition(l, left, right, pivotIndex, n)
        if n == pivotIndex: return n
        elif n < pivotIndex: right = pivotIndex - 1
        else: left = pivotIndex + 1
def bfs(g, s):
    n, L = len(g), {}
    #for i in range(1, n+1): L[i] = 0
    L[s] = None; S = [s]
    while len(S) != 0:
        u = S.pop(0)
        for v in g[u]:
            if not v in L: L[v] = u; S.append(v)
    return L
def mbst(g, costs):
    if len(g) == 2 and len(g[next(iter(g))]) == 1: return g
    l = [(costs[(x, y)], (x, y)) for x in g for y in g[x] if x < y]
    medianofmedians(l, 0, len(l) - 1, len(l) // 2)
    A, B = l[len(l)//2:], l[:len(l)//2]
    Gb = {}
    for _, (x, y) in B:
        if x > y: continue
        if not x in Gb: Gb[x] = []
        if not y in Gb: Gb[y] = []
        Gb[x].append(y)
        Gb[y].append(x)
    F, Fr, S = {}, {}, set(Gb)
    while len(S) != 0:
        C = set(bfs(Gb, next(iter(S))))
        S -= C
        root = next(iter(C))
        F[root] = C
        for x in C: Fr[x] = root
    if len(F) == 1 and len(Fr) == len(g): return mbst(Gb, costs)
    else:
        Ga, ca, mp = {}, {}, {}
        for _, (x, y) in A:
            if x > y or (x in Fr and y in Fr): continue
            if x in Fr:
                skip = (Fr[x], y) in ca
                if not skip or ca[(Fr[x], y)] > costs[(x, y)]:
                    ca[(Fr[x], y)] = ca[(y, Fr[x])] = costs[(x, y)]
                if skip: continue
                mp[(Fr[x], y)] = (x, y); mp[(y, Fr[x])] = (y, x)
                x = Fr[x]
            elif y in Fr:
                skip = (x, Fr[y]) in ca
                if not skip or ca[(x, Fr[y])] > costs[(x, y)]:
                    ca[(x, Fr[y])] = ca[(Fr[y], x)] = costs[(x, y)]
                if skip: continue
                mp[(x, Fr[y])] = (x, y); mp[(Fr[y], x)] = (y, x)
                y = Fr[y]
            else: ca[(x, y)] = ca[(y, x)] = costs[(x, y)]
            if not x in Ga: Ga[x] = []
            if not y in Ga: Ga[y] = []
            Ga[x].append(y)
            Ga[y].append(x)
        res = mbst(Ga, ca)
        finres = {}
        for x in res:
            for y in res[x]:
                if x > y: continue
                if (x, y) in mp: x, y = mp[(x, y)]
                if not x in finres: finres[x] = []
                if not y in finres: finres[y] = []
                finres[x].append(y)
                finres[y].append(x)
        for x in Gb:
            for y in Gb[x]:
                if x > y: continue
                if not x in finres: finres[x] = []
                if not y in finres: finres[y] = []
                finres[x].append(y)
                finres[y].append(x)
        return finres
def camerini(adjmat):
    g = mbst(adj_mat_to_edge_list(riskadjmat), adj_mat_to_costs(riskadjmat))
    return {x-1: y-1 if not y is None else y for x, y in bfs(g, next(iter(g))).items()}
def operation_risk(riskadjmat):
    mst = camerini(riskadjmat)
    return max(riskadjmat[mst[x]][x] for x in mst if not mst[x] is None), mst
def risk_graph(adjmat, sr):
    risk, spantree = sr
    return ("graph {" +
            "label=\"Risk: " + str(risk) + "\"" +
            ";".join(str(i+1) + "--" + str(j+1) + "[label=" + str(col) +
                                (";color=blue" if spantree[i] == j or spantree[j] == i else "") + "]"
                                for i, row in enumerate(adjmat) for j, col in enumerate(row) if i < j and not col is None) + "}")
riskadjmat = [[None, 1, 4, 2, None, None, None, None, None, None],
            [1, None, None, None, 3, None, 9, None, None, None],
            [4, None, None, 13, 14, None, None, 6, 8, None],
            [2, None, 13, None, None, 6, None, None, None, 5],
            [None, 3, 14, None, None, None, 8, None, None, None],
            [None, None, None, 6, None, None, None, None, 14, None],
            [None, 9, None, None, 8, None, None, 10, None, None],
            [None, None, 6, None, None, None, 10, None, 11, None],
            [None, None, 8, None, None, 14, None, 11, None, 12],
            [None, None, None, 5, None, None, None, None, 12, None]]
import graphviz
graphviz.Source(risk_graph(riskadjmat, operation_risk(riskadjmat)))

Wikipedia MBST example