Dividing connected nodes into groups of three

Date: 2016-11-16 11:12:20

Tags: python algorithm grouping

I have a collection of objects that are connected to each other (I'll call them nodes). Every node is connected to at least one other node, and the whole collection forms one big blob (there are no outliers). The number of nodes is a multiple of three.

What I want to do is divide the nodes into groups of three connected nodes (with no overlap between groups).

Every node has a unique ID. The nodes are stored in a document where each line is a node's ID followed by the nodes it is connected to. It looks like this:

30:240
40:210/240/230/60
50:80/220/230/70/210/190/200
60:240/40/230/220
70:90/200/50/220
80:50/170/190/210
90:200/70/180
100:110/160/190/170/200/180
110:100/180
160:170/100
170:80/160/190/100
180:90/200/110/100
190:50/80/200/170/100
200:90/70/100/50/190/180
210:50/80/40/230
220:50/230/70/60
230:50/210/60/40/220
240:40/30/60

I made some images to help visualize it:

[image: nodes, ungrouped]

I want to divide my nodes into groups of three, but the groups have to be connected. There are many possible ways to group the nodes; one possibility is shown in the following image:

[image: nodes, grouped properly]

However, I am struggling to come up with an algorithm that guarantees every node ends up in a group of three connected nodes (with no overlap). To avoid overlap, I "claim" nodes as I build groups; that is how I check whether a node already belongs to another group.

I came up with the following code:

nodes = dict()
with open("input.txt") as f:
    for line in f.readlines():
        # each line is "id:connection/connection/..."
        split = line.split(":")
        node = int(split[0])
        nodes[node] = []
        for connection in split[1].split("/"):
            nodes[node].append(int(connection.strip()))
groups = dict()
claimed = []
for node in nodes:
    if node in claimed:
        continue
    connections = nodes[node]
    if len(connections) > 1:
        for i in range(len(connections)):
            # stop once this node has claimed two connected nodes
            if node in groups and len(groups[node]) == 2:
                break
            # claim the connection only if no other group owns it yet
            if not connections[i] in claimed:
                if not node in groups:
                    groups[node] = []
                groups[node].append(connections[i])
                claimed.append(connections[i])
                if not node in claimed:
                    claimed.append(node)
f = open("output.txt", 'w')
for group in groups:
    # write each group in the same "id:member/member" format as the input
    line = str(group) + ":"
    for node in groups[group]:
        line += str(node) + "/"
    line = line[:-1] + "\n"
    f.write(line)
f.close()

However, this produces the following result:

160:170/100
80:190
230:50/210
70:90/200
110:180
240:40/30
220:60

which can be visualized like this:

[image: nodes, grouped wrongly]

As you can see, there are four groups of three (blue) and three groups of two (orange). This happens because I claim nodes while creating groups, so some nodes are left without two unclaimed connected nodes to form a group of their own.

So my question is: how can I make my nodes claim their connected nodes in such a way that the remaining nodes can still form their own groups of three?

Or, more fundamentally: how do I divide connected nodes into groups of three connected nodes, without overlap? Every node needs to end up in a group of three.

My question is more about the algorithm than the code. I am also open to a completely different algorithm.

4 answers:

Answer 0 (score: 2)

I'm not sure whether it works for every edge case, but if you can find a path through the graph that visits every node exactly once, you can split that path into chunks of three. Depending on the size of the graph, you may be able to do this with a simple DFS.

EDIT: To make this work you will have to prune all tails (dead-end paths that don't return), but that only means you have to count the nodes slightly differently when grouping the ones near a tail.
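A minimal sketch of that idea in Python, assuming the `nodes` adjacency dict built by the question's parsing code; the `hamiltonian_path` helper and the backtracking DFS are illustrative, not the answerer's actual code:

# Backtracking DFS for a path that visits every node exactly once.
# Consecutive chunks of three cut from such a path are guaranteed to be
# connected. Finding such a path is expensive in general (the problem is
# NP-hard), so this only suits small graphs like the example.
def hamiltonian_path(nodes):
    def dfs(current, visited, path):
        if len(path) == len(nodes):
            return path
        for neighbour in nodes[current]:
            if neighbour not in visited:
                visited.add(neighbour)
                found = dfs(neighbour, visited, path + [neighbour])
                if found:
                    return found
                visited.remove(neighbour)  # backtrack
        return None

    for start in nodes:  # any node may serve as the path's endpoint
        path = dfs(start, {start}, [start])
        if path:
            return path
    return None

path = hamiltonian_path(nodes)
if path:
    groups = [path[i:i + 3] for i in range(0, len(path), 3)]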

Answer 1 (score: 1)

Build the adjacency list below from the connected nodes.

90 -> 180, 200, 70

70 -> 90, 200, 220

180 -> 90, 110, 100, 200

110 -> 180, 100

100 -> 110, 180, 200, 190, 160, 170

The navigation should go as follows.

  1. Go to 90 and assign it to group 1. Mark 90 as done.
  2. Go to 180, since it is the first connected node. Assign it to group 1 and mark 180 as done.
  3. Go to 180's list.
  4. The first entry is 90, which is already marked, so skip it.
  5. Go to the next node in the same list. It is 110. Assign it to group 1 and mark 110 as done.
  6. Go to 110's list.
  7. Once a group is full, increment the group number.

    And so on...

    Loop until the condition (number of groups * 3) == total number of nodes is met; a sketch of this traversal follows below.
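A sketch of that navigation, assuming the `nodes` adjacency dict from the question's parsing code (the `walk_into_groups` helper name is mine). As described the walk is greedy, so on some graphs it can still strand leftover nodes; it illustrates the traversal rather than guaranteeing a full partition:

def walk_into_groups(nodes, group_size=3):
    done = set()
    groups, current = [], []
    for start in sorted(nodes):  # deterministic starting order
        node = start
        while node is not None and node not in done:
            current.append(node)
            done.add(node)
            if len(current) == group_size:  # group full: increment group number
                groups.append(current)
                current = []
            # follow the first connected node that is not marked done yet
            node = next((n for n in nodes[node] if n not in done), None)
    if current:  # leftover nodes mean the greedy walk failed to partition
        groups.append(current)
    return groups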

Answer 2 (score: 1)

data.py

node_string = """
30:240
40:210/240/230/60
50:80/220/230/70/210/190/200
60:240/40/230/220
70:90/200/50/220
80:50/170/190/210
90:200/70/180
100:110/160/190/170/200/180
110:100/180
160:170/100
170:80/160/190/100
180:90/200/110/100
190:50/80/200/170/100
200:90/70/100/50/190/180
210:50/80/40/230
220:50/230/70/60
230:50/210/60/40/220
240:40/30/60
        """

node_string_1 = """
(238,140,141):(208,134,191)/(117,119,222)/(229,134,102)/(167,234,229)
(176,120,113):(150,201,147)/(111,237,177)
(113,171,164):(237,128,198)/(159,172,151)/(202,171,225)/(240,168,211)/(111,218,139)
(208,134,191):(238,140,141)/(229,134,102)
(131,150,110):(212,209,198)/(155,133,217)/(168,140,187)/(173,204,120)/(147,210,106)/(207,149,101)/(228,172,108)
(229,134,102):(208,134,191)/(238,140,141)/(167,234,229)
(212,209,198):(109,129,191)/(155,133,217)/(147,210,106)/(131,150,110)
(105,199,125):(150,201,147)/(175,118,238)/(240,168,211)
(175,118,238):(105,199,125)/(240,168,211)
(114,178,129):(164,233,166)/(227,123,219)/(135,156,161)/(224,183,104)/(228,172,108)
(147,210,106):(212,209,198)/(173,204,120)/(131,150,110)/(143,147,114)
(222,128,188):(135,156,161)/(237,128,198)/(159,172,151)/(111,237,177)/(150,201,147)
(166,218,221):(207,149,101)/(143,114,198)/(159,172,151)/(202,171,225)
(143,147,114):(173,204,120)/(107,184,168)/(147,210,106)
(111,218,139):(132,138,234)/(202,171,225)/(113,171,164)
(117,119,222):(109,129,191)/(155,133,217)/(238,140,141)/(167,234,229)
(107,184,168):(164,233,166)/(173,204,120)/(143,147,114)/(227,123,219)
(224,183,104):(164,233,166)/(114,178,129)/(111,237,177)/(135,156,161)
(228,172,108):(227,123,219)/(173,204,120)/(135,156,161)/(159,172,151)/(207,149,101)/(131,150,110)/(114,178,129)
(227,123,219):(173,204,120)/(164,233,166)/(107,184,168)/(228,172,108)/(114,178,129)
(168,140,187):(155,133,217)/(207,149,101)/(176,204,213)/(143,114,198)/(131,150,110)/(159,166,127)
(207,149,101):(168,140,187)/(159,172,151)/(166,218,221)/(143,114,198)/(131,150,110)/(228,172,108)
(159,172,151):(207,149,101)/(135,156,161)/(237,128,198)/(202,171,225)/(113,171,164)/(166,218,221)/(222,128,188)/(228,172,108)
(143,114,198):(168,140,187)/(207,149,101)/(202,171,225)/(166,218,221)/(184,163,168)/(159,166,127)
(164,233,166):(227,123,219)/(114,178,129)/(107,184,168)/(224,183,104)
(150,201,147):(176,120,113)/(240,168,211)/(237,128,198)/(105,199,125)/(111,237,177)/(222,128,188)
(111,237,177):(135,156,161)/(224,183,104)/(222,128,188)/(176,120,113)/(150,201,147)
(159,166,127):(184,163,168)/(168,140,187)/(176,204,213)/(143,114,198)
(155,133,217):(212,209,198)/(168,140,187)/(167,234,229)/(176,204,213)/(109,129,191)/(117,119,222)/(131,150,110)
(109,129,191):(212,209,198)/(117,119,222)/(155,133,217)
(132,138,234):(184,163,168)/(202,171,225)/(111,218,139)
(240,168,211):(150,201,147)/(237,128,198)/(105,199,125)/(175,118,238)/(113,171,164)
(167,234,229):(155,133,217)/(117,119,222)/(238,140,141)/(176,204,213)/(229,134,102)
(173,204,120):(227,123,219)/(147,210,106)/(143,147,114)/(107,184,168)/(131,150,110)/(228,172,108)
(135,156,161):(114,178,129)/(224,183,104)/(159,172,151)/(111,237,177)/(222,128,188)/(228,172,108)
(237,128,198):(150,201,147)/(159,172,151)/(222,128,188)/(240,168,211)/(113,171,164)
(176,204,213):(155,133,217)/(168,140,187)/(159,166,127)/(167,234,229)
(202,171,225):(132,138,234)/(184,163,168)/(159,172,151)/(113,171,164)/(166,218,221)/(143,114,198)/(111,218,139)
(184,163,168):(143,114,198)/(132,138,234)/(159,166,127)/(202,171,225)
"""
  

node_grouper.py

import itertools
import math
from collections import deque
from ast import literal_eval as totuple 

import data

class NodeGrouper():

    finished = False

    steps = 0

    nodes = {}
    groups = {}
    weights  = {}

    best_group = []
    returned_results_cache = []

    # parse node string 
    # works with two different conventions: 
    # - node names that are numbers
    # - node names that are tuples
    def parse_node(self, string):
        try:
            return totuple(string)
        except (ValueError, SyntaxError):
            return int(string)

    # Get a dictionary from the source string.
    # receives node_string (str): the node string passed in settings
    # returns {160: [170, 100], 240: [40, 30, 60] ... }
    def parse_string(self, node_string):
        nodes = {}
        for line in node_string.split():
            split = line.split(":")
            node = self.parse_node(split[0])
            if node not in nodes:
                nodes[node] = set()
            for connection in split[1].split("/"):
                connection = self.parse_node(connection.strip())
                if connection not in nodes:
                    nodes[connection] = set()
                nodes[node].add(connection)
                nodes[connection].add(node)
        self.nodes = nodes

        for node in self.nodes:
            self.nodes[node] = sorted(self.nodes[node])
        return nodes


    # Get all possible combinations.
    # Build a tuple with every possible 3 node connection.
    # receives node (dict) as created by self.parse_string()
    # returns ((90, 180, 200), (80, 100, 190), (60, 70, 220) ... )
    def possible_combinations(self, nodes):
        possible_links = {}
        for node, links in nodes.items():
            link_possibilities = itertools.combinations(links, self.nodes_per_group - 1)
            for link_possibility in link_possibilities:
                if node not in possible_links:
                    possible_links[node] = []
                possible_links[node] += [link_possibility]
        groups = set()
        for node, possible_groups in possible_links.items():
            for possible_group in possible_groups:
                groups.add(tuple(sorted(possible_group + (node,))))
        return tuple(groups)



    # Handle status messages (if settings['log_status'] == True).
    def log_status(self):
        # log how many steps were needed
        if self.settings['log_status']:
            print('Error margin (unused nodes): %s' % (len(self.nodes) - (len(self.best_group) * 3)))
            print('%s steps' % self.steps)
            print(self.format_output(self.best_group))


    # Create an output string in the format of the Stack Overflow ticket.
    def format_output(self, group_combination):

        out_dict = {}

        for group in group_combination:
            for node in group:
                links = set([n for n in group if n != node])

                if set(links).issubset(set(self.nodes[node])):
                    if node not in out_dict:
                        out_dict[node] = set()
                    out_dict[node] = links
                    break

        output = ''
        for node, links in out_dict.items():
            output += '%s:' % str(node)
            output += '/'.join([str(link) for link in links])
            output += '\n'

        return output



    # Start with groups whose nodes are harder to match.
    def sort_possible_groups(self, queried_group):
        gid = self.groups.index(queried_group) + 1

        for node in queried_group:
            if node in self.preferred_nodes:
                self.weights[gid - 1] -= 3

        return self.weights[gid - 1]


    def initial_weights(self):
        weights = []
        groups_with_this_node = {}
        for group_id, group in enumerate(self.groups):
            ordinarity = 0
            for node in group:
                if node not in groups_with_this_node:
                    groups_with_this_node[node] = sum(x.count(node) for x in self.groups)
                ordinarity += groups_with_this_node[node] * 100
            weights += [ordinarity]
        return weights


    def selected_group_available(self, group_1, group_pool):
        for group_2 in group_pool:
            for node in group_1:

                self.steps += 1

                if self.settings['log_status']:

                    # log status each 500000 steps, if specified in the settings
                    if self.steps % self.settings['log_period'] == 0:
                        self.log_status()

                if node in group_2:
                    return False

        return True                                  

    def get_results(self, i = None, weights = None):

        group_length = len(self.groups)
        group_length_iterator = range(group_length)

        groups = deque(sorted(self.groups, key = self.sort_possible_groups))

        output = [groups[i]]
        for j in group_length_iterator:

            if i == j:
                continue

            selected_group = tuple(groups[j])

            if self.selected_group_available(selected_group, output):
                output += [selected_group]


        match = sorted(tuple(output))

        if len(output) > len(self.best_group):
            self.best_group = output


        if len(output) >= self.maximum_groups or \
            (
                self.settings['maximum_steps'] > 0 and \
                self.steps > self.settings['maximum_steps']
            ):
            self.finished = True


        # print len(output)
        if match not in self.returned_results_cache:
            self.returned_results_cache += [match]
            self.preferred_nodes = [n for n in self.nodes]
            for g in match:
                for n in g:
                    if n in self.preferred_nodes:
                        self.preferred_nodes.remove(n)

        return len(output)




    def __init__(self, **settings):

        # save settings in a property
        self.settings = {}
        for key, value in settings.items():
            self.settings[key] = value

        self.nodes_per_group = 3
        self.nodes = self.parse_string(self.settings['node_string'])
        self.preferred_nodes = [node for node in self.nodes]
        self.groups = self.possible_combinations(self.nodes)        


        self.groupable_nodes = list(set(sum(self.groups, ())))
        self.groupable_nodes_total = len(self.groupable_nodes)

        self.maximum_groups = int(math.floor(self.groupable_nodes_total / self.nodes_per_group))

        self.weights = self.initial_weights()


    def calculate(self):

        for i in range(len(self.groups)):
            self.get_results(i)
            if self.finished:
                break


if __name__ == "__main__":

    ng = NodeGrouper(

        # 0 is for unlimited
        # (try until all nodes are used)
        maximum_steps = 10000000,

        # Log number of iterations
        log_status = False,
        # Log each x iterations
        log_period = 5000000,

        node_string = data.node_string,  # a data set defined in data.py

    )

    ng.calculate()

    print(ng.format_output(ng.best_group))

It processed this example in 1285000000 steps:

[image: resulting grouping]

Answer 3 (score: 0)

The bigger question is how many nodes your search will have to handle. If the number of nodes stays low, you can always look at all possible combinations and every possible solution.

But if you have, say, 100 nodes, that becomes infeasible. Then you would want some AI-style search code to do the work for you.
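For small inputs, that exhaustive idea can be sketched as an exact-cover search, again assuming the `nodes` adjacency dict from the question (`connected_triples` and `exact_cover` are hypothetical helper names): enumerate every connected triple, then backtrack until the chosen triples cover every node exactly once.

from itertools import combinations

def connected_triples(nodes):
    triples = set()
    for a in nodes:
        for b, c in combinations(nodes[a], 2):
            # a is adjacent to both b and c, so {a, b, c} is connected
            triples.add(frozenset((a, b, c)))
    return triples

def exact_cover(nodes, triples, covered=frozenset()):
    if len(covered) == len(nodes):
        return []
    # branch on the smallest uncovered node to prune the search
    pivot = min(n for n in nodes if n not in covered)
    for t in triples:
        if pivot in t and not (t & covered):
            rest = exact_cover(nodes, triples, covered | t)
            if rest is not None:
                return [sorted(t)] + rest
    return None  # no complete cover reachable from this state

solution = exact_cover(nodes, connected_triples(nodes))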
