Create groups from sets of nodes

Time: 2009-10-29 15:07:12

Tags: algorithm data-structures graph set

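I have a list of sets (a, b, c, d, e in the example below). Each set contains a list of the nodes in that set (1-6 below). I suspect there is a well-known general algorithm for what I describe below, and I just don't know it.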

sets[
 a[1,2,5,6],
 b[1,4,5],
 c[1,2,5],
 d[2,5],
 e[1,6],
]
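
For experimenting, the sample data can be written down directly. This is a minimal Python sketch, assuming set names are plain strings and nodes are integers; the names `sets` and `node_index` are illustrative only. It also builds an inverted index from each node to the sets that contain it, which makes it easy to check by hand which sets share which nodes.

```python
from collections import defaultdict

# The sample data from the question: set name -> nodes in that set.
sets = {
    "a": frozenset({1, 2, 5, 6}),
    "b": frozenset({1, 4, 5}),
    "c": frozenset({1, 2, 5}),
    "d": frozenset({2, 5}),
    "e": frozenset({1, 6}),
}

# Inverted index: node -> names of the sets that contain it.
node_index = defaultdict(set)
for name, nodes in sets.items():
    for node in nodes:
        node_index[node].add(name)

for node in sorted(node_index):
    print(node, sorted(node_index[node]))
```

Node 4 occurs only in b, so it can never be part of a shared group.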

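I want to generate a new structure, a list of groups, with each group having

  • all the (sub)sets of nodes that appear in multiple sets
  • references to the original sets those nodes belong to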

So the above data would become (the order of the groups is irrelevant):

group1{nodes[2,5],sets[a,c,d]}
group2{nodes[1,2,5],sets[a,c]}
group3{nodes[1,6],sets[a,e]}
group4{nodes[1,5],sets[a,b,c]}
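
One way to pin down the desired output is a brute-force reference. It intersects every combination of two or more sets, keeps intersections with at least two nodes, and records every set that contains those nodes. This is only a sketch under assumptions: the function name `shared_node_groups` is hypothetical, and the approach is exponential in the number of sets, so it is suitable for checking small examples rather than for real data.

```python
from itertools import combinations

def shared_node_groups(sets):
    """Map each shared node set (2+ nodes) to the names of the sets containing it."""
    groups = {}
    names = sorted(sets)
    # Intersect every combination of two or more sets.
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            common = frozenset.intersection(*(frozenset(sets[n]) for n in combo))
            if len(common) < 2:
                continue
            # Record every set that contains all of the common nodes.
            groups[common] = sorted(n for n in names if common <= set(sets[n]))
    return groups

sets = {
    "a": [1, 2, 5, 6],
    "b": [1, 4, 5],
    "c": [1, 2, 5],
    "d": [2, 5],
    "e": [1, 6],
}
for nodes, members in shared_node_groups(sets).items():
    print(sorted(nodes), members)
```

For the sample data this yields four groups: nodes 1,5 in a, b, c; nodes 1,2,5 in a, c; nodes 2,5 in a, c, d; and nodes 1,6 in a, e.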

I assume I can get the data in as an array/object structure, manipulate it, and then spit the result back out in whatever format is needed.

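It would be a plus if:

  • all groups had at least 2 nodes and 2 sets.
  • when a subset of nodes is contained in a larger set that forms a group, only the larger set gets a group: in this example, nodes 1,2 do not have a group of their own, since all the sets they have in common already appear in group2.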
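
The two bonus rules can be read as a filter over candidate groups. The sketch below follows one reading of them, namely that a group is dropped when a strictly larger node set is shared by exactly the same sets. The function name `apply_bonus_rules` and the candidate list are made up for illustration.

```python
def apply_bonus_rules(groups):
    """groups: list of (nodes, set_names) pairs; returns the filtered list."""
    # Rule 1: at least 2 nodes and at least 2 member sets.
    big_enough = [(frozenset(n), frozenset(s)) for n, s in groups
                  if len(n) >= 2 and len(s) >= 2]
    kept = []
    for nodes, members in big_enough:
        # Rule 2: drop a group if a strictly larger node set has the same members.
        shadowed = any(nodes < other_nodes and members == other_members
                       for other_nodes, other_members in big_enough)
        if not shadowed:
            kept.append((sorted(nodes), sorted(members)))
    return kept

candidates = [
    ([1, 2, 5], ["a", "c"]),
    ([1, 2], ["a", "c"]),            # subset of [1,2,5] with the same sets
    ([1, 5], ["a", "b", "c"]),       # subset of [1,2,5], but shared by more sets
    ([5], ["a", "b", "c", "d"]),     # only one node
    ([1, 4, 5], ["b"]),              # only one set
]
print(apply_bonus_rules(candidates))
```

Here only nodes 1,2,5 (sets a, c) and nodes 1,5 (sets a, b, c) survive: nodes 1,2 are covered by the larger group with the same sets, while nodes 1,5 stay because set b shares them as well.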

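(The sets are stored in XML, which I have so far also managed to convert to JSON, but that is irrelevant. I can understand procedural (pseudo)code, but something like a skeleton in XSLT or Scala would also help me get started, I guess.)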

2 answers:

Answer 0: (score: 1)

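  1. Go through the list of sets. For each set S
    1. Go through the list of groups. For each group G
      1. If S can be a member of G (i.e. if G's node set is a subset of S), add S to G.
      2. If S cannot be a member of G but the intersection of S and G's node set contains more than one node, make a new group for that intersection and add it to the list.
    2. Give S a group of its own and add it to the list.
    3. Combine any groups that have the same node set.
  2. Delete any group with only one member set.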

For instance, with your example sets, after reading a and b the list of groups is

[1,2,5,6] [a]
[1,5] [a,b]
[1,4,5] [b]

After reading c

[1,2,5,6] [a]
[1,5] [a,b,c]
[1,4,5] [b]
[1,2,5] [a,c]
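
The steps and the trace above can be sketched in Python as follows. This is a minimal sketch, not the answerer's code: the function name `build_groups` and the `drop_singletons` flag are assumptions added for illustration, and keying the groups by their node set is a design choice that makes the "combine groups with the same node set" step happen automatically.

```python
def build_groups(sets, drop_singletons=True):
    """Return {frozenset(nodes): set of set names} for a dict name -> nodes."""
    groups = {}  # keyed by node set, so equal node sets merge automatically
    for name, raw_nodes in sets.items():
        s = frozenset(raw_nodes)
        found = {s: {name}}  # S always gets a group of its own
        for g_nodes, members in groups.items():
            if g_nodes <= s:
                members.add(name)  # G's nodes are a subset of S: S joins G
            else:
                common = g_nodes & s
                if len(common) > 1:
                    # New group for the intersection, shared by G's sets and S.
                    found.setdefault(common, set()).update(members, {name})
        # Merge the new groups into the list, combining equal node sets.
        for nodes, members in found.items():
            groups.setdefault(nodes, set()).update(members)
    if drop_singletons:
        # Final step: a group with a single member set is not shared.
        groups = {n: m for n, m in groups.items() if len(m) > 1}
    return groups

sample = {"a": [1, 2, 5, 6], "b": [1, 4, 5], "c": [1, 2, 5], "d": [2, 5], "e": [1, 6]}
for nodes, members in build_groups(sample).items():
    print(sorted(nodes), sorted(members))
```

Calling `build_groups` on only a and b with `drop_singletons=False` gives the three intermediate groups listed after reading a and b.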

If speed is a problem, more efficient algorithms exist.

Answer 1: (score: 0)

/*
Pseudocode algorithm for creating groups data from a set dataset, further explained in the project documentation. This is based on 
http://stackoverflow.com/questions/1644387/create-groups-from-sets-of-nodes

I am assuming 
- Group is a structure (class) the objects of which contain two lists: a list of sets and a list of nodes (group.nodes). Its constructor accepts a list of nodes and a reference to a Set object
- Set is a list structure (class), the objects (set)  of which contain the nodes of the list in set.nodes
- groups and sets are both list structures that can contain arbitrary objects which can be iterated with foreach(). 
- you can get the objects two lists have in common as a new list with intersection()
- you can count the number of objects in a list with length()
*/

//Create groups, going through the original sets
foreach(sets as set){
    if(groups.length()==0){
        groups.addGroup(new Group(set.nodes, set));
    }
    else{
        foreach (groups as group){
                if(group.nodes.length() == intersection(group.nodes,set.nodes).length()){
                    // the group is a subset of the set, so just add the set as a member of the group
                    group.addset(set);
                    if (group.nodes.length() < set.nodes.length()){
                    // if the set has more nodes than the group that already exists, 
                    // create a new group for the nodes of the set, with set as a member of that group
                    groups.addGroup(new Group(set.nodes, set));
                    }
                }

                // If group is not a subset of set, and the intersection of the nodes of the group 
                // and the nodes of the set
                // is greater than one (they have more than one node in common), create a new group with 
                // those nodes they have in common, with set and the sets of the existing group as members
                else if(group.nodes.length() > intersection(group.nodes,set.nodes).length() 
                    && intersection(group.nodes,set.nodes).length()>1){
                    newGroup = new Group(intersection(group.nodes,set.nodes), set);
                    // the sets already in group contain these nodes too
                    foreach(group.sets as memberSet){
                        newGroup.addset(memberSet);
                    }
                    groups.addGroup(newGroup);
                }
        }
    }

}

// Cleanup time!
foreach(groups as group){
    //delete any group with only one member set (for it is not really a group then)
    if (group.sets.length()<2){
        groups.remove(group);
    }
    // combine any groups that have the same set of nodes. Is this really needed? 
    foreach(groups as group2){
        //if group2 is a different group and the size of the intersection of the groups 
        //is the same as the size of both groups, then the groups have the same nodes.
        if (group2 != group
            && intersection(group.nodes,group2.nodes).length() == group.nodes.length()
            && group.nodes.length() == group2.nodes.length()){
            foreach(group2.sets as set2){
                if(!group.hasset(set2)){
                    group.addset(set2);
                }
            }
            groups.remove(group2);
        }
    }

}
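
The cleanup loops above remove items from a list while iterating over it, which many languages handle poorly. A minimal sketch in Python (an assumption, since the answer is language-neutral pseudocode) avoids this by building a new collection instead. It merges groups first and filters afterwards, so a group that only reaches two member sets through merging is not deleted too early. The function name `cleanup` and the sample list are illustrative.

```python
def cleanup(groups):
    """groups: list of (nodes, set_names) pairs, possibly with duplicate node sets."""
    merged = {}
    for nodes, members in groups:
        # Groups with identical node sets collapse into one entry.
        merged.setdefault(frozenset(nodes), set()).update(members)
    # Keep only groups shared by at least two sets.
    return [(sorted(n), sorted(m)) for n, m in merged.items() if len(m) >= 2]

raw = [
    ([1, 2, 5, 6], ["a"]),
    ([1, 5], ["a", "b"]),
    ([1, 5], ["b", "c"]),   # same nodes as the previous group
    ([1, 2, 5], ["a", "c"]),
]
print(cleanup(raw))
```

With this input the two groups for nodes 1,5 are merged into one with sets a, b, c, and the single-set group for nodes 1,2,5,6 is dropped.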