Graph with sets as vertices

Time: 2013-12-04 02:29:10

Tags: graph set ocaml

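I have a small grammar represented as a variant type, term, in which strings serve as tokens or parts of tokens (type term). Given an expression in the grammar, I collect all the strings from it and pack them into a set (function vars). Finally, I want to build a graph with these sets as vertices (lines 48-49).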

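For some reason, a graph built in this roundabout way does not recognize sets that contain the same variables, and it creates multiple vertices with identical contents. I really do not understand why this happens.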

Here is a minimal working example of this behavior:

(* demo.ml *)
type term =
  | Var of string
  | List of term list * string option
  | Tuple of term list

module SSet = Set.Make(
  struct
    let compare = String.compare
    type t = string
  end)

let rec vars = function
  | Var v -> SSet.singleton v
  | List (x, tail) ->
    let tl = match tail with
    | None -> SSet.empty 
    | Some var -> SSet.singleton var in
    SSet.union tl (List.fold_left SSet.union SSet.empty (List.map vars x))
  | Tuple x -> List.fold_left SSet.union SSet.empty (List.map vars x)

module Node = struct
  type t = SSet.t
  let compare = SSet.compare
  let equal = SSet.equal
  let hash = Hashtbl.hash
end

module G = Graph.Imperative.Digraph.ConcreteBidirectional(Node)

(* dot output for the graph for illustration purposes *)
module Dot = Graph.Graphviz.Dot(struct
  include G
  let edge_attributes _ = []
  let default_edge_attributes _ = []
  let get_subgraph _ = None
  let vertex_attributes _ = []
  let vertex_name v = Printf.sprintf "{%s}" (String.concat ", " (SSet.elements v))
  let default_vertex_attributes _ = []
  let graph_attributes _ = []
end)

let _ =
  (* creation of two terms *)
  let a, b = List ([Var "a"], Some "b"), Tuple [Var "a"; Var "b"] in
  (* get strings from terms packed into sets *)
  let avars, bvars = vars a, vars b in
  let g = G.create () in
  G.add_edge g avars bvars;
  Printf.printf "The content is the same: [%s] [%s]\n"
    (String.concat ", " (SSet.elements avars))
    (String.concat ", " (SSet.elements bvars));
  Printf.printf "compare/equal output: %d %b\n"
    (SSet.compare avars bvars)
    (SSet.equal avars bvars);
  Printf.printf "Hash values are different: %d %d\n"
    (Hashtbl.hash avars) (Hashtbl.hash bvars);
  Dot.fprint_graph Format.str_formatter g;
  Printf.printf "Graph representation:\n%s" (Format.flush_str_formatter ())

To compile, type ocamlc -c -I +ocamlgraph demo.ml; ocamlc -I +ocamlgraph graph.cma demo.cmo. When you run the program, you get this output:

The content is the same: [a, b] [a, b]
compare/equal output: 0 true
Hash values are different: 814436103 1017954833
Graph representation:
digraph G {
  {a, b};
  {a, b};


  {a, b} -> {a, b};
  {a, b} -> {a, b};

  }

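In short, I am curious why the sets have unequal hash values and why two identical vertices are created in the graph, even though the sets are the same by every other measure.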

2 answers:

Answer 0 (score: 4)

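I suspect the general answer is that OCaml's built-in hashing is based on the physical structure of a value, whereas set equality is a more abstract notion. If sets are represented as ordered binary trees, there are many trees that represent the same set (as is well known). These will be equal as sets, but may well hash to different values.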

If you want hashing to work for sets, you will probably need to supply your own hash function.
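A minimal sketch of the effect, using only the standard library. The names s1, s2 and hash_set are made up for illustration. The two sets hold the same elements but are built in opposite insertion orders, so the underlying trees have different shapes. The actual Hashtbl.hash values depend on the OCaml version, so none are quoted here.

```ocaml
module SSet = Set.Make (String)

(* Same elements, inserted in opposite orders: the balanced trees differ. *)
let s1 = SSet.add "b" (SSet.singleton "a")
let s2 = SSet.add "a" (SSet.singleton "b")

(* Hypothetical helper: hash the sorted element list, not the tree itself. *)
let hash_set s = Hashtbl.hash (SSet.elements s)

let () =
  (* Set equality ignores the tree shape. *)
  Printf.printf "SSet.equal:       %b\n" (SSet.equal s1 s2);
  (* Polymorphic equality walks the representation, as Hashtbl.hash does. *)
  Printf.printf "structural (=):   %b\n" (s1 = s2);
  Printf.printf "Hashtbl.hash:     %d %d\n" (Hashtbl.hash s1) (Hashtbl.hash s2);
  Printf.printf "hash of elements: %d %d\n" (hash_set s1) (hash_set s2)
```

SSet.elements returns the elements in increasing order, so the list is the same for any two equal sets and hash_set agrees on them. Hashtbl.hash inspects only a bounded number of values, so on large sets this hash may collide more often, but it stays consistent with SSet.equal.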

Answer 1 (score: 1)

As Jeffrey pointed out, the problem seems to be the definition of the hash function, which is part of the Node module.

Changing it to let hash x = Hashtbl.hash (SSet.elements x) solved the problem.
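For reference, here is a sketch of the corrected Node module. To keep it runnable without ocamlgraph, a standard-library Hashtbl.Make table stands in for the graph's vertex storage. That the graph stores vertices in a hash table keyed by Node.hash and Node.equal is an assumption, although the duplicated vertices in the question suggest it. Tbl and count_vertices are names invented for this example.

```ocaml
module SSet = Set.Make (String)

module Node = struct
  type t = SSet.t
  let compare = SSet.compare
  let equal = SSet.equal
  (* Hash the sorted element list so that equal sets get equal hashes. *)
  let hash x = Hashtbl.hash (SSet.elements x)
end

(* Stand-in for the graph's vertex table (an assumption, see above). *)
module Tbl = Hashtbl.Make (Node)

(* Count how many distinct keys the table ends up with. *)
let count_vertices sets =
  let tbl = Tbl.create 16 in
  List.iter (fun s -> Tbl.replace tbl s ()) sets;
  Tbl.length tbl

(* Two equal sets whose trees are built in different orders. *)
let avars = SSet.add "b" (SSet.singleton "a")
let bvars = SSet.add "a" (SSet.singleton "b")

let () =
  Printf.printf "distinct vertices: %d\n" (count_vertices [avars; bvars])
```

With this hash, both sets land in the same bucket and Node.equal identifies them, so the table holds a single key. Passing the same Node module to Graph.Imperative.Digraph.ConcreteBidirectional, as in the question, should merge the two vertices in the same way.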