Type mismatch on vertex RDD

Time: 2016-01-05 19:00:20

Tags: apache-spark spark-graphx

How many properties (key: value pairs) can be stored in a GraphX vertex?

  val vertexArray = Array(
    (1L, ("Name", "Alice"), ("age", 28), ("major", "ECE")),
    (2L, ("Name", "John"), ("age", 23), ("major", "History")),
    (3L, ("Name", "Mark"), ("age", 34), ("major", "Education"))
  )

  val edgeArray = Array(
    Edge(1L, 3L, "cousin"),
    Edge(1L, 2L, "spouse")
  )
  val vertexRDD = sc.parallelize(vertexArray)
  val edgeRDD = sc.parallelize(edgeArray)

  val graph = Graph(vertexRDD, edgeRDD)

The code above fails with the following error when the graph is created.

Error:(28, 21) type mismatch;
 found   : org.apache.spark.rdd.RDD[(Long, (String, String), (String, Int), (String, String))]
 required: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, ?)]
    (which expands to)  org.apache.spark.rdd.RDD[(Long, ?)]
Error occurred in an application involving default arguments.
  val graph = Graph(vertexRDD, edgeRDD)
                    ^

Also, does the vertexId always have to be a Long, or does GraphX also support String vertexIds (for example, if I want to use Java UUIDs)?

2 answers:

Answer 0: (score: 5)

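As the error says, vertexRDD must be of type RDD[(VertexId, ?)]. In other words, it must be an RDD of Tuple2 whose first element is of type VertexId. In your example you created an RDD of Tuple4, which is not valid. To make it valid, wrap the last three elements in a Tuple3, like this: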

 val vertexArray = Array(
  (1L, (("Name", "Alice"), ("age", 28), ("major", "ECE"))),
  (2L, (("Name", "John"), ("age", 23), ("major", "History"))),
  (3L, (("Name", "Mark"), ("age", 34), ("major", "Education"))))
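Nested tuples satisfy the type checker, but the fields become hard to read back. As an alternative, here is a minimal sketch that stores the attributes in a case class, so each vertex is still a Tuple2 of (VertexId, attribute). The `Person` class, the `VertexAttrSketch` object, and the `local[*]` master are assumptions made for illustration; they do not come from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical attribute type, introduced only for this sketch.
case class Person(name: String, age: Int, major: String)

object VertexAttrSketch {
  // Builds a graph whose vertex attribute is a single Person value.
  def buildGraph(sc: SparkContext): Graph[Person, String] = {
    // Every element is a Tuple2: (VertexId, attribute).
    val vertices: RDD[(VertexId, Person)] = sc.parallelize(Seq(
      (1L, Person("Alice", 28, "ECE")),
      (2L, Person("John", 23, "History")),
      (3L, Person("Mark", 34, "Education"))
    ))
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 3L, "cousin"),
      Edge(1L, 2L, "spouse")
    ))
    Graph(vertices, edges)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("vertex-attr-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val graph = buildGraph(sc)
      graph.vertices.collect().sortBy(_._1).foreach { case (id, p) =>
        println(s"$id: ${p.name}, ${p.age}, ${p.major}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

The attribute slot accepts any single type, so the number of properties per vertex is bounded by what that type holds rather than by GraphX itself. A `Map[String, String]` would work the same way if the set of keys varies per vertex.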

To answer your second question: yes, VertexId must be a Long :)
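Since VertexId is an alias for Long, one common workaround for UUID keys is to assign each UUID a Long id and keep the UUID itself inside the vertex attribute. The sketch below shows that idea in plain Scala without a Spark dependency. The `UuidToVertexId` object, the `assignIds` helper, and the local `VertexId` alias are hypothetical names used for illustration only.

```scala
import java.util.UUID

object UuidToVertexId {
  // Local alias mirroring GraphX's `type VertexId = Long`,
  // so this sketch compiles without Spark on the classpath.
  type VertexId = Long

  // Assigns each distinct UUID a Long id in first-seen order.
  def assignIds(keys: Seq[UUID]): Map[UUID, VertexId] =
    keys.distinct.zipWithIndex.map { case (k, i) => k -> i.toLong }.toMap

  def main(args: Array[String]): Unit = {
    val alice = UUID.randomUUID()
    val john = UUID.randomUUID()
    val ids = assignIds(Seq(alice, john, alice))

    // Keep the UUID in the attribute so it can be recovered from the graph later.
    val vertices: Seq[(VertexId, (UUID, String))] = Seq(
      (ids(alice), (alice, "Alice")),
      (ids(john), (john, "John"))
    )
    vertices.foreach(println)
  }
}
```

This in-memory map only suits key sets that fit on the driver. For data already in an RDD, `zipWithUniqueId` can assign Long ids in a distributed way. Hashing a 128-bit UUID down to 64 bits is another option, but it can collide.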

Answer 1: (score: 0)

You need to pass a third argument, "defaultVertexAttr":

val graph = Graph(vertexRDD, edgeRDD, (("Name", ""), ("age", 0), ("major", "")))
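For context, `defaultVertexAttr` is the attribute GraphX gives to vertices that appear in an edge but are missing from the vertex RDD. The vertex RDD itself still has to be an RDD of (VertexId, attribute) pairs. The sketch below illustrates this by leaving vertex 3 out of the vertex RDD on purpose. The `DefaultVertexAttrSketch` object, the `Attr` alias, and the `local[*]` master are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object DefaultVertexAttrSketch {
  // Alias for the nested-tuple attribute shape used in this thread.
  type Attr = ((String, String), (String, Int), (String, String))

  val defaultAttr: Attr = (("Name", ""), ("age", 0), ("major", ""))

  def buildGraph(sc: SparkContext): Graph[Attr, String] = {
    val vertices: RDD[(VertexId, Attr)] = sc.parallelize(Seq[(VertexId, Attr)](
      (1L, (("Name", "Alice"), ("age", 28), ("major", "ECE"))),
      (2L, (("Name", "John"), ("age", 23), ("major", "History")))
    ))
    // Vertex 3 is referenced by an edge but deliberately absent from `vertices`.
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 3L, "cousin"),
      Edge(1L, 2L, "spouse")
    ))
    Graph(vertices, edges, defaultAttr)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("default-attr-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Vertex 3 is printed with the default attribute.
      buildGraph(sc).vertices.collect().sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```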