Multiple SparkContexts in the same JVM

Date: 2016-01-19 14:30:16

Tags: java apache-spark jvm

Following up on my last question, I have to define multiple SparkContexts in a single JVM.

I did it in the following way (using Java):

SparkConf conf = new SparkConf();
conf.setAppName("Spark MultipleContest Test");
conf.set("spark.driver.allowMultipleContexts", "true");
conf.setMaster("local");

After that I created the following source code:

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

and later in the code:

JavaSparkContext ctx = new JavaSparkContext(conf);
JavaRDD<Row> testRDD = ctx.parallelize(AllList);

After executing the code I got the following error message:

16/01/19 15:21:08 WARN SparkContext: Multiple running SparkContexts detected in the same JVM!
org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:81)
test.MLlib.BinarryClassification.main(BinaryClassification.java:41)
    at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$1.apply(SparkContext.scala:2083)
    at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$1.apply(SparkContext.scala:2065)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkContext$.assertNoOtherContextIsRunning(SparkContext.scala:2065)
    at org.apache.spark.SparkContext$.setActiveContext(SparkContext.scala:2151)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:2023)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at test.MLlib.BinarryClassification.main(BinaryClassification.java:105)

The numbers 41 and 105 are the lines where the two objects are defined in the Java code. My question is: since I already set spark.driver.allowMultipleContexts with the set method, is it possible to run multiple SparkContexts in the same JVM, and if so, how?

3 Answers:

Answer 0 (Score: 13)

Are you sure you need the JavaSparkContext as a separate context? The previous question you refer to doesn't say so. If you already have a SparkContext, you can create a new JavaSparkContext from it rather than creating a separate context:

SparkConf conf = new SparkConf();
conf.setAppName("Spark MultipleContest Test");
conf.set("spark.driver.allowMultipleContexts", "true");
conf.setMaster("local");

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

// Create a Java context, which is the same as the Scala one under the hood
JavaSparkContext jsc = JavaSparkContext.fromSparkContext(sc);
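
The resulting JavaSparkContext wraps the same underlying context, so the parallelize call from the question can run against it without triggering the multiple-contexts check. A minimal sketch, reusing the AllList variable from the question:

JavaRDD<Row> testRDD = jsc.parallelize(AllList);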

Answer 1 (Score: 5)

A SparkContext is already running by default, so you have to stop this context with sc.stop(); then you can continue without any problem.
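
A minimal sketch of that sequence in Java, reusing the conf and the snippets from the question:

SparkContext sc = new SparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
// ... work with sc and sqlContext ...
sc.stop(); // release the running context before creating another one
JavaSparkContext ctx = new JavaSparkContext(conf); // now the only active context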

Answer 2 (Score: 0)

Rather than using SparkContext, you should use the builder method on SparkSession, which instantiates the Spark and SQL contexts more strictly and ensures that there is no context conflict.

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("demo").getOrCreate()
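
SparkSession is available from Spark 2.0 onwards. For the Java code in the question, a rough equivalent sketch (reusing the appName and master values from the question) could look like this:

import org.apache.spark.SparkContext;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        .appName("Spark MultipleContest Test")
        .master("local")
        .getOrCreate();
// the session exposes the single underlying contexts,
// so no allowMultipleContexts workaround is needed
SparkContext sc = spark.sparkContext();
SQLContext sqlContext = spark.sqlContext();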