Overriding the classpath version of a library that Spark imports by default

Date: 2019-04-05 13:20:05

Tags: apache-spark gradle dependency-management

I am currently working on a project in Spark 2.1.0, and I need to import a library that Spark itself already depends on. Specifically, I want org.roaringbitmap:RoaringBitmap:0.7.42 to replace org.roaringbitmap:RoaringBitmap:0.5.11 (a transitive dependency of both org.apache.spark:spark-core_2.11:2.1.0.cloudera1 and org.apache.spark:spark-sql_2.11:2.1.0.cloudera1).

The dependencies in my build.gradle are as follows:

dependencies {
    compile 'org.apache.spark:spark-core_2.11:2.1.0.cloudera1'
    runtime ('org.apache.spark:spark-core_2.11:2.1.0.cloudera1') {
        exclude group: 'org.roaringbitmap'
    }
    compile 'org.apache.spark:spark-sql_2.11:2.1.0.cloudera1'
    runtime ('org.apache.spark:spark-sql_2.11:2.1.0.cloudera1') {
        exclude group: 'org.roaringbitmap'
    }
    compile 'org.roaringbitmap:RoaringBitmap:0.7.42'
    implementation 'org.roaringbitmap:RoaringBitmap'
    constraints {
        implementation('org.roaringbitmap:RoaringBitmap:0.7.42') {
            because 'because of transitive dependency'
        }
    }
}
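Note that the block above mixes the legacy compile/runtime configurations with implementation and constraints. A more compact way to pin the version across every configuration is Gradle's forced resolution strategy — a sketch of an alternative, not part of the original attempt:

```groovy
configurations.all {
    resolutionStrategy {
        // Force every configuration (compile and runtime alike) to resolve
        // 0.7.42, regardless of what the Spark artifacts request transitively.
        force 'org.roaringbitmap:RoaringBitmap:0.7.42'
    }
}
```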

The output of gradle -q dependencyInsight --dependency org.roaringbitmap indicates that the dependency has indeed been updated:

org.roaringbitmap:RoaringBitmap -> 0.7.42
   variant "default+runtime" [
      org.gradle.status = release (not requested)
      Requested attributes not found in the selected variant:
         org.gradle.usage  = java-api
   ]
\--- compileClasspath

org.roaringbitmap:RoaringBitmap:0.5.11 -> 0.7.42
   variant "default+runtime" [
      org.gradle.status = release (not requested)
      Requested attributes not found in the selected variant:
         org.gradle.usage  = java-api
   ]
\--- org.apache.spark:spark-core_2.11:2.1.0.cloudera1
     +--- compileClasspath
     +--- org.apache.spark:spark-sql_2.11:2.1.0.cloudera1
     |    \--- compileClasspath
     \--- org.apache.spark:spark-catalyst_2.11:2.1.0.cloudera1
          \--- org.apache.spark:spark-sql_2.11:2.1.0.cloudera1 (*)

Unfortunately, when I run the application with spark2-submit, the actual version of the runtime dependency is org.roaringbitmap:RoaringBitmap:0.5.11.

How can I force my application to use the desired version of RoaringBitmap?

2 answers:

Answer 0 (score: 0)

I believe the libraries provided by CDH always take precedence over yours.

You can check this with the following snippet in spark2-shell:

// Print every JAR on the system classpath; on Java 8 (which Spark 2.1 runs on)
// the system class loader is a URLClassLoader, so the cast succeeds.
import java.lang.ClassLoader
val cl = ClassLoader.getSystemClassLoader
cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)

Usually I use the shade plugin to overcome this.
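With Gradle, the equivalent of shading is package relocation via the Shadow plugin — a minimal sketch, assuming a Shadow plugin version compatible with your Gradle version and a hypothetical target package name:

```groovy
plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '5.2.0'
}

shadowJar {
    // Move the RoaringBitmap classes into a private namespace so they cannot
    // collide with the 0.5.11 copy that CDH puts on the cluster classpath.
    // 'myapp.shaded' is a placeholder prefix.
    relocate 'org.roaringbitmap', 'myapp.shaded.org.roaringbitmap'
}
```

Submitting the resulting -all.jar produced by the shadowJar task with spark2-submit should then use the relocated 0.7.42 classes.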

Answer 1 (score: 0)

Spark has options to give the user classpath precedence. See: Classpath resolution between spark uber jar and spark-submit --jars when similar classes exist in both
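Concretely, these are the spark.driver.userClassPathFirst and spark.executor.userClassPathFirst settings, which can be passed on the command line — a sketch; both flags are marked experimental, and the main class and jar name below are placeholders:

```shell
# Give the application's jars precedence over Spark's own classpath.
# com.example.MyApp and myapp.jar are hypothetical.
spark2-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class com.example.MyApp \
  myapp.jar
```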

Most likely you should also look into shading.