Connecting R and Hive (on Spark)

Date: 2018-05-09 10:09:00

Tags: r apache-spark jdbc hive

I am trying to connect R to Hive (on Spark). On my desktop (Windows 10, R 3.4.2) it works fine, but on the R server (Linux, R 3.4.4) I get an error:

library(rJava)
library(RJDBC)
driver <- JDBC("org.apache.hive.jdbc.HiveDriver", "~/Drivers/Spark/hive-jdbc-1.2.1-spark2-amzn-0.jar",identifier.quote="`")
url <- "jdbc:hive2://<MyIP>:10001"
conn <- dbConnect(driver, url) 
Error in .jcall(drv@jdrv,"Ljava/sql/Connection;", "connect", as.character(url)[1],  : java.lang.NoClassDefFoundError: org/apache/http/client/CookieStore
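The `NoClassDefFoundError` for `org/apache/http/client/CookieStore` means the JVM cannot find the Apache HttpComponents `httpclient` jar: here only the Hive JDBC jar itself was put on the classpath, not its dependencies. A minimal sketch of the situation, using a temporary directory and a hypothetical file name in place of the real driver directory:

```shell
# Illustration only: a driver directory containing just the JDBC jar.
dir=$(mktemp -d)
touch "$dir/hive-jdbc-1.2.1-spark2-amzn-0.jar"

# CookieStore is defined in httpclient; if no httpclient jar sits next to
# the driver jar, the class cannot be resolved at connect time.
ls "$dir" | grep -q 'httpclient' || echo 'httpclient jar missing'
```

The same check against the real `~/Drivers/Spark/` directory would show which dependency jars are absent.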

What is wrong?

1 answer:

Answer 0 (score: 2)

I found a solution:

library(rJava)
library(RJDBC)

options(java.parameters = '-Xmx256m')

# Put every jar in the driver directory on the classpath, not just the Hive
# JDBC jar itself; this pulls in the driver's dependencies (including
# httpclient, which provides org/apache/http/client/CookieStore) and so
# resolves the NoClassDefFoundError.
hadoop_jar_dirs <- c('//home//ubuntu//spark-jdbc//')
clpath <- c()
for (d in hadoop_jar_dirs) {
  clpath <- c(clpath, list.files(d, pattern = 'jar', full.names = TRUE))
}
.jinit(classpath = clpath)
.jaddClassPath(clpath)

hive_jdbc_jar <- 'hive-jdbc-1.2.1-spark2-amzn-0.jar'
hive_driver <- 'org.apache.hive.jdbc.HiveDriver'
hive_url <- 'jdbc:hive2://<MyIP>:10001'
drv <- JDBC(hive_driver, hive_jdbc_jar)
conn <- dbConnect(drv, hive_url)
show_databases <- dbGetQuery(conn, "show databases")
show_databases
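The key difference from the failing code is the `list.files(d, pattern = 'jar', ...)` loop, which gathers every jar in the directory. A sketch of what that collection step selects, using a temporary directory and hypothetical dependency jar names (the real directory and versions may differ):

```shell
# Illustration only: a driver directory with the JDBC jar, two dependency
# jars, and an unrelated file.
dir=$(mktemp -d)
touch "$dir/hive-jdbc-1.2.1-spark2-amzn-0.jar" \
      "$dir/httpclient-4.5.6.jar" \
      "$dir/httpcore-4.4.10.jar" \
      "$dir/notes.txt"

# Mirrors list.files(d, pattern = 'jar'): only jar files are kept,
# so httpclient and httpcore land on the classpath alongside the driver.
ls "$dir" | grep 'jar' | sort
```

With the dependency jars included this way, the driver can load `org.apache.http.client.CookieStore` and the connection succeeds.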