How to write Spark DataFrames to a Postgres DB

Date: 2016-09-01 14:58:59

Tags: postgresql scala apache-spark apache-spark-sql spark-dataframe

I am using Spark 1.3.0. Suppose I have a DataFrame in Spark and I need to store it in a Postgres DB (postgresql-9.2.18-1-linux-x64) on a 64-bit Ubuntu machine. I am using postgresql9.2jdbc41.jar as the driver to connect to Postgres.

I am able to read data from the Postgres DB using the commands shown below.


I want to write this DataFrame back to Postgres after doing some processing. My code so far (which only reads the data) is below:

import org.postgresql.Driver

val url = "jdbc:postgresql://localhost/postgres?user=user&password=pwd"
val driver = "org.postgresql.Driver"

// Spark 1.3 JDBC load: read table cdimemployee in 50 partitions,
// splitting on intempdimkey between 0 and 500
val users = sqlContext.load("jdbc", Map(
  "url" -> url,
  "driver" -> driver,
  "dbtable" -> "cdimemployee",
  "partitionColumn" -> "intempdimkey",
  "lowerBound" -> "0",
  "upperBound" -> "500",
  "numPartitions" -> "50"
))

val get_all_emp = users.select("*")
val empDF = get_all_emp.toDF
get_all_emp.foreach(println)
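
Since the question targets Spark 1.3.0, where the DataFrameWriter API (`df.write`) does not yet exist, one way to sketch the write-back is with the DataFrame methods available in 1.3, `createJDBCTable` and `insertIntoJDBC` (both later deprecated in favor of `df.write.jdbc`). The target table name `emp_processed` is a hypothetical placeholder, not from the question:

```scala
// Sketch for Spark 1.3.x only; "emp_processed" is a hypothetical table name.
val writeUrl = "jdbc:postgresql://localhost/postgres?user=user&password=pwd"

// Create the table from the DataFrame's schema and insert its rows
empDF.createJDBCTable(writeUrl, "emp_processed", allowExisting = false)

// Or, if the table already exists, append the rows instead
// empDF.insertIntoJDBC(writeUrl, "emp_processed", overwrite = false)
```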

Any pointers (Scala) would be helpful.

1 Answer:

Answer 0 (score: 1)

You should follow the pattern in the code below.

import java.util.Properties
import org.apache.spark.sql.SaveMode

val database = jobConfig.getString("database")
val url: String = s"jdbc:postgresql://localhost/$database"
val tableName: String = jobConfig.getString("tableName")
val user: String = jobConfig.getString("user")
val password: String = jobConfig.getString("password")
val sql = jobConfig.getString("sql")
val df = sqlContext.sql(sql)  // sqlContext, not sc: SparkContext has no sql method
val properties = new Properties()
properties.setProperty("user", user)
properties.setProperty("password", password)
properties.put("driver", "org.postgresql.Driver")
// DataFrameWriter API (Spark 1.4+); Overwrite replaces the target table
df.write.mode(SaveMode.Overwrite).jdbc(url, tableName, properties)
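
To confirm the write succeeded, the table can be read back through the same JDBC URL and properties. This is a sketch that assumes the `sqlContext`, `url`, `tableName`, and `properties` values from the answer are in scope, and uses the Spark 1.4+ DataFrameReader API:

```scala
// Read the table back to verify the write
val checkDF = sqlContext.read.jdbc(url, tableName, properties)
println(s"rows written: ${checkDF.count()}")
checkDF.show(5)
```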