Creating an SSH tunnel to another computer from R to access a postgreSQL table

Asked: 2016-07-05 21:03:13

Tags: r ssh

As part of the R workflow for one of my projects, I load data from a postgreSQL table located on a remote server.

My code looks like this (credentials anonymized).

I first open an ssh connection to the remote server in a terminal.

ssh -p Port -L LocalPort:IP:RemotePort servername

Then I connect to the postgres database in R.

# Load the RPostgreSQL package
library("RPostgreSQL")

# Create a connection
Driver <- dbDriver("PostgreSQL") # Establish database driver
Connection <- dbConnect(Driver, dbname = "DBName", host = "localhost", port = LocalPort, user = "User")

# Download the data
Data<-dbGetQuery(Connection,"SELECT * FROM remote_postgres_table")

This approach works fine, and I can download the data without any problems.

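However, I would like to perform the first step, creating the ssh connection, in R rather than in the terminal. Here is my attempt, followed by the error it produces.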

# Open the ssh connection in R
system("ssh -T -p Port -L LocalPort:IP:RemotePort servername")

# Load the RPostgreSQL package
library("RPostgreSQL")

# Create a connection
Driver <- dbDriver("PostgreSQL") # Establish database driver
Connection <- dbConnect(Driver, dbname = "DBName", host = "localhost", port = LocalPort, user = "User")

# Download the data
Data<-dbGetQuery(Connection,"SELECT * FROM remote_postgres_table")

Error in postgresqlExecStatement(conn, statement, ...) : 
RS-DBI driver: (could not Retrieve the result : server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

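To clarify my question: I want to run the whole workflow (establishing the connection and downloading the postgreSQL data) entirely in R, without performing any step in a terminal.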

3 Answers:

Answer 0 (score: 3)

Following @r2evans's suggestion.

##### Starting the Connection #####
# Start the ssh connection to server "otherhost"
system2("ssh", c("-L8080:localhost:80", "-N", "-T", "otherhost"), wait=FALSE)
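The placeholders from the question (Port, LocalPort, IP, RemotePort, servername) can be mapped onto this call roughly as follows. This is only a sketch: `build_tunnel_args` is a hypothetical helper name, and the port numbers and host names are made-up example values. Because `wait = FALSE` makes `system2()` return immediately, the R session stays free to open the database connection afterwards.

```r
# Hypothetical helper: assemble the argument vector for an ssh local port forward.
build_tunnel_args <- function(server, local_port, target_host, target_port,
                              ssh_port = 22) {
  c(
    "-N",                          # forward only, do not run a remote command
    "-T",                          # do not allocate a pseudo-terminal
    "-p", as.character(ssh_port),  # port the ssh server listens on
    "-L", sprintf("%d:%s:%d",
                  as.integer(local_port), target_host, as.integer(target_port)),
    server
  )
}

# Example values are assumptions, not taken from the question.
args <- build_tunnel_args("servername", local_port = 5433,
                          target_host = "localhost", target_port = 5432)
print(args)

# To actually open the tunnel (needs ssh on the PATH and non-interactive login):
# system2("ssh", args, wait = FALSE)
```

Passing the arguments as a character vector keeps each option separate and avoids pasting one long command string by hand.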

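You can kill the process either manually, by finding and entering its pid, or automatically, by killing every pid that matches your server name. Note that you should only use the latter version if your server name is relatively unique and unlikely to appear in other processes.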

##### Killing the Connection: Manually #####
# To end the connection, find the pid of the process
system2("ps",c("ax | grep otherhost"))
# Kill pid (x) identified by the previous grep.
tools::pskill(x)

##### Killing the Connection: Automatically #####
# To end the connection, find the pid of the process
GrepResults<-system2("ps",c("ax | grep otherhost"),stdout=TRUE)
# Parse the pids from your grep into a numeric vector
# (trimws() drops the leading spaces ps puts before short pids)
Processes<-as.numeric(sub(" .*","",trimws(GrepResults)))
# Kill all pids identified in the grep
tools::pskill(Processes)
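The automatic version can be made a little more selective by filtering the `ps` lines in R instead of killing everything that grep returns. The sketch below is one possible way to do it; `find_tunnel_pids` is a hypothetical helper, and the sample `ps ax` lines are invented for illustration.

```r
# Hypothetical helper: pull out the pids of ssh processes that mention `server`
# from lines in the format printed by `ps ax`.
find_tunnel_pids <- function(ps_lines, server) {
  lines <- trimws(ps_lines)
  # Keep ssh processes that mention the server, but drop the grep process itself.
  keep <- grepl(server, lines, fixed = TRUE) &
    grepl("ssh", lines, fixed = TRUE) &
    !grepl("grep", lines, fixed = TRUE)
  # The pid is the first whitespace-delimited field of each line.
  as.integer(sub("[[:space:]].*$", "", lines[keep]))
}

# Made-up sample of `ps ax` output, for illustration only.
sample_ps <- c(
  "  812 ?        Ss     0:00 /usr/sbin/cron",
  " 4021 pts/0    S      0:00 ssh -L8080:localhost:80 -N -T otherhost",
  "14022 pts/0    S+     0:00 grep otherhost"
)
pids <- find_tunnel_pids(sample_ps, "otherhost")
print(pids)

# With real output this could be used as:
# pids <- find_tunnel_pids(system2("ps", "ax", stdout = TRUE), "otherhost")
# tools::pskill(pids)
```

The same caveat applies as above: any other ssh session to the same server would also match, so a unique server name still matters.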

Answer 1 (score: 1)

As an alternative, you can use plink together with shell.

library(RPostgreSQL)
drv  <- dbDriver("PostgreSQL")

cmd<- paste0(
  "plink ",
  # use key and run in background process
  " -i ../.ssh/id_rsa -N -batch  -ssh",
  # port forwarding
  " -L 5432:127.0.0.1:5432",
  # location of db
  " user@db.com"
)

shell( cmd, wait=FALSE)
# sleep for a while so that the connection can be established
Sys.sleep(5)

conn <- dbConnect(
  drv,
  host = "127.0.0.1",
  port=5432,
  dbname="mydb",
  password = "pass"
)

dbListTables(conn)
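Instead of a fixed `Sys.sleep(5)`, one option is to poll the forwarded local port until it accepts connections. The sketch below uses only base R; `wait_for_port` is a hypothetical helper name, and the timeout and interval values are arbitrary assumptions. Note that it only checks that something is listening locally; it does not prove that the database behind the tunnel is reachable.

```r
# Hypothetical helper: poll until a TCP connection to host:port succeeds,
# or give up once `timeout` seconds have passed.
wait_for_port <- function(host, port, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    con <- tryCatch(
      suppressWarnings(
        socketConnection(host, port, open = "r+", blocking = TRUE, timeout = 1)
      ),
      error = function(e) NULL  # connection refused or timed out
    )
    if (!is.null(con)) {
      close(con)
      return(TRUE)
    }
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Example: wait up to 2 seconds for the forwarded port used above.
ready <- wait_for_port("127.0.0.1", 5432, timeout = 2)
print(ready)
```

If `ready` is `FALSE`, the tunnel most likely did not start, and calling `dbConnect()` would fail as well.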

Answer 2 (score: 0)

A solution that uses only R packages:

cmd <- 'ssh::ssh_tunnel(ssh::ssh_connect(host = "aster@lovizaim.avangardpc.ru:22", passwd = "m1C5jOZy"), port = 5555, target = "127.0.0.1:3306")'
pid <- sys::r_background(
    std_out = FALSE,
    std_err = FALSE,
    args = c("-e", cmd)
)
con <- DBI::dbConnect(
    drv = RMariaDB::MariaDB(),
    host = "127.0.0.1",
    port = 5555,
    user = "user",
    password = "pass",
    dbname = "db"
)
# do something
DBI::dbDisconnect(con)

The tunnel is created with the sys and ssh packages.

See also this comment.