How to use RCurl to log in to this website and download a file

Time: 2015-06-09 19:21:13

Tags: r rcurl

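I need to write code that downloads data files from a website that requires a login. I thought this would be easy, but I am having trouble logging in programmatically.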

I tried following the steps listed in this post:
How to login and then download a file from aspx web pages with R

But when I reach the second-to-last step, I get an error message:
Error: Internal Server Error

So I am trying to write RCurl code to log in to the site and then download the file. Here is what I have tried:

# Install RCurl once if it is not already available, then load it.
# install.packages("RCurl")
library(RCurl)

# One handle shared by every request so the session cookies persist.
curl <- getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', ssl.verifypeer = FALSE,
           followlocation = TRUE, autoreferer = TRUE, curl = curl)

# The same URL is used for the initial GET and for the login POST.
export_url <- 'https://research.valueline.com/secure/f2/export?params=[{appId:%27com_2_4%27,%20context:{%22Symbol%22:%22GT%22,%22ListId%22:%22recent%22}}]'

# Fetch the page and pull the __VIEWSTATE value out of the login form.
html <- getURL(export_url, curl = curl)
viewstate <- as.character(sub('.*id="__VIEWSTATE" value="([0-9a-zA-Z+/=]*).*', '\\1', html))

# Form fields posted back to the server.
params <- list(
    'ctl00$ContentPlaceHolder$LoginControl$txtUserID' = '<myusername>',
    'ctl00$ContentPlaceHolder$LoginControl$txtUserPw' = '<mypassword>',
    'ctl00$ContentPlaceHolder$LoginControl$btnLogin'  = 'Sign In',
    '__VIEWSTATE' = viewstate
    )

html <- postForm(export_url, .params = params, curl = curl)

# TRUE if the returned page contains "Logout", which would suggest the login worked.
grepl('Logout', html)
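
One thing worth checking in the attempt above: sub() returns its input unchanged when the pattern does not match, so if the fetched page has no __VIEWSTATE field, viewstate silently becomes the entire HTML string and gets posted back as is. Below is a minimal sketch of a stricter extraction, run against a made-up HTML snippet rather than the real site. The helper name extract_hidden_field, the sample markup, and the idea that the login form may also carry an __EVENTVALIDATION field are all assumptions for illustration; whether any of this fixes the login is not confirmed.

```r
# Hypothetical sample of an ASP.NET login form; the real page may differ.
sample_html <- paste0(
  '<form method="post" action="Login.aspx">',
  '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ow==" />',
  '<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWAgKc+abc/123=" />',
  '<input type="text" name="ctl00$ContentPlaceHolder$LoginControl$txtUserID" />',
  '</form>'
)

# Hypothetical helper: return the value attribute of the <input> whose id
# equals field_id, or NA_character_ when no such field is found.
extract_hidden_field <- function(html, field_id) {
  pattern <- sprintf('id="%s"[^>]*value="([^"]*)"', field_id)
  hit <- regmatches(html, regexec(pattern, html))[[1]]
  if (length(hit) < 2) NA_character_ else hit[2]
}

viewstate       <- extract_hidden_field(sample_html, "__VIEWSTATE")
eventvalidation <- extract_hidden_field(sample_html, "__EVENTVALIDATION")
absent          <- extract_hidden_field(sample_html, "__VIEWSTATEGENERATOR")

# Keep only the hidden fields that were actually present on the page.
hidden <- list('__VIEWSTATE' = viewstate,
               '__EVENTVALIDATION' = eventvalidation,
               '__VIEWSTATEGENERATOR' = absent)
hidden <- hidden[!is.na(unlist(hidden))]
print(names(hidden))
```

If a field comes back NA for the real page, the response from getURL is probably not the login form, and printing the first few hundred characters of html would show what the server actually sent.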

0 answers:

No answers