Accessing an authenticated page using Python Requests

Time: 2014-11-21 04:32:37

Tags: python authentication python-requests scrape

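I'm trying to write a simple scraper to get the usage details for my internet account. I've already written it successfully in PowerShell, but I'd like to move it to Python for ease of use and deployment. If I print r.text (the result of the POST to the login page), I just get the login page form details again.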

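I think the solution might have something to do with using prepare_request? Apologies if I've missed something really obvious; it's been about 5 years since I last touched Python ^^
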
import requests
USERNAME = 'usernamehere'
PASSWORD = 'passwordhere'
loginURL = 'https://myaccount.amcom.com.au/ClientLogin.aspx'
secureURL = 'https://myaccount.amcom.com.au/FibreUsageDetails.aspx'

session = requests.session()
req_headers = {'Content-Type': 'application/x-www-form-urlencoded'}

formdata = {
    'ctl00$MemberToolsContent$txtUsername': USERNAME,
    'ctl00$MemberToolsContent$txtPassword': PASSWORD,
    'ctl00$MemberToolsContent$btnLogin' : 'Login'
}

session.get(loginURL)
r = session.post(loginURL, data=formdata, headers=req_headers, allow_redirects=False)
r2 = session.get(secureURL)

I referred to these threads in my attempts:

HTTP POST and GET with cookies for authentication in python
Authentication and python Requests

PowerShell script for reference:

$r=Invoke-WebRequest -Uri 'https://myaccount.amcom.com.au/ClientLogin.aspx' -UseDefaultCredentials -SessionVariable RequestForm
$r.Forms[0].Fields['ctl00$MemberToolsContent$txtUsername'] = "usernamehere"
$r.Forms[0].Fields['ctl00$MemberToolsContent$txtPassword'] = "passwordhere"
$r.Forms[0].Fields['ctl00$MemberToolsContent$btnLogin'] = "Login"

$response = Invoke-WebRequest -Uri 'https://myaccount.amcom.com.au/ClientLogin.aspx' -WebSession $RequestForm -Method POST -Body $r.Forms[0].Fields -ContentType 'application/x-www-form-urlencoded'
$response2 = Invoke-WebRequest -Uri 'https://myaccount.amcom.com.au/FibreUsageDetails.aspx' -WebSession $RequestForm

1 Answer:

Answer 0: (score: 1)

import requests
import re
from bs4 import BeautifulSoup

user="xyzmohsin"
passwd="abcpassword"

s=requests.Session()
headers={"User-Agent":"Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"}
s.headers.update(headers)

login_url="https://myaccount.amcom.com.au/ClientLogin.aspx"
r=s.get(login_url)
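# Name the parser explicitly so the result does not depend on which parsers are installed
soup=BeautifulSoup(r.content, "html.parser")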
RadMasterScriptManager_TSM=soup.find(src=re.compile("RadMasterScriptManager_TSM"))['src'].split("=")[-1]
EVENTTARGET=soup.find(id="__EVENTTARGET")['value']
EVENTARGUMENT=soup.find(id="__EVENTARGUMENT")['value']
VIEWSTATE=soup.find(id="__VIEWSTATE")['value']
VIEWSTATEGENERATOR=soup.find(id="__VIEWSTATEGENERATOR")['value']


# requests form-encodes the keys itself, so field names use a literal "$" rather than "%24"
data={"RadMasterScriptManager_TSM":RadMasterScriptManager_TSM,
"__EVENTTARGET":EVENTTARGET,
"__EVENTARGUMENT":EVENTARGUMENT,
"__VIEWSTATE":VIEWSTATE,
"__VIEWSTATEGENERATOR":VIEWSTATEGENERATOR,
"ctl00_TopMenu_RadMenu_TopNav_ClientState":"",
"ctl00$MemberToolsContent$HiddenField_Redirect":"",
"ctl00$MemberToolsContent$txtUsername":user,
"ctl00$MemberToolsContent$txtPassword":passwd,
"ctl00$MemberToolsContent$btnLogin":"Login"}

headers={"Content-Type":"application/x-www-form-urlencoded",
"Host":"myaccount.amcom.com.au",
"Origin":"https://myaccount.amcom.com.au",
"Referer":"https://myaccount.amcom.com.au/ClientLogin.aspx"}

r=s.post(login_url,data=data,headers=headers)
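
One thing to watch with the form field names: when a dict is passed as data, requests form-encodes the keys and values itself, so the names should be given with a literal "$". The sketch below uses the standard library's urlencode (as far as I know, the same style of encoding requests applies) to show what happens to a raw key versus a key that has already been percent-encoded. The variable names are only for illustration.

```python
from urllib.parse import urlencode

# Field name as it appears in the page source
raw_key = "ctl00$MemberToolsContent$txtUsername"
# The same name, already percent-encoded by hand
pre_encoded_key = "ctl00%24MemberToolsContent%24txtUsername"

# Encoding the raw key turns "$" into "%24" exactly once
encoded_once = urlencode({raw_key: "usernamehere"})
# Encoding the pre-encoded key also escapes the "%", giving "%2524"
encoded_twice = urlencode({pre_encoded_key: "usernamehere"})

print(encoded_once)   # ctl00%24MemberToolsContent%24txtUsername=usernamehere
print(encoded_twice)  # ctl00%2524MemberToolsContent%2524txtUsername=usernamehere
```

With the doubly encoded form, the server would most likely not recognise the field names and would simply serve the login form again.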

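I don't have a username and password, so I couldn't test the headers in the final POST request. If it doesn't work, remove Host, Origin and Referer from the headers of the final POST request.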
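
If the set of hidden fields on the page turns out to differ from the ones listed above, a more general option is to copy every input from the login page and only overwrite the username and password, which is roughly what the PowerShell script does with $r.Forms[0].Fields. Below is a minimal sketch of that idea using only the standard library parser instead of BeautifulSoup. InputCollector, collect_inputs and sample_html are hypothetical names and data made up for illustration; in practice the HTML would come from the GET on the login page.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the name/value pair of every named <input> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        if name:
            # Inputs without a value attribute are sent as empty strings
            self.fields[name] = attrs.get("value") or ""


def collect_inputs(html):
    """Return a dict of all named input fields found in the HTML text."""
    parser = InputCollector()
    parser.feed(html)
    return parser.fields


# Made-up stand-in for the login page; the real page will have more fields
sample_html = """
<form method="post" action="ClientLogin.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
  <input type="text" name="ctl00$MemberToolsContent$txtUsername" />
  <input type="password" name="ctl00$MemberToolsContent$txtPassword" />
  <input type="submit" name="ctl00$MemberToolsContent$btnLogin" value="Login" />
</form>
"""

formdata = collect_inputs(sample_html)
formdata["ctl00$MemberToolsContent$txtUsername"] = "usernamehere"
formdata["ctl00$MemberToolsContent$txtPassword"] = "passwordhere"
print(sorted(formdata))
```

The resulting dict can then be passed as data to the session's post call in place of the hand-built one.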

Hope that helps :-)