Python Requests or mechanize to log in to a site

Time: 2013-09-04 22:55:55

Tags: python mechanize python-requests mechanize-python

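I want to apologize in advance. I know this has probably been asked many times and I'm just beating a dead horse, but I'd really like to know how to make it work. I'm trying to use Python's Requests module to log in to a website and verify that the login succeeded. I'm also using BeautifulSoup in my code to find some strings that I need in order to build the request.

I'm confused about how to form the headers properly. What exactly needs to go in the header information?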
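
For reference, here is a minimal sketch of one way to see which headers Requests fills in on its own before anything is added by hand. It builds a POST without sending it. The URL and form fields are placeholders for illustration, not the real site's values.

```python
import requests

session = requests.Session()

# Placeholder URL and form fields, used only for illustration.
request = requests.Request(
    'POST',
    'http://example.com/index.php',
    data=[('username', 'myUsername'), ('password', 'myPassword')],
)

# prepare_request() merges the session defaults into the request
# and encodes the form body; nothing is sent over the network.
prepared = session.prepare_request(request)

for name, value in prepared.headers.items():
    print(name + ': ' + value)
```

The output should include a default User-Agent, Accept, Accept-Encoding and Connection, plus a Content-Type and Content-Length derived from the form data. Whether a particular site also checks headers such as Origin or Referer depends on that site, so this only shows the baseline.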

import requests
from bs4 import BeautifulSoup

# A Session stores cookies, so those set by the first GET are sent with the POST.
session = requests.Session()
requester = session.get('http://iweb.craven.k12.nc.us/index.php')
# Name the parser explicitly so the result does not depend on what is installed.
soup = BeautifulSoup(requester.text, 'html.parser')
ps = soup.find_all('input')

def getCookieInfo():
    """Return [value of the hidden 'return' field, name of the input that follows it]."""
    result = []
    for item in ps:
        # .get() avoids a KeyError on inputs that lack a name or type attribute.
        if item.get('name') == 'return' and item.get('type') == 'hidden':
            strcom = item.get('value', '')
            # find_next() skips the whitespace text nodes between the two tags.
            sibling = item.find_next('input').get('name')
            result.append(strcom)
            result.append(sibling)
    return result

cookiedInfo = getCookieInfo()
payload = [
    ('username', 'myUsername'),
    ('password', 'myPassword'),
    ('Submit', 'Log in'),
    ('option', 'com_users'),
    ('task', 'user.login'),
    ('return', cookiedInfo[0]),
    (cookiedInfo[1], '1'),
]

headers = {
    'Connection': 'keep-alive',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Origin': 'http://iweb.craven.k12.nc.us',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)',
}

# Submit the login form, then fetch the front page again with the same session.
r = session.post('http://iweb.craven.k12.nc.us/index.php', data=payload, headers=headers)
r = session.get('http://iweb.craven.k12.nc.us')
soup = BeautifulSoup(r.text, 'html.parser')
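
The hidden-field lookup above can also be tried offline. The following is only a sketch: the HTML is made up for illustration (the real page may be laid out differently), and `hidden_fields` is a hypothetical helper name. Instead of relying on the position of one input relative to another, it collects every named hidden input in the form.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; the real login page may differ.
SAMPLE_HTML = """
<form action="/index.php" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="option" value="com_users">
  <input type="hidden" name="task" value="user.login">
  <input type="hidden" name="return" value="aW5kZXgucGhw">
  <input type="hidden" name="0123456789abcdef" value="1">
</form>
"""

def hidden_fields(html):
    """Return (name, value) pairs for every named hidden input in the first form."""
    form = BeautifulSoup(html, 'html.parser').find('form')
    return [
        (field['name'], field.get('value', ''))
        for field in form.find_all('input', type='hidden')
        if field.get('name')
    ]

# Start from the credentials and append whatever hidden fields the form carries.
payload = [('username', 'myUsername'), ('password', 'myPassword')]
payload += hidden_fields(SAMPLE_HTML)
print(payload)
```

Collecting the fields this way means the payload does not need to hard-code 'option' or 'task', at the cost of trusting that the first form on the page is the login form.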

If using the mechanize module would be better or more Pythonic, I'm open to suggestions.

0 answers:

No answers