Logging in to a website with Python requests

Time: 2019-09-02 19:06:32

Tags: javascript html python-3.x web-scraping python-requests

Website:

https://account.reverso.net/login/context.reverso.net/it?utm_source=contextweb&utm_medium=usertopmenu&utm_campaign=login

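As far as I can tell, the site generates a cookie AccessToken (visible in the headers of the GET request) to authorize access, but how is that string generated? (I think it happens in analytics.js, but I don't know JavaScript.)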

I tried blindly copying the headers I found in the Network tab, but that obviously does not work:

# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup


hds = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'
}


# Headers copied verbatim from the browser's Network tab
headerlog = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'it-IT,it;q=0.9,en-US;q=0.8,en;q=0.7',
    'Connection': 'keep-alive',
    'Cookie': 'ASP.NET_SessionId=0werogy0wfai4sv53bngxvg0; reverso.net.LanguageInterface=it; experiment_translator_gws5LzeuR=1; reverso.net.newpv=0; cookiescriptload=http%3A%2F%2Fwww.reverso.net%2Ftext_translation.aspx%3Flang%3DIT; _ga=GA1.2.201389114.1567443945; _gid=GA1.2.1927785107.1567443945; _gat=1; _fbp=fb.1.1567443945191.1175342048; __qca=P0-1108934771-1567443945206; reverso.net.ReversoAccessToken=pKXId9aSZt3uw1uTuhno5gv1uexLHKJ6j-oDuimbZXJlFux0_76ffV8Ft6J5O2i5KpQ_kGWLRzqomPWdE2H-2ToCzM5en3p2J65ra1AADXBHMRWl9jVBj8U39XhiL-6yrnDH6_rXyW79bHfgNzEGjpjcrBOzSBHYbrTlM6yJF_tKyPM6tm4-KuNOOMSBPeHLdC-Z5eAV_HJ9-iz3af44lO7EDv-ww3__zn9P12qrz4ozfySUuZxOC41TXwJqUq5y9JAfJ60W-1NFVx-K0x33NpxigVOAoRtHEJXzgBHhpY9HJbMhLOCh1YDqXUg8RZE195hNE_n9UgOPAo8c-Z--xkJR_YM; reverso.net.ReversoRefreshToken=415871da-6d90-4b9e-9e99-6290f952383d; reverso.net.DeviceId=e00b0c24-06f8-4379-9571-8b275cbbf9c7',
    # Note: this Host value names www.reverso.net, while the request that
    # uses these headers goes to context.reverso.net
    'Host': 'www.reverso.net',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'
}



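with requests.Session() as s:
    url = 'https://account.reverso.net/login/context.reverso.net/it?utm_source=contextweb&utm_medium=usertopmenu&utm_campaign=login'
    # Fetch the login page with a browser-like User-Agent
    r = s.get(url, headers=hds)
    soup = BeautifulSoup(r.content, 'html.parser')
    print(soup)

    # Request the history page with the headers copied from the browser
    r = s.get('https://context.reverso.net/history', headers=headerlog)
    # Parse the second response; otherwise the login page is printed twice
    soup = BeautifulSoup(r.content, 'html.parser')
    print(soup)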

1 answer:

Answer 0: (score: 1)

You only need to POST to the URL:

s.post(login_url, "Email=xxx@y.com&Password=zzzzz", headers={"content-type": "application/x-www-form-urlencoded"})

If that works, they will send back a session cookie and you are logged in.
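
The idea above can be sketched a little more fully. This is only a rough illustration, not a confirmed recipe for this site: the helper names `log_in` and `has_cookie` are hypothetical, the field names `Email` and `Password` are taken from the one-liner above, and whether the site expects extra hidden form fields is unknown. Passing a dict as `data` lets requests URL-encode the form and set the content type itself, and a `requests.Session` keeps any cookies the server sends, so no `Cookie` header needs to be copied by hand.

```python
def log_in(session, login_url, email, password):
    """POST the credentials as a URL-encoded form and return the response.

    `session` is expected to behave like requests.Session. The field names
    "Email" and "Password" are assumptions taken from the answer above.
    """
    form = {"Email": email, "Password": password}
    # A dict passed as `data` is form-encoded by requests, which also sets
    # Content-Type: application/x-www-form-urlencoded.
    response = session.post(login_url, data=form)
    # Raise on 4xx/5xx so a failed request is not mistaken for a login.
    response.raise_for_status()
    return response


def has_cookie(session, name):
    """Return True if the session's cookie jar holds a cookie called `name`."""
    return name in session.cookies.get_dict()


# Usage (needs the `requests` package and real credentials; LOGIN_URL is a
# placeholder for the login endpoint, which is not confirmed here):
#
#   import requests
#   with requests.Session() as s:
#       log_in(s, LOGIN_URL, "xxx@y.com", "zzzzz")
#       # Assumption: a successful login sets the access-token cookie
#       # seen in the question's headers.
#       if has_cookie(s, "reverso.net.ReversoAccessToken"):
#           r = s.get("https://context.reverso.net/history")
#           print(r.status_code)
```

A 200 response alone does not prove the login succeeded, since many sites return the login page again with an error message. Checking for the expected cookie, or for content that only appears when logged in, is a more reliable signal.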
