python请求库中的https代理支持

时间:2012-10-22 05:15:50

标签: python https proxy urllib2 python-requests

我使用python Requests库来做HTTP相关的事情。我在我的计算机上使用免费的ntlmaps设置代理服务器,作为代理来回答公司ISA服务器的NTLM挑战。但是,响应似乎总是空的,如下所示:

>>> import requests
>>> r = requests.get('https://www.google.com')
>>> r.text
u'<HTML></HTML>\r\n'

虽然http请求中没有这样的问题。而且,当我使用urllib2库时,它可以得到正确的响应。我比较了使用'Requests'和'urllib2'库之间的消息差异,发现'Requests'使用'GET'而'urllib2'使用'CONNECT',如下面所示的原始消息所示(第一个是'Requests'库)。有人知道是否有任何解决方案?这是“请求”库的错误吗?提前谢谢。

22.10.2012 11:01:41 Version 0.9.9.0.1
*** Got client request header.
*** Client header:
=====
GET https://www.google.com/ HTTP/1.1
Host: www.google.com
Proxy-Connection: Keep-Alive
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/0.14.1 CPython/2.7.2 Darwin/12.1.0

*** Client request header does not have 'Content-Length' or 'Transfer-Encoding' parameter and it must not have any body.
*** Replacing values in client header...Done.
*** New client header:
=====
GET https://www.google.com/ HTTP/1.1
Host: www.google.com
Proxy-Connection: Keep-Alive
Accept-Encoding: gzip, deflate, compress
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, */*
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)

*** Connecting to remote server...(10.220.15.36:9000)...Done.
*** Sending client request header to remote server...Done.
*** Got remote server response header.
*** Remote server header:
=====
HTTP/1.0 200 OK
Content-Type: text/html
Refresh: 0; URL=https://www.google.com/

*** Could not find server 'Content-Length' parameter.
*** Authentication routine started.
*** Authentication not required.
*** Authentication routine finished.
*** Sending remote server response header to client...Done.
*** Sent 15 bytes to client. (all - 0, len - 0)
*** Remote server closed connection. (Server buffer - 0 bytes)
*** No server's data to send to the client. (server's buffer - 0 bytes)
*** Termination conditions detected (remote server closed connection). Stop Request issued.
*** Finishing procedure started.
*** Closing thread...Done.

'urllib2'库发送的消息:

22.10.2012 11:03:49 Version 0.9.9.0.1
*** Got client request header.
*** Client header:
=====
CONNECT www.google.com:443 HTTP/1.0

*** Client request header does not have 'Content-Length' or 'Transfer-Encoding' parameter and it must not have any body.
*** Replacing values in client header...Done.
*** New client header:
=====
CONNECT www.google.com:443 HTTP/1.0
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, */*
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)

*** Connecting to remote server...(10.220.15.36:9000)...Done.
*** Sending client request header to remote server...Done.
*** Got remote server response header.
*** Remote server header:
=====
HTTP/1.1 407 Proxy Authentication Required ( The ISA Server requires authorization to fulfill the request. Access to the Web Proxy service is denied. )
Via: 1.1 LASISA2
Proxy-Authenticate: Negotiate
Proxy-Authenticate: Kerberos
Proxy-Authenticate: NTLM
Connection: close
Proxy-Connection: close
Pragma: no-cache
Cache-Control: no-cache
Content-Type: text/html
Content-Length: 718

*** Server 'Content-Length' found to be 718.
*** Authentication routine started.
*** Got Error 407 - "Proxy authentication required".
*** Authentication methods allowed: Negotiate, Kerberos, NTLM
*** Using NTLM authentication method.
*** Authorization in progress...
*** Closing connection to the remote server...Done.
*** Building environment for NTLM.
*** Using custom NTLM flags: 06820000
*** NTLM version with LM response only.
*** NTLM Domain/Host/User: IGTMASTER/BEATLES.LOCAL/TFSBVTVA
*** NTLM hashed passwords found.
*** Environment has been built successfully.
*** Connecting to remote server...(10.220.15.36:9000)...Done.
*** Resetting remote server status...Done. (Server buffer - 651 bytes)
*** Remote server buffer flushed.
*** Fake NTLM header with Msg1:
=====
CONNECT www.google.com:443 HTTP/1.0
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, */*
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Proxy-Connection: Keep-Alive
Proxy-Authorization: NTLM TlRMTVNTUAABAAAABoIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAAAAAAwAAAA

*** Sending Fake NTLM header with Msg1...Done.
*** There must be no body to send.
*** Waiting for message 2 from remote server...
*** Got remote server response header.
*** Remote server header:
=====
HTTP/1.1 407 Proxy Authentication Required ( Access is denied. )
Via: 1.1 LASISA2
Proxy-Authenticate: NTLM TlRMTVNTUAACAAAACQAJADgAAAAGgoECnmQdttSFW6oAAAAAAAAAAJAAkABBAAAABQLODgAAAA9JR1RNQVNURVICABIASQBHAFQATQBBAFMAVABFAFIAAQAOAEwAQQBTAEkAUwBBADIABAAaAGkAcwAuAGEAZAAuAGkAZwB0AC4AYwBvAG0AAwAqAGwAYQBzAGkAcwBhADIALgBpAHMALgBhAGQALgBpAGcAdAAuAGMAbwBtAAUAFABhAGQALgBpAGcAdAAuAGMAbwBtAAAAAAA=
Connection: Keep-Alive
Proxy-Connection: Keep-Alive
Pragma: no-cache
Cache-Control: no-cache
Content-Type: text/html
Content-Length: 0

*** Server 'Content-Length' found to be 0.
*** Got NTLM message 2 from remote server.
*** Resetting remote server status...Done. (Server buffer - 0 bytes)
*** Remote server buffer flushed.
*** Sending Fake NTLM header (not body) with Msg3...Done.
*** Fake NTLM header with Msg3:
=====
CONNECT www.google.com:443 HTTP/1.0
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, */*
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Proxy-Authorization: NTLM TlRMTVNTUAADAAAAGAAYAF4AAAAAAAAAdgAAAAkACQBAAAAACAAIAEkAAAANAA0AUQAAAAAAAAB2AAAABoIAAElHVE1BU1RFUlRGU0JWVFZBQkVBVExFUy5MT0NBTMwaDvCTdLkOsE7vD6Tog1RoolpOLnh4WQ==

*** End of NTLM authorization process.
*** Authentication routine finished.
*** Got remote server response header.
*** Remote server header:
=====
HTTP/1.1 200 Connection established
Proxy-Connection: close
Connection: close
Via: 1.1 LASISA2

*** Remote server response to the 'CONNECT' request. It must not have any body.
*** Authentication routine started.
*** Authentication not required.
*** Authentication routine finished.
*** Sending remote server response header to client...Done.
*** Lowered authentication flags down. As the code is neither 401 nor 407.
*** Successful 'CONNECT' request detected. Going to tunnel mode.
*** Resetting client status...Done. (Client buffer - 114 bytes)
*** Resetting remote server status...Done. (Server buffer - 0 bytes)
*** Request completed.
*** Tunnelled 114 bytes to remote server.
*** Tunnelled 1725 bytes to client.
*** Tunnelled 186 bytes to remote server.
*** Tunnelled 47 bytes to client.
*** Tunnelled 142 bytes to remote server.
*** Tunnelled 4096 bytes to client.
*** Tunnelled 248 bytes to client.
*** Tunnelled 2076 bytes to client.
*** Tunnelled 4096 bytes to client.
*** Tunnelled 1198 bytes to client.
*** Remote server closed connection. (Server buffer - 0 bytes)
*** Termination conditions detected (remote server closed connection). Stop Request issued.
*** Finishing procedure started.
*** Closing thread...Done.

2 个答案:

答案 0 :(得分:1)

https 代理应该使用“CONNECT”urllib2是故意这样做的。 CONNECT建立用于HTTPS的安全传输的隧道。

答案 1 :(得分:0)

据我了解,这是urllib3中的一个错误,它要求在引擎盖下使用。请参阅此错误报告:https://github.com/shazow/urllib3/issues/50