Why doesn't urllib2 work for me?

Asked: 2010-10-22 13:36:46

Tags: python ubuntu urllib2

I have three different Python scripts on my Ubuntu 10.04 32-bit machine, running with Python 2.6.5.

All of them use urllib2, and I always get this error:

urllib2.URLError: <urlopen error [Errno 110] Connection timed out>

Why?

Example:

>>> import urllib2
>>> response = urllib2.urlopen("http://www.google.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1161, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1136, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>



>>> response = urllib2.urlopen("http://search.twitter.com/search.atom?q=hello&rpp=10&page=1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1161, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1136, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>

Update:

$ ping google.com
PING google.com (72.14.234.104) 56(84) bytes of data.
64 bytes from google.com (72.14.234.104): icmp_seq=1 ttl=54 time=25.3 ms
64 bytes from google.com (72.14.234.104): icmp_seq=2 ttl=54 time=24.6 ms
64 bytes from google.com (72.14.234.104): icmp_seq=3 ttl=54 time=25.1 ms
64 bytes from google.com (72.14.234.104): icmp_seq=4 ttl=54 time=25.0 ms
64 bytes from google.com (72.14.234.104): icmp_seq=5 ttl=54 time=23.9 ms
^C
--- google.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4003ms
rtt min/avg/max/mdev = 23.959/24.832/25.365/0.535 ms


$ w3m http://www.google.com
w3m: Can't load http://www.google.com.

$ telnet google.com 80
Trying 1.0.0.0...
telnet: Unable to connect to remote host: Connection timed out

Update 2:

I'm at home, using a router and an access point. But I just noticed that Firefox doesn't work for me either; Chrome, Synaptic, and other browsers such as Midori and Epiphany do work.

Update 3:

>>> useragent = 'Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Ubuntu/10.04 Chromium/6.0.472.62 Chrome/6.0.472.62 Safari/534.3)'
>>> request = urllib2.Request('http://www.google.com/')
>>> request.add_header('User-agent', useragent )
>>> urllib2.urlopen(request)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1161, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1136, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>

Update 4:

>>> socket.setdefaulttimeout(50)
>>> urllib2.urlopen('http://www.google.com')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1161, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1136, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>

Update 5:

Wireshark (packet sniffer) results:

Firefox:http://bit.ly/chtynm

Chrome:http://bit.ly/9ZjILK

Midori:http://bit.ly/cKilC4

Midori is another browser that works for me. Only Firefox doesn't.

8 Answers:

Answer 0 (score: 4)

As has been suggested, rule out problems with your network setup first.

First, check whether you can ping the host you're trying to reach:

$ ping www.google.com

Then try an HTTP connection using, for example, w3m:

$ w3m http://www.google.com
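
If ping succeeds but both w3m and urllib2 time out, as in your updates, it can also help to check whether a plain TCP connection to port 80 works at all, since ping only exercises ICMP. A minimal diagnostic sketch (my addition, not part of the original answer):

import socket

host = 'www.google.com'
ip = socket.gethostbyname(host)          # does DNS resolve correctly?
print 'resolved %s to %s' % (host, ip)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(10)                         # fail fast instead of waiting minutes
try:
    s.connect((ip, 80))                  # can we reach TCP port 80 at all?
    print 'TCP connect to port 80 succeeded'
finally:
    s.close()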

Answer 1 (score: 3)

The only reason I can think of right now, XRobot, is that they don't trust you.

Do they? Do they :))

When you want to do some crawling or scraping and you find out they don't trust you, you just have to deceive them. How?

First of all, you should know that some web servers contain filters against malware such as robots (maybe they know you're a robot, hmm, XRobot :)). How do they do it? There are many ways to filter: using a CAPTCHA in the page, filtering by user agent, and so on.

Because your ICMP ping works and the Chrome browser works but w3m doesn't, I suggest you change the user agent, like this:

import urllib2

user_agent = ('Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.10) Gecko/20100915 '
              'Ubuntu/10.04 (lucid) Firefox/3.6.10')

request = urllib2.Request('http://www.google.com/')
request.add_header('User-agent', user_agent)

opener = urllib2.build_opener()  # build a default opener; the original snippet left it undefined
response = opener.open(request)

Maybe I'm being paranoid here, but I hope this helps you :)

Answer 2 (score: 1)

Which URL are you trying to connect to? This error can have several causes, most of them related to a wrong name or IP address, or a problem with the link to the remote host.

Answer 3 (score: 1)

It sounds like Chrome and Synaptic may be using an HTTP proxy. In Chromium, go to Options / Under the Hood / Change proxy settings. Check the GNOME proxy settings with:

$ gconftool-2 -R /system/proxy
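
If gconftool-2 does report a proxy, you can make urllib2 use the same one explicitly. A rough sketch (the proxy host and port below are placeholders; substitute whatever your desktop is actually configured with):

import urllib2

# placeholder proxy address: replace with the host/port reported by gconftool-2
proxy_handler = urllib2.ProxyHandler({'http': 'http://proxy.example.com:8080'})
opener = urllib2.build_opener(proxy_handler)
urllib2.install_opener(opener)   # later urllib2.urlopen calls go through the proxy

response = urllib2.urlopen('http://www.google.com')
print response.read(100)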

Answer 4 (score: 0)

Have you tested your network connection? Something on the other end isn't responding, either because the connection is being dropped or refused.

Also, post the Python version you're using.

Update:

This is almost certainly a network problem. I also have an Ubuntu 10.04 (32-bit) machine with Python 2.6.5, a nearly pristine install, and I can't reproduce the problem.

Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> response = urllib2.urlopen("http://www.google.com")
>>> print response.read(100)
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><

Answer 5 (score: 0)

Follow these steps one by one:

  1. Check that you are connected and that the connection is working: ping google.com
  2. If everything is fine and your internet connection is simply slow, then do this:

     import socket
     socket.setdefaulttimeout(300)  # in seconds

  3. This lengthens the socket timeout (an alternative per-call approach is sketched below).
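
As an alternative to changing the global default (my addition, not part of the original answer), urllib2.urlopen in Python 2.6 also accepts a per-call timeout in seconds:

import urllib2

# per-call timeout: only this request waits up to 300 seconds
response = urllib2.urlopen('http://www.google.com', timeout=300)
print response.read(100)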

Answer 6 (score: 0)

I ran into similar behavior. Eventually I remembered that I had previously run a script that installed a proxy. Removing the proxy from urllib2 solved my problem. That doesn't explain the mystery with your telnet or w3m, but it may help anyone who has the urllib2 part of the problem.

This page helped me figure out how to remove the proxy:

http://www.decalage.info/en/python/urllib2noproxy

Here is the code:

import urllib2

proxy_handler = urllib2.ProxyHandler({})      # empty dict: use no proxies at all
opener = urllib2.build_opener(proxy_handler)
urllib2.install_opener(opener)                # affects subsequent urllib2.urlopen calls
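
Note the empty dictionary: with no argument (or None), ProxyHandler picks up proxy settings from environment variables such as http_proxy, whereas an empty dict disables proxies entirely, so later urllib2.urlopen calls connect directly.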

Answer 7 (score: 0)

I think there is some problem with permissions. I had the same problem on Ubuntu 11.10, and calling Python with sudo did the trick for me. Give it a try:

jeffisabelle:~ $ python
Python 2.7.2+ (default, Oct  4 2011, 20:03:08) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import urllib2
>>> response = urllib2.urlopen("http://www.google.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 394, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 412, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1201, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1171, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>


jeffisabelle:~ $ sudo python
Python 2.7.2+ (default, Oct  4 2011, 20:03:08) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> response = urllib2.urlopen("http://www.google.com")
>>>