Correctly opening and closing a file with urllib2.urlopen()

Time: 2010-10-07 10:31:45

Tags: python exception-handling urllib2 pys60

I have the following code in a Python script:

  try:
    # send the query request
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
    sf.close()
  except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return

I'm worried that if I get an error on sf.read(), sf.close() will never be called. I tried putting sf.close() in a finally block, but if urlopen() itself raises, there is no file to close, and I get an exception (a NameError, since sf was never assigned) in the finally block!
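This is exactly what happens; a minimal sketch in modern Python syntax, with a hypothetical open_resource() standing in for a failing urlopen():

```python
def open_resource():
    # hypothetical stand-in for urllib2.urlopen() failing on a bad URL
    raise IOError("connection failed")

try:
    try:
        sf = open_resource()   # raises before sf is ever bound
        data = sf.read()
    finally:
        sf.close()             # NameError: 'sf' was never assigned
except NameError as err:
    outcome = "the finally block itself raised: %s" % err
```

The original IOError is masked by the NameError raised inside the finally block, which is the behaviour described above.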

Then I tried:

  try:
    with urllib2.urlopen(search_query) as sf:
      search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
  except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return

But this raises an invalid-syntax error on the with... line. What's the best way to handle this? I feel stupid!

As a commenter pointed out, I'm using PyS60, which is Python 2.5.4.

8 Answers:

Answer 0 (score: 17)

I would use contextlib.closing (combined with from __future__ import with_statement on older Python versions):

from contextlib import closing

with closing(urllib2.urlopen('http://blah')) as sf:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())

Or, if you want to avoid the with statement:

sf = None
try:
    sf = urllib2.urlopen('http://blah')
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
finally:
    if sf:
        sf.close()

though it's less elegant.

Answer 1 (score: 8)

finally:
    if sf: sf.close()

Answer 2 (score: 6)

Why not just try to close sf, and pass if it doesn't exist?

import urllib2
try:
    search_query = 'http://blah'
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except urllib2.URLError, err:
    print(err.reason)
finally:
    try:
        sf.close()
    except NameError: 
        pass

Answer 3 (score: 1)

Given that you tried to use with, you should be on Python 2.5, in which case this also applies: http://docs.python.org/tutorial/errors.html#defining-clean-up-actions
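The clean-up pattern from that tutorial section, sketched here with a hypothetical Resource class standing in for the urllib2 handle:

```python
class Resource(object):
    """Hypothetical stand-in for the file-like object returned by urlopen()."""
    def __init__(self):
        self.closed = False
    def read(self):
        return "payload"
    def close(self):
        self.closed = True

sf = Resource()
try:
    data = sf.read()
finally:
    sf.close()   # runs whether or not read() raised
```

The key point is that the try begins only after sf is successfully bound, so the finally can never hit an unassigned name.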

Answer 4 (score: 1)

If urlopen() raises an exception, catch it and call the exception's close() function, like this:

try:
    req = urllib2.urlopen(url)
    req.close()
    print 'request {0} ok'.format(url)
except urllib2.HTTPError, e:
    e.close()
    print 'request {0} failed, http code: {1}'.format(url, e.code)
except urllib2.URLError, e:
    print 'request {0} error, error reason: {1}'.format(url, e.reason)

The exception is also a full response object; see this issue report: http://bugs.jython.org/issue1544

Answer 5 (score: 0)

It looks like the problem goes deeper than I thought - this forum thread indicates that urllib2 didn't implement with until after Python 2.6, and possibly not until 3.1.
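For context, the with statement only works on objects that implement the context-manager protocol (__enter__/__exit__), which older urllib2 handles lacked. A quick way to check any object, shown with a file-like StringIO (which supports it) versus a bare object (which doesn't):

```python
import io

def supports_with(obj):
    # an object is usable in a with statement iff its type defines both hooks
    return hasattr(type(obj), "__enter__") and hasattr(type(obj), "__exit__")

has_hooks = supports_with(io.StringIO("x"))   # file-like object: True
no_hooks = supports_with(object())            # bare object: False
```

This is why contextlib.closing (Answer 0) is needed: it wraps an arbitrary object with a close() method in those two hooks.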

Answer 6 (score: 0)

You can create your own generic URL opener:

from contextlib import contextmanager

@contextmanager
def urlopener(inURL):
    """Open a URL and yield the fileHandle then close the connection when leaving the 'with' clause."""
    fileHandle = urllib2.urlopen(inURL)
    try:     yield fileHandle
    finally: fileHandle.close()

Then you can use syntax like that in your original question:

with urlopener(theURL) as sf:
    search_soup = BeautifulSoup.BeautifulSoup(sf.read())

This solution lets you keep concerns cleanly separated: you get a clean, generic urlopener syntax that handles the complexities of correctly closing the resource, regardless of what errors occur under your with clause.

Answer 7 (score: 0)

Why not use multiple try/except blocks?

try:
    # send the query request
    sf = urllib2.urlopen(search_query)
except urllib2.URLError, url_error:  # 'except X, e' syntax, for Python 2.5 compatibility
    sys.stderr.write("Error requesting url: %s\n" % (search_query,))
    raise

try:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except Exception, err: # Maybe catch more specific Exceptions here
    sys.stderr.write("Couldn't get programme information from url: %s\n" % (search_query,))
    raise # or return as in your original code
finally:
    sf.close()