Python: HTTPError: HTTP Error 403: Bad Behavior

Time: 2017-05-26 15:42:53

Tags: python web-scraping http-error

I am trying to read a web page so that I can extract content from it. The code is below.

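from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "http://www.sanjamar.com/product-categories/bar/bar-tools/"
html = urlopen(url).read()  # this is the call that raises HTTPError 403
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly
print(soup)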

The last time I used this code, on a different website, it worked. This time it throws the following error.

HTTPError                                 Traceback (most recent call last)
<ipython-input-83-ccdefd422a61> in <module>()
      1 url = "http://www.sanjamar.com/product-categories/bar/bar-tools/"
----> 2 html = urlopen(url).read()
      3 soup = BeautifulSoup(html)
      4 print(soup)

C:\Users\Santosh\Anaconda3\lib\urllib\request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):

C:\Users\Santosh\Anaconda3\lib\urllib\request.py in open(self, fullurl, data, timeout)
    530         for processor in self.process_response.get(protocol, []):
    531             meth = getattr(processor, meth_name)
--> 532             response = meth(req, response)
    533 
    534         return response

C:\Users\Santosh\Anaconda3\lib\urllib\request.py in http_response(self, request, response)
    640         if not (200 <= code < 300):
    641             response = self.parent.error(
--> 642                 'http', request, response, code, msg, hdrs)
    643 
    644         return response

C:\Users\Santosh\Anaconda3\lib\urllib\request.py in error(self, proto, *args)
    568         if http_err:
    569             args = (dict, 'default', 'http_error_default') + orig_args
--> 570             return self._call_chain(*args)
    571 
    572 # XXX probably also want an abstract factory that knows when it makes

C:\Users\Santosh\Anaconda3\lib\urllib\request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result

C:\Users\Santosh\Anaconda3\lib\urllib\request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648 class HTTPDefaultErrorHandler(BaseHandler):
    649     def http_error_default(self, req, fp, code, msg, hdrs):
--> 650         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    651 
    652 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Bad Behavior
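To see what the server actually sent back, rather than only the last line of the traceback, the exception can be caught and inspected. The following is a minimal sketch: `summarize_http_error` and `fetch` are hypothetical helper names introduced here for illustration, while `code`, `reason`, and `headers` are attributes of `urllib.error.HTTPError`.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def summarize_http_error(err):
    """Collect the status code, reason phrase, and response headers of an HTTPError."""
    headers = dict(err.headers.items()) if err.headers is not None else {}
    return {"code": err.code, "reason": err.reason, "headers": headers}


def fetch(url):
    """Return the page body as bytes, or None after printing the HTTP error details."""
    try:
        with urlopen(url) as response:
            return response.read()
    except HTTPError as err:
        # For the request above this would report code 403 and reason "Bad Behavior".
        print(summarize_http_error(err))
        return None
```

Calling `fetch(url)` with the URL from the question would print the details instead of a traceback. The headers may hint at what rejected the request.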

My guess is that the website is blocking Python. If that is not the cause, please suggest a solution.
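One way to test that guess: by default `urlopen` identifies itself with a `User-Agent` of the form `Python-urllib/x.y`, which some servers reject. The sketch below sends a browser-like header instead. It assumes the 403 is triggered by the default `User-Agent`, which the traceback does not confirm. `build_request`, `fetch_html`, and the header value are illustrative choices, not something the site documents.

```python
from urllib.request import Request, urlopen

# Assumption: a generic browser-like identifier; the exact value is illustrative only.
BROWSER_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Create a Request that overrides urllib's default 'Python-urllib/x.y' User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url):
    """Download the page body as bytes using the custom User-Agent header."""
    with urlopen(build_request(url)) as response:
        return response.read()
```

If `fetch_html(url)` still raises `HTTPError` 403, the server is probably filtering on something other than the `User-Agent`, and this change alone will not help.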

Thanks

0 answers:

No answers.