ElementTree.ParseError when downloading NLTK corpora

Asked: 2016-11-07 05:02:18

Tags: python nltk elementtree centos7

I installed nltk 3.2.1 on a CentOS machine. Now, whenever I try to download any NLTK corpus or model, I get the following error:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/nltk/downloader.py", line 2268, in <module>
    halt_on_error=options.halt_on_error)
  File "/usr/lib/python2.7/site-packages/nltk/downloader.py", line 664, in download
    for msg in self.incr_download(info_or_id, download_dir, force):
  File "/usr/lib/python2.7/site-packages/nltk/downloader.py", line 534, in incr_download
    try: info = self._info_or_id(info_or_id)
  File "/usr/lib/python2.7/site-packages/nltk/downloader.py", line 508, in _info_or_id
    return self.info(info_or_id)
  File "/usr/lib/python2.7/site-packages/nltk/downloader.py", line 875, in info
    self._update_index()
  File "/usr/lib/python2.7/site-packages/nltk/downloader.py", line 825, in _update_index
    ElementTree.parse(compat.urlopen(self._url)).getroot())
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: syntax error: line 1, column 49

Note that I have tried all of the following ways of downloading the NLTK data:

  • nltk.download()
  • nltk.download('all')
  • python -m nltk.downloader all

All of them fail with the same error. Does anyone know why I am getting this error and how I can download the NLTK data? Any help would be much appreciated!

1 answer:

Answer 0 (score: 1)

Let's see: your downloader opens the XML document that lists the available downloads, tries to parse it, and gets an error at this call:

ElementTree.parse(compat.urlopen(self._url)).getroot())

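Either (very unlikely) the NLTK site is no longer compatible with Python 2.7, or you are not actually receiving the expected XML document because something is wrong with your connection. The parser gives up on line 1 of the response, which fits the second explanation. Are you behind a proxy? If not, there may be some other problem with your connection.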
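
One way to check what the downloader is actually receiving is to fetch the index yourself and look at the first bytes. The following is only a minimal sketch: the helper names check_index and fetch_and_check are made up for illustration, and INDEX_URL is an assumption about the default index location, so compare it with nltk.downloader.Downloader.DEFAULT_URL in your own installation.

```python
import xml.etree.ElementTree as ET

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7

# Assumed index location; verify against Downloader.DEFAULT_URL in your install.
INDEX_URL = "https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml"


def check_index(data):
    """Return (ok, detail) for the raw bytes of a downloaded index (hypothetical helper)."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError as err:
        # Include the start of the payload: a proxy login or error page
        # is usually recognizable from its first few bytes.
        return False, "not valid XML (%s); starts with: %r" % (err, data[:80])
    return True, "parsed XML with root element <%s>" % root.tag


def fetch_and_check(url=INDEX_URL, timeout=30):
    """Download the index over the network and run check_index on it."""
    data = urlopen(url, timeout=timeout).read()
    return check_index(data)
```

Calling fetch_and_check() from a Python prompt on the affected machine should show whether the response is the XML index or something else, such as an HTML page returned by a proxy. If a proxy does turn out to be involved, nltk.set_proxy can be used to configure it before calling nltk.download().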
