How can we write Sanskrit grammar rules for parsing?

Time: 2016-10-13 04:50:40

Tags: python unicode nltk codec

How can we write Sanskrit grammar rules for parsing with NLTK in Python? Is there a tagged corpus available in Python NLTK?
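For orientation, here is a minimal sketch of what a small Sanskrit grammar could look like, assuming Python 3 with NLTK installed. The rule set, the preterminal name V, and the sample sentence are illustrative assumptions, not a linguistically complete grammar. Terminals are written as plain quoted strings, because a u'' prefix is Python literal syntax and not part of NLTK's grammar notation.

```python
from nltk import CFG, ChartParser

# Illustrative toy grammar (an assumption, not a full Sanskrit grammar).
# Verbs get their own preterminal V, and the object precedes the verb.
grammar = CFG.fromstring("""
S -> NP VP
VP -> NP V | V
NP -> NN | NNP
NN -> 'बालः' | 'पुस्तकं' | 'कागदम्'
NNP -> 'हरिं'
V -> 'भजति' | 'अधावत्' | 'अर्चयन्ति'
""")

# Whitespace tokenization is enough for this toy sentence.
tokens = 'बालः हरिं भजति'.split()

parser = ChartParser(grammar)
trees = list(parser.parse(tokens))
for tree in trees:
    print(tree)
```

Every token must match a terminal in the grammar exactly, so stray spaces inside a quoted terminal (for example 'हरिं ' with a trailing space) would keep that word from ever matching.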

I tried writing a grammar the usual way:

grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP ->  NN JJ| NNP VP| 'I'
VP -> V NP | VP PP
NN -> u'बालः' | u'पुस्तकं'|u'कागदम्'
VP -> u'भजति'|u'अधावत्' |u'अर्चयन्ति' 
NNP -> u'हरिं '
""")

But it returns the following error:

File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 519, in fromstring
    encoding=encoding)
  File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 1245, in read_grammar
    lines = input.split('\n')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 76: ordinal not in range(128)
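The byte 0xe0 in this message is the first UTF-8 byte of a Devanagari character. Under Python 2 a plain triple-quoted literal is a byte string, and the traceback suggests that NLTK ends up decoding it implicitly with the ASCII codec. The sketch below reproduces the same kind of failure in Python 3 using only the standard library; the variable names are illustrative.

```python
# Encode one of the words from the grammar to UTF-8 bytes.
word = 'बालः'
raw = word.encode('utf-8')

# Indexing bytes in Python 3 yields an int; Devanagari code points
# (U+0900 to U+097F) all start with the byte 0xe0 in UTF-8.
first_byte = raw[0]

# Decoding those bytes as ASCII fails, as in the traceback.
try:
    raw.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding with the correct codec round-trips cleanly.
decoded = raw.decode('utf-8')

print(hex(first_byte))
print(error_message)
```

If this diagnosis applies, the usual ways out are to pass NLTK a unicode string rather than a byte string under Python 2, or to run the code under Python 3, where string literals are Unicode by default.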

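I started with Python 3, but even though the nltk package is installed, it raises the error "ImportError: No module named 'nltk'". Can anyone tell me how to install NLTK for Python 3, and why this error message appears?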
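One common cause of this ImportError is that the package was installed for a different interpreter (for example Python 2.7, as the paths in the traceback suggest) than the one running the script. The sketch below checks whether a module is importable from the current interpreter and prints an install command tied to that interpreter. The helper name module_available is hypothetical, and whether this is the actual cause here is an assumption.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the running interpreter."""
    return importlib.util.find_spec(name) is not None


print("Interpreter:", sys.executable)

if module_available("nltk"):
    print("nltk is importable from this interpreter")
else:
    # Installing via "<interpreter> -m pip" targets this exact Python,
    # rather than whichever pip happens to be first on PATH.
    print("Try:", sys.executable, "-m pip install nltk")
```

Running this script with the same python3 command used for the failing program shows which installation is actually being used.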

0 answers:

No answers