Question

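All -

I'm running Stanford CoreNLP 3.4.1 with the Spanish models. I have a directory of about 100 raw Spanish text documents, UTF-8 encoded. For each one, I run the following command line: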

java -cp stanford-corenlp-3.4.1.jar:stanford-spanish-corenlp-2014-08-26-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-0.23.jar -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props <propsfile> -file <txtfile>

The props file looks like this:

annotators = tokenize, ssplit, pos
tokenize.language = es
pos.model = edu/stanford/nlp/models/pos-tagger/spanish/spanish-distsim.tagger

For almost every file, I get the following error:

Exception in thread "main" java.lang.RuntimeException: Error annotating :
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$15.run(StanfordCoreNLP.java:1287)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:1347)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.run(StanfordCoreNLP.java:1389)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1459)
Caused by: java.lang.NullPointerException
    at edu.stanford.nlp.tagger.maxent.ExtractorSpanishStrippedVerb.extract(ExtractorFramesRare.java:1626)
    at edu.stanford.nlp.tagger.maxent.Extractor.extract(Extractor.java:153)
    at edu.stanford.nlp.tagger.maxent.TestSentence.getExactHistories(TestSentence.java:465)
    at edu.stanford.nlp.tagger.maxent.TestSentence.getHistories(TestSentence.java:440)
    at edu.stanford.nlp.tagger.maxent.TestSentence.getHistories(TestSentence.java:428)
    at edu.stanford.nlp.tagger.maxent.TestSentence.getExactScores(TestSentence.java:377)
    at edu.stanford.nlp.tagger.maxent.TestSentence.getScores(TestSentence.java:372)
    at edu.stanford.nlp.tagger.maxent.TestSentence.scoresOf(TestSentence.java:713)
    at edu.stanford.nlp.sequences.ExactBestSequenceFinder.bestSequence(ExactBestSequenceFinder.java:91)
    at edu.stanford.nlp.sequences.ExactBestSequenceFinder.bestSequence(ExactBestSequenceFinder.java:31)
    at edu.stanford.nlp.tagger.maxent.TestSentence.runTagInference(TestSentence.java:322)
    at edu.stanford.nlp.tagger.maxent.TestSentence.testTagInference(TestSentence.java:312)
    at edu.stanford.nlp.tagger.maxent.TestSentence.tagSentence(TestSentence.java:135)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.tagSentence(MaxentTagger.java:998)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.doOneSentence(POSTaggerAnnotator.java:147)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.annotate(POSTaggerAnnotator.java:110)
    at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:67)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:847)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$15.run(StanfordCoreNLP.java:1275)

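Any ideas? I haven't even started tracing this. I'm sure the problem is in the POS tagger; tokenize and ssplit run fine.

P.S. Please don't say "upgrade to 3.5.0"; I don't currently have Java 8 installed and would rather not install it.

Thanks in advance.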

Answer 1

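Yes, there appears to be a bug in the 3.4.1 Spanish models.

The Spanish 3.5.0 models actually appear to be compatible with Java 7. You can download the models used in 3.5.0 (stanford-spanish-corenlp-2014-10-23-models.jar) and put them on your classpath. This fixed the problem for me, running Java 7 locally.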
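
As a rough sketch of that workaround, the command from the question could be rerun with only the Spanish models jar swapped out. This assumes all jars sit in the current directory, that the props file from the question is used unchanged, and that the 3.5.0 models jar replaces the 3.4.1 one on the classpath rather than being added alongside it. PROPS_FILE and DOCS_DIR are hypothetical names introduced here for illustration.

```shell
# Assumption: every jar is in the current directory.
# Only the Spanish models jar differs from the command in the question.
MODELS_JAR=stanford-spanish-corenlp-2014-10-23-models.jar
CP="stanford-corenlp-3.4.1.jar:${MODELS_JAR}:xom.jar:joda-time.jar:jollyday.jar:ejml-0.23.jar"

# Hypothetical locations; point these at your own props file and documents.
PROPS_FILE="${PROPS_FILE:-spanish.properties}"
DOCS_DIR="${DOCS_DIR:-docs}"

# Run the pipeline once per document, as in the question.
for txt in "$DOCS_DIR"/*.txt; do
  [ -e "$txt" ] || continue   # skip the unexpanded glob when no files match
  java -cp "$CP" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP \
    -props "$PROPS_FILE" -file "$txt"
done
```

Removing the 2014-08-26 models jar from the classpath avoids any ambiguity about which copy of the tagger model gets loaded.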

NullPointerException with Stanford NLP Spanish POS tagging
