从stanford corenlp获取令牌索引

时间:2015-09-20 20:59:44

标签: nlp stanford-nlp

我想使用CoreNLP获取令牌的索引。我可以获得所有令牌注释,如POS,NER,但是索引返回null。 这是代码:

import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentenceIndexAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.IndexAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import java.util.Properties;
import edu.stanford.nlp.util.CoreMap;    

protected StanfordCoreNLP pipeline;
private void getPipeline(){
    // creates a StanfordCoreNLP object, with POS tagging, lemmatization, parsing
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, parse");
    pipeline = new StanfordCoreNLP(props);
}

private Annotation annotatedocument(String text) {
    Annotation document = new Annotation(text);

    // run all Annotators on this text
    pipeline.annotate(document);
    return document;    
}

private  void  getAnnotation(String sentence){  
    Annotation annotation = annotatedocument(sentence);
    List<CoreMap> sentences = annotation.get(SentencesAnnotation.class);
    int numSent=sentences.size();
    assert (numSent == 1):"Number of sentences in annotation  is > 1: " + numSent;

    for(CoreMap sentence: sentences) {  
        List<CoreLabel> tokens = sentence.get(TokensAnnotation.class);
        for (CoreLabel token: tokens) { 
            String word = token.get(TextAnnotation.class);
            Integer index= token.get(IndexAnnotation.class);
            String pos=token.getString(PartOfSpeechAnnotation.class);
        }
    }
}

我得到了单词和POS正确但索引是null

0 个答案:

没有答案