Tagging full names in Stanford NER

Time: 2014-09-15 07:39:47

Tags: php named-entity-recognition

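I am trying to tag a full name as one complete token (a single person) instead of as separate tokens. Here is an example.

http://nlp.stanford.edu:8080/ner/process (Stanford NER online)

Example sentence: Muhammad Ali was a great boxer. Ali's greatest rival was Joe Frazier. The name can also be written as M. Ali and J. Frazier.

Here is my existing PHP code: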

$text = "Muhammad Ali was a great boxer. Ali's greatest rival was Joe Frazier. The name can also be written as M. Ali and J. Frazier";

$pos = new \StanfordNLP\NERTagger(
          'XPATH/NER/StanfordNLP/stanford-ner-2013-11-12/classifiers/english.conll.4class.distsim.crf.ser.gz',
          'XPATH/NER/StanfordNLP/stanford-ner-2013-11-12/stanford-ner.jar'
);
$result = $pos->tag(explode(' ', " $text")); 

foreach ($result as $eType)
{
    // Each $eType is a [word, tag] pair returned by the tagger.
    if ($eType[1] === 'PERSON')
    {
        echo "Word ".$eType[0]." of Stanford entity type PERSON<br>";
    }
}

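1 answer:

Answer 0: (score: 2)

Never mind, I was able to solve it myself. Basically, I focused on combining a word with the previous one whenever the previous word is also of entity type PERSON. Here is the code I came up with: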

<?php
        ini_set('max_execution_time', 300); //300 seconds = 5 minutes
        require "./php_aho_corasick-master/AhoCorasickPHP-master/AhoCorasick.php";
        require "./php_aho_corasick-master/AhoCorasickPHP-master/TreeNodes.php";
        include_once('AlchemyAPI/alchemyapi.php');
        include_once('TextStatistics/TextStatistics.php');

        require './NER/StanfordNLP/Base.php';
        require './NER/StanfordNLP/Exception.php';
        require './NER/StanfordNLP/Parser.php';
        require './NER/StanfordNLP/StanfordTagger.php';
        require './NER/StanfordNLP/NERTagger.php';
        $text= "Muhammad Ali was a great boxer, Ali's greatest rival was Joe Frazier, The name can also be written as M. Ali and J. Frazier.";
        $pos = new \StanfordNLP\NERTagger(
          'C:/wamp/www/GoogleResultsParserTopK/NER/StanfordNLP/stanford-ner-2013-11-12/classifiers/english.conll.4class.distsim.crf.ser.gz',
          'C:/wamp/www/GoogleResultsParserTopK/NER/StanfordNLP/stanford-ner-2013-11-12/stanford-ner.jar'
        );            
        $a="Answer not found";
        //$pos->setJavaPath('C:/Program Files/Java/jdk1.7.0_45/bin');
        $result = $pos->tag(explode(' ', " $text")); 

        var_dump($result);


        // Join consecutive PERSON tokens into one name; separate distinct names with " or ".
        $previousType = "";
        $FullName = "";
        $i = 0;
        foreach ($result as $eType) {
            // Debug output: token index and the names collected so far.
            echo $i." ".$FullName."<br>";
            $i++;
            if ($eType[1] === 'PERSON') {
                if ($FullName === "") {
                    // First PERSON token seen: start the result with it.
                    $FullName = $eType[0];
                } elseif ($previousType === 'PERSON') {
                    // Previous token was also a PERSON: extend the current name.
                    $FullName = $FullName." ".$eType[0];
                } else {
                    // A new name begins after a non-PERSON token.
                    $FullName = $FullName." or ".$eType[0];
                }
            }
            $previousType = $eType[1];
        }
        echo $FullName;

?>
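
The same idea (merging adjacent tokens that carry the PERSON tag) can also be sketched so that each full name comes back as its own array element instead of one string joined with " or ". This is only a minimal sketch under some assumptions: `groupEntities` is a hypothetical helper, not part of the StanfordNLP wrapper, and the `$sample` array is hand-written in the assumed `[word, tag]` shape rather than real tagger output. It needs PHP 7.1 or later for the array destructuring in `foreach`.

```php
<?php
// Hypothetical helper (not part of the StanfordNLP wrapper): merges runs of
// consecutive tokens that share the target tag into one string per entity.
function groupEntities(array $tagged, string $type = 'PERSON'): array
{
    $names = [];    // completed entities
    $current = [];  // words of the entity currently being built

    foreach ($tagged as [$word, $tag]) {
        if ($tag === $type) {
            // Still inside a run of matching tokens: keep collecting.
            $current[] = $word;
            continue;
        }
        // A non-matching token ends the current run, if there is one.
        if ($current) {
            $names[] = implode(' ', $current);
            $current = [];
        }
    }

    // Flush a run that reaches the end of the input.
    if ($current) {
        $names[] = implode(' ', $current);
    }

    return $names;
}

// Hand-written sample in the assumed [word, tag] shape; not real tagger output.
$sample = [
    ['Muhammad', 'PERSON'], ['Ali', 'PERSON'], ['was', 'O'], ['a', 'O'],
    ['great', 'O'], ['boxer', 'O'], ['Joe', 'PERSON'], ['Frazier', 'PERSON'],
];

print_r(groupEntities($sample));
```

For this sample the helper returns two elements, "Muhammad Ali" and "Joe Frazier". With real data you would pass the `$result` of `$pos->tag(...)` instead of `$sample`, assuming it has the same pair structure that the code above reads via `$eType[0]` and `$eType[1]`.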