Generating a parse tree from a parse description

Time: 2012-05-18 06:41:10

Tags: parsing nlp text-processing

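I want to generate a parse tree (a Java object) from the parse description of an English sentence (the compact form of its syntactic analysis). I am working in Java, so I also need to define a suitable tree class. For example, given the description: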

    (ROOT (S (NP (PRP I)) (VP (MD would) (VP (VB love) (S (VP (TO to) (VP (VB go) (PRT (RP out)) (PP (IN with) (NP (PRP you)))))))) (. .))

1 answer:

Answer 0 (score: 0)

I finally solved this myself :)

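// Requires java.util.ArrayList and a Node class offering addChild(String) and getParent().
// Entry point: builds the tree under an artificial "TOP" node and collects the leaves (words).
public static Node getParseTree(String[] parseTokens, ArrayList<Node> leafNodeList)
{
    Node top = new Node("TOP");
    // Start at index 2: tokens 0 and 1 are "(" and the outermost label (e.g. ROOT), which TOP replaces.
    getParseTree(parseTokens, 2, top, false, leafNodeList);
    return top;
}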

// Recursive worker: consumes tokens from currIndex onward, attaching new nodes beneath lastNode.
// closeBrace is true when the previous step already closed a node.
public static Node getParseTree(String[] parseTokens, int currIndex, Node lastNode, boolean closeBrace, ArrayList<Node> leafNodeList)
{
    if(currIndex>=parseTokens.length) return lastNode; // all tokens consumed
    else if("(".equals(parseTokens[currIndex]))
    {
        // The token after "(" is the label of the new node
        Node newNode = lastNode.addChild(parseTokens[currIndex+1]);
        return getParseTree(parseTokens, currIndex+2, newNode, false, leafNodeList);
    }
    else if(")".equals(parseTokens[currIndex]))
    {
        // Normal case: this ")" closes lastNode, so continue from its parent
        if(closeBrace) return getParseTree(parseTokens, currIndex+1, lastNode.getParent(), true, leafNodeList);
        // ")" directly after a label (an empty constituent): stay on lastNode
        else return getParseTree(parseTokens, currIndex+1, lastNode, true, leafNodeList);
    }
    else // leaf node (a word)
    {
        Node newNode = lastNode.addChild(parseTokens[currIndex]);
        leafNodeList.add(newNode);
        // Skip the ")" that closes the word's tag (currIndex+2) and continue from the tag's parent
        return getParseTree(parseTokens, currIndex+2, lastNode.getParent(), true, leafNodeList);
    }
}

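Node test(String parseDesc)
{
    // Pad the brackets with spaces so they split into separate tokens
    parseDesc = parseDesc.replace("(", " ( ");
    parseDesc = parseDesc.replace(")", " ) ");
    String[] parseDescTokens = parseDesc.trim().split("\\s+");
    ArrayList<Node> leafNodes = new ArrayList<Node>(); // filled with the word nodes, left to right
    Node treeReqd = getParseTree(parseDescTokens, leafNodes); // the required tree
    return treeReqd;
}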
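
The Node class itself is not shown above. The following is a minimal sketch of what it might look like, assuming only the two methods the answer's code calls: addChild(String), returning the new child, and getParent(). Everything else here (the class name ParseTreeSketch, the data and children fields, isLeaf, toString, and the parse method) is hypothetical and added for illustration. To keep the block self-contained, parse is an iterative variant of the same idea rather than the recursive code above, and unlike that code it keeps the outermost label (ROOT) as a child of TOP instead of replacing it.

```java
import java.util.ArrayList;
import java.util.List;

class ParseTreeSketch {

    /** Hypothetical tree node; only addChild and getParent come from the answer. */
    static class Node {
        final String data;                          // constituent label or word
        private final Node parent;                  // null for the top node
        final List<Node> children = new ArrayList<>();

        Node(String data) { this(data, null); }

        private Node(String data, Node parent) {
            this.data = data;
            this.parent = parent;
        }

        /** Creates a child holding the given label or word and returns it. */
        Node addChild(String childData) {
            Node child = new Node(childData, this);
            children.add(child);
            return child;
        }

        Node getParent() { return parent; }

        boolean isLeaf() { return children.isEmpty(); }

        /** Renders the subtree back into bracketed form, e.g. (NP (PRP I)). */
        @Override
        public String toString() {
            if (isLeaf()) return data;
            StringBuilder sb = new StringBuilder("(").append(data);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** Iterative variant: "(" opens a node, ")" closes the current one, anything else is a word. */
    static Node parse(String parseDesc, List<Node> leaves) {
        String[] tokens = parseDesc.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        Node top = new Node("TOP");
        Node current = top;
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].isEmpty()) continue;      // blank input yields one empty token
            if ("(".equals(tokens[i])) {
                if (i + 1 >= tokens.length) throw new IllegalArgumentException("label expected after '('");
                current = current.addChild(tokens[++i]);   // the token after "(" is the label
            } else if (")".equals(tokens[i])) {
                if (current == top) throw new IllegalArgumentException("unbalanced ')'");
                current = current.getParent();             // close the current node
            } else {
                leaves.add(current.addChild(tokens[i]));   // a word under its tag
            }
        }
        if (current != top) throw new IllegalArgumentException("missing ')'");
        return top;
    }

    public static void main(String[] args) {
        List<Node> leaves = new ArrayList<>();
        Node tree = parse("(ROOT (S (NP (PRP I)) (VP (VB go)) (. .)))", leaves);
        System.out.println(tree);                   // bracketed form, wrapped in TOP
        System.out.println(leaves.size() + " leaves");
    }
}
```

Because every ")" simply moves to the parent, this variant needs no closeBrace flag, and it rejects unbalanced brackets instead of running into a null parent. It does not check for a bracket appearing where a label is expected.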