Lexical analysis of alphabetic tokens in Java

Date: 2014-12-06 13:05:47

Tags: java

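This code only analyzes numbers and operators. I need it to analyze letters as well and classify them as a separate type, but my changes have no effect. What exactly should I change? I got this code from a tutorial site and modified it, but without result.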

package lexical;

public class Tokenizer {

int pos;
char[] expression;

Tokenizer(String expression) {
    this.expression = expression.toCharArray();
    this.pos = 0;
}

enum Type { OPERATOR, LITTER, UNKNOWN }

class Lexeme {
    String type, token;
    Lexeme(String type, String token) {
        this.type = type;
        this.token = token;
    }
}

Lexeme getNextToken() {
    StringBuilder token = new StringBuilder();
    boolean endOfToken = false;
    Type type = Type.UNKNOWN;
    while (!endOfToken && hasMoreTokens()) {
        while (hasMoreTokens() && expression[pos] == ' ') // check bounds before indexing
            pos++;
        switch (expression[pos]) {
            case '+':
            case '-':
            case '*':
            case '/':
                if(type != Type.LITTER) {
                    type = Type.OPERATOR;
                    token.append(expression[pos]);
                    pos++;
                }
                endOfToken = true;
                break;
            case ' ':
                endOfToken = true;
                pos++;
                break;
            default:
                if(Character.isDigit(expression[pos]) || expression[pos] == '.') {
                    token.append(expression[pos]);
                    type = Type.LITTER;
                } else {
                    System.out.println("Systax error at position: " + pos);
                }
                pos++;
                break;
        }
    }
    return new Lexeme(type.name().toLowerCase(), token.toString());
}

boolean hasMoreTokens() {
    return pos < expression.length;
}

public static void main(String[] args) {
    String expression = "54+18+5";
    Tokenizer tokenizer = new Tokenizer(expression);
    while (tokenizer.hasMoreTokens()) {
        Lexeme nextToken = tokenizer.getNextToken();
        System.out.print("Type: " + nextToken.type + "\tLexeme: " + nextToken.token + "\n");
    }
}


}

2 answers:

Answer 0: (score: 0)

This is where the code looks for numbers and reports an error for any character that is not a digit or a period.

if(Character.isDigit(expression[pos]) || expression[pos] == '.') {
     token.append(expression[pos]);
     type = Type.LITTER;
} else {
     System.out.println("Systax error at position: " + pos);
}

A string can hold anything, so you only need to remove the if statement and do

default:
    token.append(expression[pos]);

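However, if you want to restrict which characters the string may contain, modify the if condition to check for the characters you want. Instead of: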

if(Character.isDigit(expression[pos]) || expression[pos] == '.') {

change it to accept only the characters you consider valid in a string. For example:

if (Character.isAlphabetic(expression[pos]) || expression[pos] == '-') {

You may also want a new type identifier instead of

type = Type.LITTER;
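
To make the idea of a separate type concrete, here is a minimal sketch of how a single character could be classified once a new enum constant exists. The class name `CharClassSketch`, the `ALPHA` constant, and the `classify` helper are assumptions made for illustration; they are not part of the question's Tokenizer.

```java
// Hypothetical sketch: ALPHA and classify() are illustrative names,
// not part of the original Tokenizer class.
class CharClassSketch {

    enum Type { OPERATOR, LITTER, ALPHA, UNKNOWN }

    // Decide which token type a single character belongs to.
    static Type classify(char c) {
        switch (c) {
            case '+':
            case '-':
            case '*':
            case '/':
                return Type.OPERATOR;
            default:
                if (Character.isDigit(c) || c == '.') {
                    return Type.LITTER;   // numeric literal characters
                }
                if (Character.isAlphabetic(c)) {
                    return Type.ALPHA;    // letters get their own type
                }
                return Type.UNKNOWN;      // anything else is unrecognized
        }
    }

    public static void main(String[] args) {
        for (char c : "5+x?".toCharArray()) {
            System.out.println(c + " -> " + classify(c));
        }
    }
}
```

Note that '-' is matched by the operator cases before the default branch is reached, so a check for '-' inside the default branch would never fire in a switch laid out like this.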

Answer 1: (score: 0)

I'm not sure how you want to handle "A12" or anything else that is part alphabetic and part numeric. This code will return "A" for "A12".

package lexical;

public class Tokenizer {

    int pos;
    char[] expression;

    Tokenizer(String expression) {
        this.expression = expression.toCharArray();
        this.pos = 0;
    }

    enum Type {

        OPERATOR, ALPHA, UNKNOWN, LITTER, ERROR
    }

    class Lexeme {

        String type, token;

        Lexeme(String type, String token) {
            this.type = type;
            this.token = token;
        }
    }

     Lexeme getNextToken() {
        StringBuilder token = new StringBuilder();
        boolean endOfToken = false;
        Type type = Type.UNKNOWN;
        // skip whitespace
        while (hasMoreTokens() && expression[pos] == ' ') {
            pos++;
        } 

        while (!endOfToken && hasMoreTokens()) {

            switch (expression[pos]) {
                case ' ':
                {
                    endOfToken = true;
                    pos++;
                    break;
                }
                default:
                    if (Character.isAlphabetic(expression[pos])) {                     
                        token.append(expression[pos]);
                        type = Type.ALPHA;                      
                    } else {
                        System.out.println("Systax error at position: " + pos);
                    }
                    pos++;
                    break;
            }
        }
        return new Lexeme(type.name().toLowerCase(), token.toString());
    }

    boolean hasMoreTokens() {
        return pos < expression.length;
    }

    public static void main(String[] args) {
        String expression = "Hello World";
        Tokenizer tokenizer = new Tokenizer(expression);
        while (tokenizer.hasMoreTokens()) {
            Lexeme nextToken = tokenizer.getNextToken();
            System.out.print("Type: " + nextToken.type + "\tLexeme: " + nextToken.token + "\n");
        }        
        expression = "123 ABC A12";
        tokenizer = new Tokenizer(expression);
        while (tokenizer.hasMoreTokens()) {
            Lexeme nextToken = tokenizer.getNextToken();
            System.out.print("Type: " + nextToken.type + "\tLexeme: " + nextToken.token + "\n");
        }        
    }
}

Output:

Type: alpha Lexeme: Hello
Type: alpha Lexeme: World
Systax error at position: 0
Systax error at position: 1
Systax error at position: 2
Type: unknown   Lexeme: 
Type: alpha Lexeme: ABC
Systax error at position: 9
Systax error at position: 10
Type: alpha Lexeme: A
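
If "A12" should instead come back as a single token, one possible approach is to let a token that starts with a letter continue through any following letters or digits. The sketch below is only an illustration under that assumption: the class `IdentifierTokenizerSketch`, the `Token` type, and the `NUMBER` and `IDENTIFIER` constants are hypothetical names, and it throws on an unrecognized character rather than printing an error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not the code from the answer above: groups the
// input into identifier, number, and operator tokens.
class IdentifierTokenizerSketch {

    enum Type { OPERATOR, NUMBER, IDENTIFIER }

    static final class Token {
        final Type type;
        final String text;

        Token(Type type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type.name().toLowerCase(Locale.ROOT) + ":" + text;
        }
    }

    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == ' ') {               // skip whitespace between tokens
                pos++;
                continue;
            }
            int start = pos;
            if (Character.isLetter(c)) {
                // A letter starts an identifier; letters and digits extend it.
                while (pos < input.length()
                        && Character.isLetterOrDigit(input.charAt(pos))) {
                    pos++;
                }
                tokens.add(new Token(Type.IDENTIFIER, input.substring(start, pos)));
            } else if (Character.isDigit(c)) {
                // A digit starts a number; digits and periods extend it.
                while (pos < input.length()
                        && (Character.isDigit(input.charAt(pos))
                            || input.charAt(pos) == '.')) {
                    pos++;
                }
                tokens.add(new Token(Type.NUMBER, input.substring(start, pos)));
            } else if ("+-*/".indexOf(c) >= 0) {
                pos++;
                tokens.add(new Token(Type.OPERATOR, String.valueOf(c)));
            } else {
                throw new IllegalArgumentException("Syntax error at position: " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("123 ABC A12")) {
            System.out.println(t);
        }
    }
}
```

With this rule, "A12" is read as one identifier and "123" as one number. Whether input such as "12A" should be an error or two tokens is a design decision the question leaves open; the sketch splits it into a number followed by an identifier.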