Java regex: match the text before the whitespace and the number after it to generate a CSV

Time: 2017-10-02 15:40:51

Tags: java regex

Currently I am using this simple regex:

[^\s]

which I pieced together with the help of these docs.

It picks up the following:

[image: screenshot of the matches]

However, the full data set looks like this:

#### LOGS ####
CONSOLE:
makePush            2196
makePush            638
makePush            470
opAdd           8342
opAdd           288
opStop          133
0x
DEBUG:
#### TRACE ####
PUSH32          pc=00000000 gas=10000000000 cost=3

PUSH32          pc=00000033 gas=9999999997 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000005

PUSH32          pc=00000066 gas=9999999994 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000005
00000001  0000000000000000000000000000000000000000000000000000000000000005

ADD             pc=00000099 gas=9999999991 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000005
00000001  0000000000000000000000000000000000000000000000000000000000000005
00000002  0000000000000000000000000000000000000000000000000000000000000005

ADD             pc=00000100 gas=9999999988 cost=3
Stack:
00000000  000000000000000000000000000000000000000000000000000000000000000a
00000001  0000000000000000000000000000000000000000000000000000000000000005

STOP            pc=00000101 gas=9999999985 cost=0
Stack:
00000000  000000000000000000000000000000000000000000000000000000000000000f

In the end, I need the result to look like this:

makePush, 2196
makePush, 638
makePush, 470
opAdd, 8342
opAdd, 288
opStop, 133

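The regex I have is clearly not enough to capture this.

What I would like to do is:

  • ignore strings in the input that are not of the form makePush 2196
  • for lines of the above form:

    • split them into three groups:

      first word, whitespace, second word

  • finally, save a CSV of the form:

    first word, second word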

2 Answers:

Answer 0: (score: 1)

Try this?

/([a-zA-Z]+)[\t ]+(\d+)/g

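where:

  • ([a-zA-Z]+) matches a single word made of letters
  • [\t ]+ matches horizontal whitespace (tabs and spaces, but not line breaks)
  • (\d+) matches the numeric literal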
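The pattern above is written in /…/g notation, while the question is about Java. A minimal sketch of how it might be applied with java.util.regex follows; the class name HorizontalSpaceRegexDemo and the method toCsvRows are made up for illustration, and the sample log is a shortened copy of the data in the question. Repeated calls to Matcher.find() play the role of the g flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class HorizontalSpaceRegexDemo {
    // The pattern from this answer, written as a Java string literal
    // (each backslash is doubled).
    private static final Pattern ROW = Pattern.compile("([a-zA-Z]+)[\\t ]+(\\d+)");

    // Returns one "word, number" string per match found anywhere in the log text.
    static List<String> toCsvRows(String log) {
        List<String> rows = new ArrayList<>();
        Matcher m = ROW.matcher(log);
        while (m.find()) {
            rows.add(m.group(1) + ", " + m.group(2));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Shortened sample taken from the data set in the question.
        String log = "CONSOLE:\n"
                + "makePush            2196\n"
                + "opAdd           8342\n"
                + "0x\n"
                + "PUSH32          pc=00000000 gas=10000000000 cost=3\n"
                + "00000000  0000000000000000000000000000000000000000000000000000000000000005\n";
        toCsvRows(log).forEach(System.out::println);
    }
}
```

Because [\t ]+ cannot cross a line break and the first group accepts letters only, the TRACE and stack lines in the sample are skipped even though the pattern has no ^ or $ anchors.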

Answer 1: (score: 0)

Try this (based on Pshemo's idea, but using \w+)

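        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        // str is assumed to hold a single line of the log: ^ and $ anchor to the
        // whole input unless Pattern.MULTILINE is passed to Pattern.compile.
        Pattern pattern = Pattern.compile("^(\\w+)\\s+(\\d+)$");
        Matcher matcher = pattern.matcher(str);
        while (matcher.find()) {
            System.out.println(matcher.group(1) + ", " + matcher.group(2));
        }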
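One thing to watch for with \w+: \w also matches digits, so a stack row such as 00000000  00…05 has the same "word, whitespace, digits" shape as makePush 2196. The sketch below is one possible way to go through the whole log line by line and build the CSV text. It rests on an assumption that is not stated in the question, namely that the first column always starts with a letter. The names LineByLineCsvDemo and toCsv are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class LineByLineCsvDemo {
    // Assumption: the first column starts with a letter. This excludes
    // all-digit stack rows, which a plain \w+ would also accept.
    private static final Pattern ROW = Pattern.compile("^([a-zA-Z]\\w*)\\s+(\\d+)$");

    // Builds the CSV text, one "word, number" line per matching input line.
    static String toCsv(String log) {
        StringBuilder csv = new StringBuilder();
        // \R matches any line break sequence (Java 8 and later).
        for (String line : log.split("\\R")) {
            Matcher m = ROW.matcher(line);
            if (m.matches()) {
                csv.append(m.group(1)).append(", ").append(m.group(2)).append('\n');
            }
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        // Shortened sample taken from the data set in the question.
        String log = "CONSOLE:\n"
                + "makePush            2196\n"
                + "opStop          133\n"
                + "Stack:\n"
                + "00000000  0000000000000000000000000000000000000000000000000000000000000005\n";
        System.out.print(toCsv(log));
    }
}
```

The returned string could then be saved with java.nio.file.Files.write if a file on disk is needed.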