Match the nth string using a regular expression

Time: 2018-04-09 16:24:18

Tags: regex awk sed grep notepad++

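I want to extract the matching text from lines, but only from lines that match the pattern. The pattern needed here is something like ^[a-zA-Z],[a-zA-Z]. That is, I want to extract the text at the start of a line where two words are separated by a comma. Here is some sample text: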

Acanthus,mollis,,Bears Breaches,X,N,,,,Australian
Naturalised and/or Noxious Taxa
Acanthus,mollis, ,Bears Breach,,,,”Dispersal:
Vegetative. Life Form: Perennial herb. RISK: Potential threat to one or
vegetation formations (Victoria). Vegetation Formations Invaded:
1,8”,”Introduced deliberately from: Eur and Commercially
Available, , In Victoria: Rare or localised, small

So the desired output would be:

Acanthus,mollis
Acanthus,mollis

The closest I have got so far is ^.+?(?=,{2}), which for the example above gives:

Acanthus,mollis
Acanthus,mollis, ,Bears Breach
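The lookahead attempt over-matches because it stops at the first pair of consecutive commas rather than describing the shape of the first two fields. Below is a minimal sketch of matching that shape directly. It assumes a grep that supports the -o option (GNU and BSD grep do, but POSIX does not require it), and the file name sample.txt is hypothetical:

```shell
# Recreate the sample text in a hypothetical file named sample.txt.
cat > sample.txt <<'EOF'
Acanthus,mollis,,Bears Breaches,X,N,,,,Australian
Naturalised and/or Noxious Taxa
Acanthus,mollis, ,Bears Breach,,,,”Dispersal:
Vegetative. Life Form: Perennial herb. RISK: Potential threat to one or
vegetation formations (Victoria). Vegetation Formations Invaded:
1,8”,”Introduced deliberately from: Eur and Commercially
Available, , In Victoria: Rare or localised, small
EOF

# -o prints only the matched part of each line; -E enables extended regex (ERE).
# The pattern requires letters, a comma, then letters, anchored at line start.
result=$(grep -oE '^[[:alpha:]]+,[[:alpha:]]+' sample.txt)
printf '%s\n' "$result"
```

Lines such as "Available, , In Victoria" are skipped because the first comma is followed by a space rather than a letter.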

2 answers:

Answer 0: (score: 0)

If your sed supports -E for ERE:

$ sed -En 's/^([[:alpha:]]+,[[:alpha:]]+).*/\1/p' file
Acanthus,mollis
Acanthus,mollis

Or with any sed:

$ sed -n 's/^\([[:alpha:]][[:alpha:]]*,[[:alpha:]][[:alpha:]]*\).*/\1/p' file
Acanthus,mollis
Acanthus,mollis
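Non-matching lines stay out of the output because -n suppresses sed's default printing, and the p flag on the substitution prints a line only when the substitution succeeded. Here is a small self-contained check of the portable form. It uses a shortened inline copy of the sample text, and the variable names are illustrative:

```shell
# Shortened copy of the sample text; the variable name is illustrative.
input='Acanthus,mollis,,Bears Breaches,X,N,,,,Australian
Naturalised and/or Noxious Taxa
Acanthus,mollis, ,Bears Breach,,,,
Available, , In Victoria: Rare or localised, small'

# -n: do not print lines by default.
# s/.../\1/p: replace the whole line with the captured "word,word" prefix
# and print it only if the substitution matched.
portable=$(printf '%s\n' "$input" |
  sed -n 's/^\([[:alpha:]][[:alpha:]]*,[[:alpha:]][[:alpha:]]*\).*/\1/p')
printf '%s\n' "$portable"
```

The trailing .* in the pattern consumes the rest of the line, so only the captured group remains after the replacement.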

Answer 1: (score: 0)

An awk approach may also help you.
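The awk command from this answer is not preserved above. As a rough sketch of what an awk approach could look like (this is an assumption, not the answerer's original command), one option is to split on commas and print the first two fields only when both consist entirely of letters. The variable names and the shortened inline sample are illustrative:

```shell
# Shortened copy of the sample text; the variable name is illustrative.
input='Acanthus,mollis,,Bears Breaches,X,N,,,,Australian
Naturalised and/or Noxious Taxa
Acanthus,mollis, ,Bears Breach,,,,
Available, , In Victoria: Rare or localised, small'

# -F, splits each line into fields on commas.
# Print field 1 and field 2, rejoined with a comma, only when both fields
# are made up entirely of ASCII letters.
awk_result=$(printf '%s\n' "$input" |
  awk -F, '$1 ~ /^[A-Za-z]+$/ && $2 ~ /^[A-Za-z]+$/ { print $1 "," $2 }')
printf '%s\n' "$awk_result"
```

This version is slightly stricter than the sed commands: it requires the second word to be a complete comma-delimited field, so a line beginning "abc,def ghi" would be skipped here but matched by sed.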