Extracting a substring from a string in Perl

Time: 2019-02-01 11:36:00

Tags: regex perl

I have a string that looks like this:

downCircuit received;TOKENS START;{"action":'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END

I want to extract the value of link_index from it, so in this case the output should be 101. Can someone help me extract 101 from my string?

3 answers:

Answer 0: (score: 2)

  

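"I have a string that looks like this"

What you have is JSON with some extra text before and after it. So rather than pulling the value out with a regex, it is better to extract the actual JSON and then process it with a JSON parser. Something like this: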

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

use JSON;

my $input = 'downCircuit received;TOKENS START;{"action":"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END';

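# Strip everything up to and including the "TOKENS START;" marker
$input =~ s/.*START;//;
# Strip the trailing ";TOKENS END" marker, leaving only the JSON
$input =~ s/;TOKENS END//;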

my $data = JSON->new->decode($input);

say $data->{link_index};

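As expected, this produces the output 101.

Note: I think there is a typo in your question. At the very least, the JSON contains a syntax error. I removed the unmatched quote character before "UPDATE".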
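If the input can be malformed, as the string in the question is, the decode step will die. Below is a minimal sketch of the same approach that captures the payload between the two markers in one match and traps decode errors. The helper name field_from_tokens is hypothetical, and JSON::PP (a core module since Perl 5.14) is used here in place of JSON only so the example has no non-core dependency.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;    # core module; JSON would work the same way here

# Hypothetical helper: returns the value of $key from the JSON payload
# found between "TOKENS START;" and ";TOKENS END", or undef on failure.
sub field_from_tokens {
    my ($line, $key) = @_;

    # Capture everything between the two markers.
    my ($json) = $line =~ /TOKENS START;(.*);TOKENS END/s
        or return;

    # decode() dies on malformed JSON, so trap the exception with eval.
    my $data = eval { JSON::PP->new->decode($json) };
    return unless ref $data eq 'HASH';

    return $data->{$key};
}

my $good = 'downCircuit received;TOKENS START;{"action":"UPDATE","link_index":"101"};TOKENS END';
my $bad  = q(downCircuit received;TOKENS START;{"action":'"UPDATE","link_index":"101"};TOKENS END);

print field_from_tokens($good, 'link_index') // 'undef', "\n";   # 101
print field_from_tokens($bad,  'link_index') // 'undef', "\n";   # undef (stray quote breaks the JSON)
```

Returning undef keeps the caller in control of how to react to bad input, for example by logging the line or falling back to a regex.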

Answer 1: (score: 0)

You can use a simple regex like this:

"link_index":"(\d+)"

Then grab the content from the capture group.


my $str = 'downCircuit received;TOKENS START;{"action":\'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END';
# /p makes ${^MATCH} available; /m and /g are not needed for a single match
my $regex = qr/"link_index":"(\d+)"/p;

if ( $str =~ $regex ) {
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
  print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
  # print "Capture Group 2 is $2 ... and so on\n";
}
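The demo above only mentions the offset arrays, so here is a small sketch that actually reads them. @- and @+ are Perl's built-in arrays of match start and end offsets; the helper name group1_span and the shortened sample string are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (text, start, end) for capture group 1,
# or an empty list when the pattern does not match.
sub group1_span {
    my ($str, $regex) = @_;
    return unless $str =~ $regex;
    # $-[1] is the start offset of group 1, $+[1] the offset just past its end.
    return ($1, $-[1], $+[1]);
}

my $str = 'TOKENS START;{"device_id":"CP0027829","link_index":"101"};TOKENS END';
my ($text, $start, $end) = group1_span($str, qr/"link_index":"(\d+)"/);

print "Group 1 is $text (offsets $start to $end)\n";
# The offsets identify the same text inside the original string.
print "Same text via substr: ", substr($str, $start, $end - $start), "\n";
```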

Answer 2: (score: 0)

You can use a capture group and read its contents from $1 (sometimes loosely called a backreference):

print $1,"\n" if /"link_index":"(\d+)"/

In full:

$string=q(downCircuit received;TOKENS START;{"action":'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END);
print $1,"\n" if $string =~ /"link_index":"(\d+)"/;
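The same idea can be generalized to any key. The following is a sketch only: the helper name quoted_value and the shortened sample string are assumptions, and the pattern assumes values are double-quoted and contain no embedded double quotes. A JSON parser is the more robust choice whenever the payload is valid JSON.

```perl
use strict;
use warnings;

# Hypothetical helper: value of a quoted "key":"value" pair, or undef if absent.
sub quoted_value {
    my ($string, $key) = @_;
    # \Q...\E escapes any regex metacharacters in the key;
    # [^"]* accepts any value that contains no double quote.
    # A match in list context returns the captures, so no $1 is needed.
    my ($value) = $string =~ /"\Q$key\E":"([^"]*)"/;
    return $value;
}

my $string = q(TOKENS START;{"link_index":"101","name":"uplink101","status":"DOWN"};TOKENS END);

print quoted_value($string, 'link_index'), "\n";   # 101
print quoted_value($string, 'name'), "\n";         # uplink101
```

Because this never parses the JSON, it also works on the string from the question, where the stray quote before "UPDATE" would make a parser fail.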