How to make Regex capture only named groups

Time: 2015-04-01 18:42:45

Tags: c# regex

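According to the Regex documentation, RegexOptions.ExplicitCapture makes the regular expression capture only explicitly named groups of the form (?<groupName>...). In practice, though, it seems to do something slightly different.

Consider the following lines of code: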

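using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program {
    static void Main(string[] args) {
        Regex r = new Regex(
            @"(?<code>^(?<l1>[\d]{2})/(?<l2>[\d]{3})/(?<l3>[\d]{2})$|^(?<l1>[\d]{2})/(?<l2>[\d]{3})$|(?<l1>^[\d]{2}$))"
            , RegexOptions.ExplicitCapture
        );
        var x = r.Match("32/123/03");
        // Print every group name the regex reports, with its captured value (empty if the group did not match).
        r.GetGroupNames().ToList().ForEach(gn => {
            Console.WriteLine("GroupName:{0,5} --> Value: {1}", gn, x.Groups[gn].Success ? x.Groups[gn].Value : "");
        });
    }
}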

When you run this snippet, the output includes a group named 0, even though my regular expression has no group named 0!

GroupName:    0 --> Value: 32/123/03  
GroupName: code --> Value: 32/123/03  
GroupName:   l1 --> Value: 32  
GroupName:   l2 --> Value: 123  
GroupName:   l3 --> Value: 03  
Press any key to continue . . .  

Can someone explain this behavior to me?

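2 answers:

Answer 0: (score: 1)

There is always a group 0: it is the entire match. Numbered groups start at 1 and are numbered by the ordinal position of the left parenthesis that defines the group. Your regular expression (formatted for clarity):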

(?<code>
  ^
  (?<l1> [\d]{2} )
  /
  (?<l2> [\d]{3} )
  /
  (?<l3> [\d]{2} )
  $
|
  ^
  (?<l1>[\d]{2})
  /
  (?<l2>[\d]{3})
  $
|
   (?<l1> ^[\d]{2} $ )
)
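
If the goal is simply to list the groups you named yourself, one option is to skip any group whose name parses as a number. The following is a minimal sketch, not part of the original answer: the class name NamedGroupDemo, the helper NamedGroupsOnly, and the shortened pattern are made up for illustration, and the helper assumes you never name a group with digits only (such as (?<1>...)), since that group would be skipped as well.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical helper: returns the group names that are not purely numeric,
    // which drops "0" (the whole match) and any other numbered group.
    public static List<string> NamedGroupsOnly(Regex regex)
    {
        var names = new List<string>();
        foreach (string name in regex.GetGroupNames())
        {
            // Numbered groups are reported by their number as a string.
            if (!int.TryParse(name, out _))
            {
                names.Add(name);
            }
        }
        return names;
    }

    public static void Main()
    {
        // Shortened illustrative pattern; the unnamed parentheses are not
        // captured because of RegexOptions.ExplicitCapture.
        var regex = new Regex(@"^(?<l1>\d{2})(/(?<l2>\d{3}))?$", RegexOptions.ExplicitCapture);
        Match match = regex.Match("32/123");

        // Group 0 still exists and holds the entire match.
        Console.WriteLine("Group 0: {0}", match.Groups[0].Value);

        foreach (string name in NamedGroupsOnly(regex))
        {
            Console.WriteLine("{0}: {1}", name, match.Groups[name].Value);
        }
    }
}
```

Group 0 is still available through match.Groups[0]; the helper only leaves it out of the listing.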

Your expression will backtrack, so you might consider simplifying it. Something like this is probably clearer and more efficient:

static Regex rxCode = new Regex(@"
  ^                    # match start-of-line, followed by
  (?<code>             # a mandatory group ('code'), consisting of
    (?<g1> \d\d )      # - 2 decimal digits ('g1'), followed by
    (                  # - an optional group, consisting of
      /                #   - a literal '/', followed by
      (?<g2> \d\d\d )  #   - 3 decimal digits ('g2'), followed by
      (                #   - an optional group, consisting of
        /              #     - a literal '/', followed by
        (?<g3> \d\d )  #     - 2 decimal digits ('g3')
      )?               #     - END: optional group
    )?                 #   - END: optional group
  )                    # - END: named group ('code'), followed by
  $                    # - end-of-line
" , RegexOptions.IgnorePatternWhitespace|RegexOptions.ExplicitCapture );

With that in place, code like this:

string[] texts = { "12" , "12/345" , "12/345/67" , } ;

foreach ( string text in texts )
{
  Match m = rxCode.Match( text ) ;
  Console.WriteLine("{0}: match was {1}" , text , m.Success ? "successful" : "NOT successful" ) ;
  if ( m.Success )
  {
    Console.WriteLine( "  code: {0}" , m.Groups["code"].Value ) ;
    Console.WriteLine( "  g1: {0}" , m.Groups["g1"].Value ) ;
    Console.WriteLine( "  g2: {0}" , m.Groups["g2"].Value ) ;
    Console.WriteLine( "  g3: {0}" , m.Groups["g3"].Value ) ;
  }
}

produces the expected output:

12: match was successful
  code: 12
  g1: 12
  g2:
  g3:
12/345: match was successful
  code: 12/345
  g1: 12
  g2: 345
  g3:
12/345/67: match was successful
  code: 12/345/67
  g1: 12
  g2: 345
  g3: 67
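
In the output above, an optional group that did not take part in the match prints as an empty string. If you need to tell "did not participate" apart from "matched something", Group.Success can be checked, as the question's code already does. A minimal sketch under that assumption follows; the class name OptionalGroupDemo and the helper ValueOrNull are hypothetical, and the pattern is the one above written on a single line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalGroupDemo
{
    // Same structure as the commented pattern above, without the whitespace.
    static readonly Regex CodePattern = new Regex(
        @"^(?<code>(?<g1>\d\d)(/(?<g2>\d\d\d)(/(?<g3>\d\d))?)?)$",
        RegexOptions.ExplicitCapture);

    // Hypothetical helper: returns the group's value, or null when the input
    // does not match or the group did not participate in the match.
    public static string ValueOrNull(string input, string groupName)
    {
        Match match = CodePattern.Match(input);
        if (!match.Success)
        {
            return null;
        }
        Group group = match.Groups[groupName];
        return group.Success ? group.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ValueOrNull("12/345", "g2") ?? "(not matched)");
        Console.WriteLine(ValueOrNull("12/345", "g3") ?? "(not matched)");
    }
}
```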

Answer 1: (score: 0)

Using only the named groups:

^(?<l1>[\d]{2})/(?<l2>[\d]{3})/(?<l3>[\d]{2})$|^(?<l1>[\d]{2})/(?<l2>[\d]{3})$|(?<l1>^[\d]{2}$)


Try this (I removed the first group from the regex) - see demo