Regex match does not include spaces

Time: 2011-12-15 19:16:13

Tags: c# regex regex-group

I have this regular expression:

(?'box_id'\d{1,19})","box_name":"(?'box_name'[\w\d\.\s]{1,19})

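This works well unless the box name contains a space. For example, when run against "my box" it returns "mybox", without the space.

How can I allow spaces in the box_name group?

Code: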

Regex reg = new Regex(@"""object_id"":""(?<object_id>\d{1,19})"",""file_name"":""(?<file_name>[\w.]+(?:\s[\w.]+)*)""");
MatchCollection matches = reg.Matches(result);
if ( matches == null) throw new Exception("There was an error while parsing data."); 
if ( matches.Count > 0 )
{
  FileArchive.FilesDataTable filesdataTable = new FileArchive.FilesDataTable();
  foreach ( Match match in matches )
  {
    FileArchive.FilesRow row = filesdataTable.NewFilesRow();
    row.ID = match.Groups["object_id"].Value;
    row.Name = match.Groups["file_name"].Value;
  }
}

Input:

  

{"objects":[{"object_id":"135248","file_name":"some space here.jpg","video_status":"0","thumbnail_status":"1"},{"object_id":"135257","file_name":"jup 13.jpg","video_status":"0","thumbnail_status":"1"},{"object_id":"135260","file_name":"my pic.jpg","video_status":"0","thumbnail_status":"1"},{"object_id":"135262","file_name":"EveningWav)es,Hon(olulu,Hawaii.jpg","video_status":"0","thumbnail_status":"1"},{"object_id":"135280","file_name":"test with spaces.jpg","video_status":"0","thumbnail_status":"1"}],"status":"ok"}

2 answers:

Answer 0: (score: 1)

It looks to me like your data is always double-quote delimited, isn't it? That fact should be the basis of your regex:

(?<box_id>\d{1,19})","file_name":"(?<box_name>[^"]{1,19})  //1 to 19 non " chars.

As for the missing spaces: the token (?'box_name'[\w\d\.\s]{1,19}) cannot possibly match "mybox" on a string containing "my box", so the problem must be downstream of the regex.
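To check that claim in isolation, here is a minimal sketch. The input string is a hypothetical sample built to fit the question's first pattern (the literal "box_name" key and the id 42 are assumptions, not data from the question). It runs both the original character class and the quote-delimited variant and prints what each captures.

```csharp
using System;
using System.Text.RegularExpressions;

public class BoxNameCheck
{
    public static void Main()
    {
        // Hypothetical sample in the shape the question's first pattern expects.
        string input = @"""box_id"":""42"",""box_name"":""my box""";

        // The pattern from the question, written with <> group delimiters.
        Regex original = new Regex(
            @"(?<box_id>\d{1,19})"",""box_name"":""(?<box_name>[\w\d\.\s]{1,19})");

        // The quote-delimited variant: 1 to 19 characters that are not a double quote.
        Regex quoted = new Regex(
            @"(?<box_id>\d{1,19})"",""box_name"":""(?<box_name>[^""]{1,19})");

        // Brackets make any leading, trailing, or missing space visible.
        Console.WriteLine("[{0}]", original.Match(input).Groups["box_name"].Value);
        Console.WriteLine("[{0}]", quoted.Match(input).Groups["box_name"].Value);
    }
}
```

Both lines should print [my box], since \s is part of the original character class. If the space is missing in your application, it is being lost somewhere after the match.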

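Typos and style: your text says 'box_name' but the token in the pattern is 'file_name'. Also, why in the world would you switch to single quotes as the named-group delimiter? Angle brackets <> are the default and are more readable (since there are already quotes all over the regex!).

Answer 1: (score: 0)

In addition to what @sweaver2112 said, I think you need to extend the framing by adding the surrounding quotes, and get rid of the {1,19} range.

These regexes work in Perl; I have not tested them in C#.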

"(?<box_id>\d+)","(?:${type})":"(?<box_name>[\w.]+(?:\s[\w.]+)*)"
or,
"\s*(?<box_id>\d+)\s*","\s*(?:${type})\s*":"\s*(?<box_name>[\w.]+(?:\s[\w.]+)*)\s*"
where $type = 'file_name';
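C# has no Perl-style ${type} interpolation inside a regex, so one way to carry the first pattern over is to build the pattern string yourself. This is only a sketch under that assumption: the variable name type and the one-record input are illustrative, and Regex.Escape is used so that the key is treated as literal text.

```csharp
using System;
using System.Text.RegularExpressions;

public class TypedPattern
{
    public static void Main()
    {
        // Plays the role of Perl's $type; the name is illustrative.
        string type = "file_name";

        // Same shape as the first Perl pattern, with the key spliced in as a literal.
        string pattern =
            "\"(?<box_id>\\d+)\",\"(?:" + Regex.Escape(type) + ")\":\"" +
            "(?<box_name>[\\w.]+(?:\\s[\\w.]+)*)\"";

        // One record taken from the question's input, trimmed to the two relevant fields.
        string input = "{\"object_id\":\"135257\",\"file_name\":\"jup 13.jpg\"}";

        Match m = Regex.Match(input, pattern);
        Console.WriteLine("{0} -> {1}", m.Groups["box_id"].Value, m.Groups["box_name"].Value);
    }
}
```

On this input the sketch should print 135257 -> jup 13.jpg, with the space intact.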

But really, this should work as well (with the type substituted in). Its validation is loose:
"(?<box_id>\d+)","file_name":"(?<box_name>[^"]*)"
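The practical difference between the strict and the loose pattern shows up on file names that contain punctuation. The sketch below uses the one record from the question's input whose name contains parentheses and commas; the class and variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public class LoosePattern
{
    public static void Main()
    {
        // The record from the question's input whose file name contains ')', ',' and '('.
        string input =
            @"{""object_id"":""135262"",""file_name"":""EveningWav)es,Hon(olulu,Hawaii.jpg""}";

        // Strict: word characters and dots, separated by single whitespace characters.
        Regex strict = new Regex(
            @"""(?<box_id>\d+)"",""file_name"":""(?<box_name>[\w.]+(?:\s[\w.]+)*)""");

        // Loose: anything up to the closing double quote.
        Regex loose = new Regex(
            @"""(?<box_id>\d+)"",""file_name"":""(?<box_name>[^""]*)""");

        Console.WriteLine(strict.IsMatch(input)); // the strict class cannot get past ')'
        Console.WriteLine(loose.Match(input).Groups["box_name"].Value);
    }
}
```

The strict pattern should report False here, while the loose one captures the whole name. That is the trade-off: [^"]* accepts any name, but it does not validate anything about its contents.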

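Edit

"Not sure, what did my regex give you? - sl, yesterday
It returned the right results. On the input in my question I got 'somespacehere.jpg', 'jup13.jpg' and so on for the file_name group. - NET Developer, yesterday"

I took your code and your input and simply printed the groups, and it works perfectly. The spaces are there, so something must be going wrong when you assign the values to your row data.

See it here: http://www.ideone.com/HsTMF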

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = @"{""objects"":[{""object_id"":""135248"",""file_name"":""some space here.jpg"",""video_status"":""0"",""thumbnail_status"":""1""},{""object_id"":""135257"",""file_name"":""jup 13.jpg"",""video_status"":""0"",""thumbnail_status"":""1""},{""object_id"":""135260"",""file_name"":""my pic.jpg"",""video_status"":""0"",""thumbnail_status"":""1""},{""object_id"":""135262"",""file_name"":""EveningWav)es,Hon(olulu,Hawaii.jpg"",""video_status"":""0"",""thumbnail_status"":""1""},{""object_id"":""135280"",""file_name"":""test with spaces.jpg"",""video_status"":""0"",""thumbnail_status"":""1""}],""status"":""ok""}";
      Regex reg = new Regex(
                   @"""object_id"":""(?<object_id>\d{1,19})"",""file_name"":""(?<file_name>[\w.]+(?:\s[\w.]+)*)"""
      );
      foreach ( Match match in reg.Matches(input) )
         Console.WriteLine(
                 "Id = '{0}',  File name = '{1}'", 
                 match.Groups["object_id"].Value,
                 match.Groups["file_name"].Value  );
   }
}

Output:

Id = '135248',  File name = 'some space here.jpg'
Id = '135257',  File name = 'jup 13.jpg'
Id = '135260',  File name = 'my pic.jpg'
Id = '135280',  File name = 'test with spaces.jpg'