C# text file and regular expressions

Time: 2011-03-15 12:45:04

Tags: c# regex file text

I seem to be having a problem with the following file:

User Type  0:        Database Administrator
Users of this Type:
                     Database Administrator          DBA         Can Authorise:Y     Administrator:Y
                     DM3 Admin Account               DM3         Can Authorise:Y     Administrator:Y
Permissions for these users:
Data - Currencies                                  Parameters - Database                                Add FRA Deal                                     Reports - Confirmation Production
  Add Currency                                       Amend Database Parameters                          Cancel FRA Deal                                  Reports - System Printer Definitions
  Delete Currency                                  Parameters - Data Retention                          Amend FRA Deal                                     Save System Printers
  Amend Currency                                     Amend Data Retention Parameters                    Amend Settlements Only                           Custom Confs/Tickets
  Amend Currency Rates                             Data - Rate References                               Verify FRA Deal                                    Add Custom Confs/Tickets
  Amend Currency Holidays                            Add Rate Reference                                 Add FRA Deal (Restricted)                          Delete Custom Confs/Tickets
  Add Nostro                                         Delete Rate Reference                              Release FRA Deal                                   Amend Custom Confs/Tickets
  Amend Nostro                                       Amend Rate Reference                             Deal - IRS                                         Reports - System Report Batches
  Delete Nostro                                    Deal - Call Accounts                                 Add IRS Deal                                       Save System Batches
Data - Currency Pairs                                Open Call Account                                  Cancel IRS Deal                                  Reports - View Reports Spooled
  Add Currency Pair                                  Amend Call Account                                 Amend IRS Deal                                   View - Audits
  Delete Currency Pair                               Close Call Account                                 Amend Settlements Only                             Print Audit
  Amend Currency Pair                                Amend Settlements Only                             Verify IRS Deal                                    Print Audit Detail
Data - Books                                       Data - Sales Relationship Mgrs                       Add IRS Deal (Restricted)                          Filter Audit

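I am using regular expressions to check each line for a pattern. There are three patterns to match in total. If you look at the first three lines, that is all the information that needs to be taken from the file. The problem I have is that my regular expressions do not match. I also need to get the information that sits between two of those lines. How would I go about doing that?

This is the code I have so far: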

        string path = @"C:/User Permissions.txt";
        string t = File.ReadAllText(path);

        //Uses regular expression check to match the specified string pattern
        string pattern1 = @"User Type ";
        string pattern2 = @"Users of this Type:";
        string pattern3 = @"Permissions for these users:";
        Regex rgx1 = new Regex(pattern1);
        Regex rgx2 = new Regex(pattern2);
        Regex rgx3 = new Regex(pattern3);

        MatchCollection matches = rgx1.Matches(t);
        List<string[]> test = new List<string[]>();

        foreach (var match in matches)
        {
            string[] newString = match.ToString().Split(new string[] { @"User Type ", }, StringSplitOptions.RemoveEmptyEntries);

            for (int i = 3; i <= newString.Length; i++)
            {
                test.Add(new string[] { newString[0], newString[1], newString[i - 1] });
            }

        }

        MatchCollection matches2 = rgx2.Matches(t);
        List<string[]> test2 = new List<string[]>();

        foreach (var match2 in matches2)
        {
            string[] newString = match2.ToString().Split(new string[] { @"Permissions for these users: ", }, StringSplitOptions.RemoveEmptyEntries);

            for (int i = 3; i <= newString.Length; i++)
            {
                test2.Add(new string[] { newString[0], newString[1], newString[i - 1] });
            }

        }

        MatchCollection matches3 = rgx3.Matches(t);
        List<string[]> test3 = new List<string[]>();

        foreach (var match3 in matches3)
        {
            string[] newString = match3.ToString().Split(new string[] { @"Users of this Type: ", }, StringSplitOptions.RemoveEmptyEntries);

            for (int i = 3; i <= newString.Length; i++)
            {
                test3.Add(new string[] { newString[0], newString[1], newString[i - 1] });
            }

        }
        foreach (var line in test)
        {
            Console.WriteLine(line[0]);
            Console.ReadLine();
        }
        Console.ReadLine();

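Guffa's code seems to work far better than mine. The only problem I have now is how to extract the lines between "Users of this Type:" and "Permissions for these users:". How would I do this? Checking whether the name starts on a new line obviously won't help either.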

2 answers:

Answer 0: (score: 0)

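You will not succeed in using a regular expression to extract the data you want from this text dump (and it would be hard with any other technique without investing considerable effort).

The biggest obstacle to using a regular expression, as far as I can see, is that the information is actually laid out in columns across the text file.

The problem is best illustrated by the category Data - Sales Relationship Mgrs, which sits at the bottom of one column while all the permissions for that category are in the next column.

Please investigate whether this information can be obtained some other way.

Nevertheless, here is a rough algorithmic strategy for processing the file: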

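  1. Read the file line by line.
  2. Look for the information you are interested in at predefined offsets.
  3. Because the information is stacked in columns, temporarily append each column to its own collection as you parse each line.
  4. Finally, try to extract the privileges from the concatenation of all the temporary columns.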
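The four steps above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the class and method names are made up, and the column start offsets are an assumption passed in by the caller (the real offsets would have to be measured from the actual file). Trailing spaces are trimmed but leading spaces are kept, on the assumption that the two-space indent is what distinguishes a permission from its category heading.

```csharp
using System;
using System.Collections.Generic;

static class ColumnSplitter
{
    // Steps 2 and 3: cut every line at the given start offsets and append each
    // non-blank cell to the list for its column. 'starts' is assumed to be
    // ascending; the values are hypothetical and must match the real file.
    public static List<string>[] SplitColumns(IEnumerable<string> lines, int[] starts)
    {
        var columns = new List<string>[starts.Length];
        for (int c = 0; c < starts.Length; c++)
            columns[c] = new List<string>();

        foreach (string line in lines)
        {
            for (int c = 0; c < starts.Length; c++)
            {
                if (starts[c] >= line.Length) break; // line ends before this column
                int end = (c + 1 < starts.Length)
                    ? Math.Min(starts[c + 1], line.Length)
                    : line.Length;
                // Keep leading spaces so indented permissions stay recognisable.
                string cell = line.Substring(starts[c], end - starts[c]).TrimEnd();
                if (cell.Length > 0) columns[c].Add(cell);
            }
        }
        return columns;
    }

    // Step 4: read the columns one after another as a single list, so a
    // category at the bottom of one column is followed by its permissions.
    public static List<string> Concatenate(List<string>[] columns)
    {
        var all = new List<string>();
        foreach (List<string> column in columns) all.AddRange(column);
        return all;
    }

    static void Main()
    {
        // Two-column sample built with PadRight so the second column starts at 20.
        var sample = new[]
        {
            "Data - Currencies".PadRight(20) + "Deal - IRS",
            "  Add Currency".PadRight(20) + "  Add IRS Deal"
        };
        foreach (string cell in Concatenate(SplitColumns(sample, new[] { 0, 20 })))
            Console.WriteLine(cell);
    }
}
```

After concatenation, an entry that starts without indentation can be treated as a category heading and the indented entries that follow as its permissions. This only works if no cell overflows into the next column.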

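Answer 1: (score: 0)

No, you are not checking each line for the pattern. You are looking for the pattern in the whole file as a single string, and each match contains only the exact text that matched. So when you split each result you are left with nothing useful: for the first pattern the split produces two empty strings, which RemoveEmptyEntries then discards.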
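To make that point concrete, here is a small sketch. The class and method names are hypothetical, and the second pattern is just one possible way to capture the rest of the line, not something taken from the question. It shows that matching a literal yields only the literal, and that capturing the text after the key needs a group plus line anchoring with RegexOptions.Multiline.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchDemo
{
    // Mirrors the question's approach: match a literal, then split the
    // matched text on that same literal.
    public static string[] SplitLiteralMatch(string text)
    {
        Match literal = Regex.Match(text, @"User Type ");
        return literal.Value.Split(new[] { "User Type " },
                                   StringSplitOptions.RemoveEmptyEntries);
    }

    // One alternative: anchor to a line and capture what follows the key.
    // Group 1 is the type number, group 2 the description.
    public static Match MatchUserTypeLine(string text)
    {
        return Regex.Match(text, @"^User Type\s+(\d+):\s*(.+?)\s*$",
                           RegexOptions.Multiline);
    }

    static void Main()
    {
        string text = "User Type  0:        Database Administrator\nUsers of this Type:\n";

        // The split leaves no elements, because the match was only the literal.
        Console.WriteLine(SplitLiteralMatch(text).Length);

        Match m = MatchUserTypeLine(text);
        Console.WriteLine(m.Groups[1].Value + " / " + m.Groups[2].Value);
    }
}
```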

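If I understand correctly, each line contains a key and a value, so using a regular expression doesn't really make sense. Just loop through the lines and compare strings.

Here is a start: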

// Requires: using System; using System.IO;
string path = @"C:/User Permissions.txt";
string[] lines = File.ReadAllLines(path);
foreach (string line in lines) {
  if (line.StartsWith("User Type ")) {
    // "User Type " is 10 characters long; the rest of the line is the value
    Console.WriteLine("User type:" + line.Substring(10));
  } else if (line.StartsWith("Users of this Type:")) {
    Console.WriteLine("Users:" + line.Substring(19));
  } else if (line.StartsWith("Permissions for these users:")) {
    Console.WriteLine("Permissions:" + line.Substring(28));
  }
}

Edit:

Here is how to use a regular loop instead of foreach, so that you can use an inner loop that reads lines:

// Requires: using System; using System.IO;
string path = @"C:/User Permissions.txt";
string[] lines = File.ReadAllLines(path);
int line = 0;
while (line < lines.Length) {
  if (lines[line].StartsWith("User Type ")) {
    Console.WriteLine("User type:" + lines[line].Substring(10));
  } else if (lines[line].StartsWith("Users of this Type:")) {
    line++;
    // Inner loop: every line up to the permissions header is a user
    while (line < lines.Length && !lines[line].StartsWith("Permissions for these users:")) {
      Console.WriteLine("User: " + lines[line]);
      line++;
    }
    // Do not advance again, or the permissions header just found would be skipped
    continue;
  } else if (lines[line].StartsWith("Permissions for these users:")) {
    Console.WriteLine("Permissions:" + lines[line].Substring(28));
  }
  line++;
}
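If the user lines need to be broken into fields rather than just printed, the same between-two-headers idea can be combined with a per-line regular expression. The following is a sketch under assumptions, not part of the original answer. The UserRow class, the UserBlockParser name and the row pattern are all hypothetical; the pattern assumes the name and the short code are separated by at least two spaces, and that the flags are Y or N (only Y appears in the sample).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container for one line of the "Users of this Type:" block.
class UserRow
{
    public string Name;
    public string Code;
    public bool CanAuthorise;
    public bool Administrator;
}

static class UserBlockParser
{
    // Assumed layout: name, 2+ spaces, code, then the two Y/N flags.
    static readonly Regex RowPattern = new Regex(
        @"^\s*(?<name>.+?)\s{2,}(?<code>\S+)\s+Can Authorise:(?<auth>[YN])\s+Administrator:(?<admin>[YN])\s*$");

    public static List<UserRow> ParseUsers(string[] lines)
    {
        var users = new List<UserRow>();
        bool inUserBlock = false;
        foreach (string line in lines)
        {
            // The two header lines switch the block on and off.
            if (line.StartsWith("Users of this Type:")) { inUserBlock = true; continue; }
            if (line.StartsWith("Permissions for these users:")) { inUserBlock = false; continue; }
            if (!inUserBlock) continue;

            Match m = RowPattern.Match(line);
            if (!m.Success) continue; // skip rows that do not fit the assumed layout

            users.Add(new UserRow
            {
                Name = m.Groups["name"].Value,
                Code = m.Groups["code"].Value,
                CanAuthorise = m.Groups["auth"].Value == "Y",
                Administrator = m.Groups["admin"].Value == "Y"
            });
        }
        return users;
    }

    static void Main()
    {
        var lines = new[]
        {
            "User Type  0:        Database Administrator",
            "Users of this Type:",
            "                     Database Administrator          DBA         Can Authorise:Y     Administrator:Y",
            "                     DM3 Admin Account               DM3         Can Authorise:Y     Administrator:Y",
            "Permissions for these users:"
        };
        foreach (UserRow u in ParseUsers(lines))
            Console.WriteLine(u.Name + " (" + u.Code + ")");
    }
}
```

Rows that do not match the assumed layout are silently skipped here; in real use it would probably be safer to log or collect them so that a layout change does not go unnoticed.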