Generate all matches for a regular expression

Time: 2014-02-02 23:00:57

Tags: c# regex recursion fare

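For a user selection, I would like to offer a list of numbers that match a given regular expression. The regex itself is very simple: it can only look like 123[0-9][0-9] or [4-9]34.2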

I found that Fare (https://github.com/moodmosaic/Fare) does the job to some extent. See the following example:

string pattern = "123[0-9][0-9]";
var xeger = new Xeger(pattern);
var match = xeger.Generate(); //match is e.g. "12349"

Unfortunately, the Fare library gives me only one possible match, not all 100 possible combinations.

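If you think a regex is not a good idea in this case and would rather suggest a for-loop implementation that replaces the characters, I am fine with that too, but at the moment I do not know how to write it. Perhaps a recursive function would be clever?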

2 answers:

Answer 0: (score: 1)

I would rather write my own implementation than use a library. The following code does what you want to achieve.

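    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    // Matches a single-digit character class such as [0-9] or [4-9].
    private static Regex regexRegex = new Regex("\\[(?<From>\\d)-(?<To>\\d)]", RegexOptions.Compiled);

    private static IEnumerable<string> GetStringsForRegex(string pattern)
    {
        // Start with one empty string and extend it piece by piece.
        var strings = Enumerable.Repeat("", 1);
        var lastIndex = 0;
        foreach (Match m in regexRegex.Matches(pattern))
        {
            if (m.Index > lastIndex)
            {
                // Append the literal text between the previous class and this one.
                var capturedLastIndex = lastIndex;
                strings = strings.Select(s => s + pattern.Substring(capturedLastIndex, m.Index - capturedLastIndex));
            }
            int from = int.Parse(m.Groups["From"].Value);
            int to = int.Parse(m.Groups["To"].Value);
            if (from > to)
            {
                // A class such as [9-0] is not a valid range.
                throw new InvalidOperationException();
            }
            // Branch every string built so far into one variant per digit in the range.
            strings = strings.SelectMany(s => Enumerable.Range(from, to - from + 1), (s, i) => s + i.ToString());
            lastIndex = m.Index + m.Length;
        }
        if (lastIndex < pattern.Length)
        {
            // Append the literal text that follows the last class.
            var capturedLastIndex = lastIndex;
            strings = strings.Select(s => s + pattern.Substring(capturedLastIndex));
        }
        return strings;
    }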

Basically, the code constructs every string that matches the regex pattern. It even produces them in alphabetical (lexicographic) order.

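Be careful with the capturedLastIndex variable. It is required because the LINQ query is evaluated lazily: without the copy, the lambda would capture the lastIndex variable itself, which has already changed by the time the sequence is enumerated, and the results would be wrong.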
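The capture pitfall can be reproduced in isolation. The following is a minimal sketch, not part of the answer's code; the class and method names (`ClosureCaptureDemo`, `CaptureShared`, `CaptureCopy`) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureCaptureDemo
{
    // The lambdas capture the variable "suffix" itself. Select is deferred,
    // so they read it only when ToList() enumerates the sequence.
    public static List<string> CaptureShared()
    {
        IEnumerable<string> strings = new[] { "" };
        var suffix = 0;
        for (var round = 0; round < 2; round++)
        {
            suffix = round + 1;
            strings = strings.Select(s => s + suffix); // not executed yet
        }
        return strings.ToList(); // both lambdas see suffix == 2, giving "22"
    }

    // Copying the value into a variable declared inside the loop gives each
    // lambda its own captured value.
    public static List<string> CaptureCopy()
    {
        IEnumerable<string> strings = new[] { "" };
        var suffix = 0;
        for (var round = 0; round < 2; round++)
        {
            suffix = round + 1;
            var captured = suffix; // fresh variable per iteration
            strings = strings.Select(s => s + captured);
        }
        return strings.ToList(); // gives "12"
    }
}
```

`CaptureShared` returns a single string "22", while `CaptureCopy` returns "12", which is the behavior the local copy in the answer is there to guarantee.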

Answer 1: (score: 0)

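Here is the code that works for me now. It is not very generic, since it supports at most two placeholder expressions in the string, but it works ;)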

        // Requires: using System; using System.Collections.Generic; using System.Linq;
        List<string> possibleMatches = new List<string>();

        string expression = "123?[3-9]"; // this is an example

        if (expression.Contains("?") || expression.Contains("["))
        {
            // Each '?' and each '[' stands for one variable digit position.
            int count = expression.Count(f => f == '?');
            count += expression.Count(f => f == '[');
            if (count <= 2)
            {
                // Normalize the wildcards to an explicit [0-9] class.
                string pattern = expression.Replace('*', '.').Replace('?', '.');
                pattern = pattern.Replace(".", "[0-9]");

                int groupLength = "[0-9]".Length;

                int pos1 = pattern.IndexOf('[');
                int start1 = Convert.ToInt32(pattern[pos1 + 1].ToString());
                int end1 = Convert.ToInt32(pattern[pos1 + 3].ToString());

                if (count > 1)
                {
                    // Search after the first '[', otherwise IndexOf returns pos1 again.
                    int pos2 = pattern.IndexOf('[', pos1 + 1);
                    int start2 = Convert.ToInt32(pattern[pos2 + 1].ToString());
                    int end2 = Convert.ToInt32(pattern[pos2 + 3].ToString());

                    // Remove the second class first so that pos1 stays valid.
                    pattern = pattern.Remove(pos2, groupLength);
                    pattern = pattern.Remove(pos1, groupLength);

                    // Once the first class is replaced by a single digit, the
                    // second position has moved left by groupLength - 1.
                    int insertPos2 = pos2 - groupLength + 1;

                    for (int i = start1; i <= end1; i++)
                    {
                        string copy = pattern.Insert(pos1, i.ToString());
                        for (int y = start2; y <= end2; y++)
                        {
                            possibleMatches.Add(copy.Insert(insertPos2, y.ToString()));
                        }
                    }
                }
                else
                {
                    pattern = pattern.Remove(pos1, groupLength);

                    for (int i = start1; i <= end1; i++)
                    {
                        possibleMatches.Add(pattern.Insert(pos1, i.ToString()));
                    }
                }
            }
        }
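The two-placeholder limit could be lifted with recursion, as the question suggests. The following is only a sketch under stated assumptions: every class has the exact five-character shape `[a-b]` with single digits and a <= b, and wildcards such as `?` have already been normalized to `[0-9]`. The names `PatternExpander` and `Expand` are hypothetical and appear in neither answer.

```csharp
using System;
using System.Collections.Generic;

public static class PatternExpander
{
    // Expands every class "[a-b]" in the pattern, left to right, recursively.
    public static IEnumerable<string> Expand(string pattern)
    {
        int open = pattern.IndexOf('[');
        if (open < 0)
        {
            // No class left: the remaining pattern is a plain literal.
            yield return pattern;
            yield break;
        }

        // Assumption: the class is exactly five characters long, e.g. "[3-9]".
        if (open + 4 >= pattern.Length || pattern[open + 2] != '-' || pattern[open + 4] != ']')
        {
            throw new FormatException("Expected a class of the form [a-b].");
        }

        string prefix = pattern.Substring(0, open);
        string rest = pattern.Substring(open + 5);

        // For each digit in the range, combine it with every expansion of the rest.
        for (char digit = pattern[open + 1]; digit <= pattern[open + 3]; digit++)
        {
            foreach (string tail in Expand(rest))
            {
                yield return prefix + digit + tail;
            }
        }
    }
}
```

Under these assumptions, `PatternExpander.Expand("123[0-9][0-9]")` enumerates 100 strings from "12300" to "12399", and the number of classes in the pattern is no longer limited to two.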