Is there a framework function that helps find the longest common starting substring of multiple strings?

Asked: 2010-09-21 13:17:16

Tags: c#

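I have a list of strings (representing paths) that should all share a common beginning (the root path). I need to get that common beginning.

It's only a few lines to write, but I have the nagging feeling that this gets thrown together a million times a year and that the framework might already have an algorithm for it. I couldn't find anything, though.
I also suppose this question has been asked before, but my search came up dry.

Any hints?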

7 answers:

Answer 0: (score: 2)

If anyone is interested, here's what I came up with:

    public static string GetCommonStartingSubString(IList<string> strings)
    {
        if (strings.Count == 0)
            return "";
        if (strings.Count == 1)
            return strings[0];
        int charIdx = 0;
        while (IsCommonChar(strings, charIdx))
            ++charIdx;
        return strings[0].Substring(0, charIdx);
    }
    private static bool IsCommonChar(IList<string> strings, int charIdx)
    {
        if(strings[0].Length <= charIdx)
            return false;
        for (int strIdx = 1; strIdx < strings.Count; ++strIdx)
            if (strings[strIdx].Length <= charIdx 
             || strings[strIdx][charIdx] != strings[0][charIdx])
                return false;
        return true;
    }

Answer 1: (score: 1)

This method should work:

string GetLongestCommonPrefix(IEnumerable<string> items)
{
    return items.Aggregate(default(string), GetLongestCommonPrefix);
}

string GetLongestCommonPrefix(string s1, string s2)
{
    if (s1 == null || s2 == null)
        return s1 ?? s2;

    int n = Math.Min(s1.Length, s2.Length);
    int i;
    for (i = 0; i < n; i++)
    {
        if (s1[i] != s2[i])
            break;
    }
    return s1.Substring(0, i);
}

Answer 2: (score: 1)

Excuse my plain variable naming, and it's not very fast, but this should do:

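// your list of strings (must contain at least one element)...
List<string> strings;

// Start from the shortest string: the common prefix cannot be longer than it.
string shortestString = strings.First(x => x.Length ==
    strings.Select(y => y.Length).Min());
// Drop one trailing character at a time until every string starts with the candidate.
// Ordinal comparison avoids culture-specific surprises; the loop always ends,
// because every string starts with "".
while (!strings.All(s => s.StartsWith(shortestString, StringComparison.Ordinal)))
{
    shortestString = shortestString.Substring(0, shortestString.Length - 1);
}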

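Answer 3: (score: 0)

One idea to simplify your implementation is to write just a method that gets the longest common starting substring of two strings and then use the Aggregate method from LINQ. Something like: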

strings.Skip(1).Aggregate(strings.First(), GetCommonSubString);

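I don't think there is any elegant way to implement GetCommonSubString using the standard methods for working with strings. If you care about performance, you'll probably have to implement it in the "direct" way. A slower but shorter alternative using LINQ could look something like this: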

var chars = 
  str1.Zip(str2, (c1, c2) => new { Match = c1 == c2, Char = c1 })
      .TakeWhile(c => c.Match).Select(c => c.Char).ToArray();
return new string(chars);

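This first "zips" the two strings together and then uses TakeWhile to take the part where the characters are the same. The rest produces an array of characters that can be used to create a string with the result.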
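For reference, the two fragments above can be combined into one compilable unit. This is only a sketch: the class name CommonPrefixSketch, the wrapper GetCommonPrefix, and the empty-list guard are assumptions added for illustration and are not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommonPrefixSketch
{
    // Longest common starting substring of two strings (the Zip/TakeWhile idea).
    public static string GetCommonSubString(string str1, string str2)
    {
        var chars = str1.Zip(str2, (c1, c2) => new { Match = c1 == c2, Char = c1 })
                        .TakeWhile(c => c.Match)
                        .Select(c => c.Char)
                        .ToArray();
        return new string(chars);
    }

    // Hypothetical wrapper: folds the pairwise helper over the whole list.
    // An empty list yields "" instead of throwing.
    public static string GetCommonPrefix(IList<string> strings)
    {
        if (strings.Count == 0)
            return "";
        return strings.Skip(1).Aggregate(strings[0], GetCommonSubString);
    }
}
```

Calling CommonPrefixSketch.GetCommonPrefix(new[] { "interview", "internet", "interval" }) should give "inter". Zip stops at the end of the shorter string, so no explicit length check is needed.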

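Answer 4: (score: 0)

Maybe I'm oversimplifying your problem, but what about the following?

First attempt (struck out in the original answer):

var rootPath = paths.Select(s => new { path = s, depth = s.Split('\\').Length })
                    .Aggregate((memo, curr) => curr.depth < memo.depth ? curr : memo).path;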

A desperate, most probably slow, and all-around pretty silly attempt:

var paths = new List<string> { @"C:\Ruby19\lib\ruby\gems",
                               @"C:\Ruby19\lib\ruby\gems\1.9.2",
                               @"C:\Ruby19\lib\ruby\gems",
                               @"C:\Ruby19\lib\test\fest\hest"};

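// Split every path into segments and keep only the leading segments shared by all of them.
var common = paths.Select(s => new { p = s.Split('\\') })
                  .Aggregate((memo, curr) => new { p = curr.p.TakeWhile((stp, ind) => stp == memo.p.ElementAtOrDefault(ind)).ToArray() })
                  .p;
// string[] has no instance Join method, so use the static string.Join.
var rootPath = string.Join("\\", common);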

=> rootPath = "C:\Ruby19\lib"
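Comparing whole path segments rather than single characters matters when two sibling directories share the start of their names: a character-wise prefix of C:\Ruby19\lib\test and C:\Ruby19\lib\temp ends in "\te", which is not a directory. The same segment-wise idea could be packaged as a reusable method, sketched below. The names PathPrefixSketch and CommonRoot are made up for illustration, and the comparison is case-sensitive, which is an assumption that may not suit Windows paths.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PathPrefixSketch
{
    // Hypothetical helper: longest common leading run of whole path segments.
    public static string CommonRoot(IEnumerable<string> paths, char separator = '\\')
    {
        string[] common = null;
        foreach (var path in paths)
        {
            var segments = path.Split(separator);
            // The first path seeds the result; each later path can only shorten it.
            common = common == null
                ? segments
                : common.TakeWhile((seg, i) => i < segments.Length && seg == segments[i]).ToArray();
        }
        // No paths at all: return an empty string.
        return common == null ? "" : string.Join(separator.ToString(), common);
    }
}
```

With the two paths mentioned above, CommonRoot should return C:\Ruby19\lib.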

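Answer 5: (score: 0)

Some time ago I ran into the same problem (like many others). Here is the solution I came up with. I haven't done any performance measurements, but I had no problems with lists of 100 elements.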

using System;
using System.Collections.Generic;
using System.Linq;

namespace FEV.TOPexpert.Common.Extensions
{
    public static class IEnumerableOfStringExtension
    {
        /// <summary>
        /// Finds the most common left string in a sequence of strings.
        /// </summary>
        /// <param name="source">The sequence to search in.</param>
        /// <returns>The most common left string in the sequence.</returns>
        public static string MostCommonLeftString(this IEnumerable<string> source)
        {
            return source.MostCommonLeftString(StringComparison.InvariantCulture);
        }

        /// <summary>
        /// Finds the most common left string in a sequence of strings.
        /// </summary>
        /// <param name="source">The sequence to search in.</param>
        /// <param name="comparisonType">Type of the comparison.</param>
        /// <returns>The most common left string in the sequence.</returns>
        public static string MostCommonLeftString(this IEnumerable<string> source, StringComparison comparisonType)
        {
            if (source == null)
                throw new ArgumentNullException("source");

            string mcs = String.Empty;

            using (var e = source.GetEnumerator())
            {
                if (!e.MoveNext())
                    return mcs;

                mcs = e.Current;
                while (e.MoveNext())
                    mcs = mcs.MostCommonLeftString(e.Current, comparisonType);
            }
            return mcs;
        }

        /// <summary>
        /// Returns a sequence with the most common left strings from a sequence of strings.
        /// </summary>
        /// <param name="source">A sequence of string to search through.</param>
        /// <returns>A sequence of the most common left strings ordered in descending order.</returns>
        public static IEnumerable<string> MostCommonLeftStrings(this IEnumerable<string> source)
        {
            return MostCommonLeftStrings(source, StringComparison.InvariantCulture);
        }

        /// <summary>
        /// Returns a sequence with the most common left strings from a sequence of strings.
        /// </summary>
        /// <param name="source">A sequence of string to search through.</param>
        /// <param name="comparisonType">Type of comparison.</param>
        /// <returns>A sequence of the most common left strings ordered in descending order.</returns>
        public static IEnumerable<string> MostCommonLeftStrings(this IEnumerable<string> source, StringComparison comparisonType)
        {
            if (source == null)
                throw new ArgumentNullException("source");

            var listOfMcs = new List<string>();

            using (var e = source.GetEnumerator())
            {
                while (e.MoveNext())
                {
                    if (e.Current == null)
                        continue;

                    string removeFromList = String.Empty;
                    string addToList = String.Empty;

                    foreach (var element in listOfMcs)
                    {
                        addToList = e.Current.MostCommonLeftString(element, comparisonType);

                        if (addToList.Length > 0)
                        {
                            removeFromList = element;
                            break;
                        }
                    }

                    if (removeFromList.Length <= 0)
                    {
                        listOfMcs.Add(e.Current);
                        continue;
                    }

                    if (addToList != removeFromList)
                    {
                        listOfMcs.Remove(removeFromList);
                        listOfMcs.Add(addToList);
                    }
                }
            }

            return listOfMcs.OrderByDescending(item => item.Length);
        }

        /// <summary>
        /// Returns a string that both strings have in common started from the left.
        /// </summary>
        /// <param name="first">The first string.</param>
        /// <param name="second">The second string.</param>
        /// <returns>Returns a string that both strings have in common started from the left.</returns>
        public static string MostCommonLeftString(this string first, string second)
        {
            return first.MostCommonLeftString(second, StringComparison.InvariantCulture);
        }

        /// <summary>
        /// Returns a string that both strings have in common started from the left.
        /// </summary>
        /// <param name="first">The first string.</param>
        /// <param name="second">The second string.</param>
        /// <param name="comparisonType">Type of comparison.</param>
        /// <returns>Returns a string that both strings have in common started from the left.</returns>
        public static string MostCommonLeftString(this string first, string second, StringComparison comparisonType)
        {
            if (first == null
                || second == null)
                return null;

            int length = Math.Min(first.Length, second.Length);
            first = first.Substring(0, length);
            second = second.Substring(0, length);

            while (!first.Equals(second, comparisonType))
            {
                first = first.Substring(0, first.Length - 1);
                second = second.Substring(0, second.Length - 1);
            }

            return first;
        }

        private static bool MatchesWithList(string match, IList<string> elements, StringComparison comparisonType)
        {
            string removeFromList = String.Empty;
            string addToList = String.Empty;

            foreach (var element in elements)
            {
                addToList = match.MostCommonLeftString(element, comparisonType);

                if (addToList.Length > 0)
                {
                    removeFromList = element;
                }
            }

            if (removeFromList.Length > 0)
            {
                if (addToList != removeFromList)
                {
                    elements.Remove(removeFromList);
                    elements.Add(addToList);
                }
                return true;
            }

            return false;
        }
    }
}

Answer 6: (score: 0)

The following returns the longest common prefix of any collection of IEnumerable<T> sequences, not just strings.

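public static bool Same<T>(this IEnumerable<T> xs) {
  // True when the sequence is empty or every element equals the first one.
  return !xs.Any() || xs.Skip(1).All(x => EqualityComparer<T>.Default.Equals(x, xs.First()));
}

public static IEnumerable<T> CommonPrefix<T>(this IEnumerable<IEnumerable<T>> xss) {
  var r = new List<T>();
  var es = xss.Select(x => x.GetEnumerator()).ToList();
  if (es.Count == 0)
     return r; // no sequences, so no common prefix
  // Advance all enumerators in lockstep until one runs out or the current elements differ.
  while (es.Select(x => x.MoveNext()).All(x => x)) {
     if (!es.Select(x => x.Current).Same())
        return r;
     r.Add(es[0].Current);
  }
  return r;
}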