Longest common subsequence

Asked: 2013-11-14 17:24:56

Tags: c# string matrix backtracking lcs

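Hi, this is my code for the longest common subsequence of two strings in C#. I need help with the backtracking step. I need to find the subsequence: GTCGT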

String str1 = "GTCGTTCG";
String str2 = "ACCGGTCGAGTG";

int[,] l = new int[str1.Length, str2.Length]; // 2-dimensional table sized by the lengths of string 1 and string 2
int lcs = -1;
string substr = string.Empty;
int end = -1;

for (int i = 0; i <str1.Length ; i++) // Looping based on string1 length 
{                
  for (int j = 0; j < str2.Length; j++) // Looping based on string2 Length
  {                  
    if (str1[i] == str2[j]) // if match found 
    {
      if (i == 0 || j == 0)  // i or j is the first element, so l[i, j] = 1
      {
        l[i, j] = 1;
      }
      else
      {   
        l[i, j] = l[i - 1, j - 1] + 1; // take the upper-left (diagonal) value and increment by 1
      }

      if (l[i, j] > lcs)
      {
        lcs = l[i, j]; // store the longest match length found so far
        end = i; // index in str1 where the longest contiguous match ends
      }

    }
    else // no match: reset the cell to zero
    {
      l[i, j] = 0;
    }
  }
}

1 answer:

Answer 0 (score: 0)

Your function needs to return a collection of strings, because there may be several longest common subsequences of the same length.

// Requires: using System.Collections.Generic; using System.Linq;
public List<string> LCS(string firstString, string secondString)
{
    // Prefix both strings with a space so that row 0 and column 0 of the table stand for the empty prefix.
    string firstStringTemp = " " + firstString;
    string secondStringTemp = " " + secondString;

    // create the table
    List<string>[,] temp = new List<string>[firstStringTemp.Length, secondStringTemp.Length];

    // loop over all items in the table.
    for (int i = 0; i < firstStringTemp.Length; i++)
    {
        for (int j = 0; j < secondStringTemp.Length; j++)
        {

            temp[i, j] = new List<string>();
            if (i == 0 || j == 0) continue;
            if (firstStringTemp[i] == secondStringTemp[j])
            {
                var a = firstStringTemp[i].ToString();
                if (temp[i - 1, j - 1].Count == 0)
                {
                    temp[i, j].Add(a);
                }
                else
                {
                    foreach (string s in temp[i - 1, j - 1])
                    {
                        temp[i, j].Add(s + a);
                    }
                }
            }
            else
            {
                List<string> b = temp[i - 1, j].Concat(temp[i, j - 1]).Distinct().ToList();
                if (b.Count == 0) continue;
                int max = b.Max(p => p.Length);
                b = b.Where(p => p.Length == max).ToList();
                temp[i, j] = b;
            }
        }
    }
    return temp[firstStringTemp.Length - 1, secondStringTemp.Length - 1];
}

You need to store a collection in each entry of the table. That way each cell can hold several different strings of the same length.
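
If a single subsequence is enough, the backtracking the question asks about can be sketched as follows. This is not part of the answer above: the class name LcsSketch and the method FindOne are hypothetical, and the sketch assumes the usual length table with an extra empty row and column. Note that the table in the question resets a cell to 0 on a mismatch, which tracks contiguous matches; the subsequence recurrence used here takes the larger of the upper and left neighbours instead.

```csharp
using System;
using System.Text;

public static class LcsSketch
{
    // Hypothetical helper: returns ONE longest common subsequence of a and b.
    // When several exist, which one is returned depends on the tie-breaking below.
    public static string FindOne(string a, string b)
    {
        // len[i, j] = LCS length of the first i chars of a and the first j chars of b.
        // Row 0 and column 0 stay 0 because they represent an empty prefix.
        int[,] len = new int[a.Length + 1, b.Length + 1];
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                if (a[i - 1] == b[j - 1])
                    len[i, j] = len[i - 1, j - 1] + 1;                   // extend the diagonal
                else
                    len[i, j] = Math.Max(len[i - 1, j], len[i, j - 1]);  // carry the best so far
            }
        }

        // Backtrack from the bottom-right corner towards the top-left.
        var result = new StringBuilder();
        int x = a.Length, y = b.Length;
        while (x > 0 && y > 0)
        {
            if (a[x - 1] == b[y - 1])
            {
                result.Insert(0, a[x - 1]); // this character belongs to the subsequence
                x--;
                y--;
            }
            else if (len[x - 1, y] >= len[x, y - 1])
            {
                x--; // the upper cell is at least as good: move up
            }
            else
            {
                y--; // otherwise move left
            }
        }
        return result.ToString();
    }
}
```

For the two strings in the question, LcsSketch.FindOne("GTCGTTCG", "ACCGGTCGAGTG") should return a string of length 6, since GTCGTG is a common subsequence of both. This version keeps only integers in the table, so it uses less memory than a table of string lists, at the cost of reporting only one of the possible results.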