Remove duplicate words from a line

Time: 2014-07-09 20:21:34

Tags: c# regex wpf string

I have a WPF application with a TextBox, and I need to write a formatting function that, when a button is pressed, turns this input:

Perfume Soap Random52
Sample id: Random52
Key: 1324354657
Bubble Shampoo aRandom88
Sample id: aRandom88
Key: 1234567890
BathSoda Monkey 101
Sample id: Monkey 101
Key: 0192837465

into this output:

Perfume Soap
Bubble Shampoo
BathSoda

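In other words: delete lines 2 and 3 of each group, and remove from line 1 the words that are repeated on line 2, including the original occurrence. It looks simple, but I'm having trouble with it. I'm lost and don't know how to proceed. How can I make this work?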

4 answers:

Answer 0: (score: 2)

Assuming the input always has the same structure, a possible solution using LINQ (without regex) could look like this:

    var input = @"Perfume Soap Random52
Sample id: Random52
Key: 1324354657
Bubble Shampoo aRandom88
Sample id: aRandom88
Key: 1234567890
BathSoda Monkey 101
Sample id: Monkey 101
Key: 0192837465";

    var result = input
    // take all lines
    .Split('\n')
    // for each line
    .Select ((text, index) => 
    {
        // take only the desired lines
        if (index % 3 == 0)
        {
            // split line on space
            var words = text.Split((char)32);
            // take desired words
            return String.Format("{0} {1}", words[0], (index != 0 && (index % 6) == 0) ? String.Empty : words[1]); 
        }

        return String.Empty;
    })
    // remove empty entries
    .Where (text => !String.IsNullOrEmpty(text));
    // join the lines back into one string, separated by new lines
    Console.WriteLine(String.Join("\r\n", result.ToArray()));

The output is as required:

Perfume Soap
Bubble Shampoo
BathSoda 

Replacing the words with a regular expression would also be a good idea.
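As a rough sketch of that regex idea (not part of the original answer), the id can be captured at the end of the first line and required to reappear on the `Sample id:` line through a backreference, so the whole three-line group is replaced by the name alone. The class name `DuplicateWordStripper`, the method `StripIds`, and the assumption that every group has exactly the three-line shape from the question are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not taken from the thread.
static class DuplicateWordStripper
{
    // One three-line group: "<name> <id>", "Sample id: <id>", "Key: <anything>".
    // \k<id> forces the text after "Sample id: " to equal the id captured on line 1.
    static readonly Regex GroupPattern = new Regex(
        @"^(?<name>.*?) (?<id>.+)\nSample id: \k<id>\nKey: .*$",
        RegexOptions.Multiline);

    public static string StripIds(string input)
    {
        // Assumption: normalize Windows line endings so the pattern only sees '\n'.
        string normalized = input.Replace("\r\n", "\n");
        // Replace each matched group with just its name part.
        return GroupPattern.Replace(normalized, "${name}");
    }

    static void Main()
    {
        string input = "Perfume Soap Random52\n" +
                       "Sample id: Random52\n" +
                       "Key: 1324354657\n" +
                       "Bubble Shampoo aRandom88\n" +
                       "Sample id: aRandom88\n" +
                       "Key: 1234567890\n" +
                       "BathSoda Monkey 101\n" +
                       "Sample id: Monkey 101\n" +
                       "Key: 0192837465";

        Console.WriteLine(StripIds(input));
    }
}
```

Because the name group is lazy, it grows only until the remainder of the line equals the id on the next line, which is why a multi-word id such as `Monkey 101` is removed as a whole. Groups that do not follow the three-line shape are left untouched.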

Answer 1: (score: 1)

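    // "text" is assumed to hold the raw input, with lines separated by "\r\n"
    var types = new List<string>();
    var previous = string.Empty;
    foreach (string line in text.Split(new string[] { "\r\n" }, StringSplitOptions.None))
        if (line.StartsWith("Sample id: "))
            // cut the id (and the space before it) off the end of the preceding line
            types.Add(previous.Substring(0, previous.Length - line.Split(':')[1].Length));
        else
            // remember the line so the next "Sample id:" line can trim it
            previous = line;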

Answer 2: (score: 1)

A regular expression can be used to extract each group of 3 lines, and then you can project the results.

    Regex regex = new Regex(@"(?<desc>.*)\nSample id: (?<id>.*)\nKey: (?<key>.*)");

    var r = regex.Matches(content).Cast<Match>().Select(m => new {
        Description = m.Groups["desc"].Value.Replace(" " + m.Groups["id"].Value, ""),
        Id = m.Groups["id"].Value,
        Key = m.Groups["key"].Value });

Or get only the first line of each group, using the same regex:

    var r = regex.Matches(content).Cast<Match>().Select(m => m.Groups["desc"].Value.Replace(" " + m.Groups["id"].Value, ""));
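To try the pattern end to end, here is a minimal, self-contained sketch that runs it against the sample input from the question. The class name `SampleParser` and the method `Descriptions` are hypothetical, and the line-ending normalization is an added assumption: in .NET, `.` also matches `'\r'`, so with `\r\n` input the `id` capture would otherwise carry a trailing `'\r'`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the regex from the answer; names are illustrative.
static class SampleParser
{
    // Same pattern as in the answer: description line, "Sample id:" line, "Key:" line.
    static readonly Regex Pattern = new Regex(
        @"(?<desc>.*)\nSample id: (?<id>.*)\nKey: (?<key>.*)");

    public static List<string> Descriptions(string content)
    {
        // Assumption: strip '\r' from Windows line endings before matching.
        string normalized = content.Replace("\r\n", "\n");

        return Pattern.Matches(normalized)
            .Cast<Match>()
            // Remove " <id>" from the description, leaving only the name.
            .Select(m => m.Groups["desc"].Value.Replace(" " + m.Groups["id"].Value, ""))
            .ToList();
    }

    static void Main()
    {
        string content = "Perfume Soap Random52\n" +
                         "Sample id: Random52\n" +
                         "Key: 1324354657\n" +
                         "Bubble Shampoo aRandom88\n" +
                         "Sample id: aRandom88\n" +
                         "Key: 1234567890\n" +
                         "BathSoda Monkey 101\n" +
                         "Sample id: Monkey 101\n" +
                         "Key: 0192837465";

        foreach (string name in Descriptions(content))
        {
            Console.WriteLine(name);
        }
    }
}
```

Note that `String.Replace` removes every occurrence of `" " + id` in the description, so a name that happens to contain the id as one of its own words would lose that word as well.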

Answer 3: (score: -1)

In the end I found a way to achieve what I wanted, so I'll share it with others:

    string input = "Perfume Soap Random52\n" +
                   "Sample id: Random52\n" +
                   "Key: 1324354657\n" +
                   "Bubble Shampoo aRandom88\n" +
                   "Sample id: aRandom88\n" +
                   "Key: 1234567890\n" +
                   "BathSoda Monkey 101\n" +
                   "Sample id: Monkey 101\n" +
                   "Key: 0192837465";

    // split lines so each one is a different element of an array
    string[] split2 = input.Split('\n');

    string output;

    for (int i = 1; i < split2.Count(); i += 3) {
        output = split2[i - 1].Trim();

        // count number of words in first line (will use it later)
        string[] wordsList = output.Split(' ');
        int wordsCount = wordsList.Count();

        // combine lines 1 and 2 to begin duplicates removal process
        output += " " + split2[i].Trim();
        string[] split = output.Split(' ');

        // group elements together and keep only the words that occur exactly once
        var singles = split.GroupBy(x => x).Where(g => g.Count() == 1).SelectMany(g => g);
        // number of words removed from first line
        int cCount = (split.Count() - singles.Count()) / 2;
        // number of words remaining in first line after duplicate removal
        int wCount = wordsCount - cCount;

        output = string.Empty;

        // I don't know how to convert 'singles' type to array, so I did it my way
        foreach (string f in singles) {
            output += " " + f;
        }
        // output should now have all words inside without duplicates

        // array full of output words; index 0 is empty because output starts with a space
        string[] oArray = output.Split(' ');
        output = string.Empty;

        // add only names to output, as I requested
        for (int c = 0; c <= wCount; c++)
        {
            output += oArray[c] + " ";
        }

        output = output.Trim(); // delete spaces around for cleaner looks
        Console.WriteLine(output); // print the result for this group (output step assumed)
    }

Output:

Perfume Soap
Bubble Shampoo
BathSoda