Question

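I asked this question earlier today, but I deleted it because I had made a mistake and had not provided all the information. Sorry about that. I am using Regex and LINQ to search all .txt files in a folder for lines that match a pattern. That part of the code looks like this: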

private static IEnumerable<string> Search(string file)
{
    return File
        .ReadLines(file)
        .Select(line => Regex.Match(line, pattern))
        .Where(match => match.Success)
        .Select(match => match.Value)
        .ToArray();
}

The matches are then written to a .txt file:

var files = Directory.EnumerateFiles(filePath, "*.txt");
var extracts = files.SelectMany(file => Search(file));
File.WriteAllLines("results.txt", extracts);

Is there any way to also write the name of the file each match came from? I have the file names in a string array.

var filenames = Directory.GetFiles(filePath, "*.txt")
                .Select(filename => Path.GetFileNameWithoutExtension(filename))
                .Select(filename => Regex.Match(filename, namePattern))
                .Where(match => match.Success)
                .Select(match => match.Value)
                .ToArray();

An example of the "results.txt" I am aiming for:

Example1 file1

Example2 file2

Example3 file3

The "Example" part already works; the highlighted part (the file name) is what I would like to achieve somehow.

René Vogt gave me an answer earlier, but after many attempts I still could not solve the problem. His code was this:

var extracts = files.SelectMany(file => Search(file).Select(match => new {match, file}));
File.WriteAllLines("results.txt", 
                   extracts.Select(extract => extract.match + " " + extract.file));

It gives me a result like this:

Example1 D:\~.txt1

Is there any way I can use namePattern to cut the unnecessary part out of that text?

Thank you very much!

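Edit: Thank you for all the help so far! I am trying out all the answers. namePattern is different from pattern. I use pattern to get the important strings out of the text, and namePattern to cut the unnecessary part off the file name. Extracting the part of the file name I need already works; what I am struggling with is the result text file. I can only write the whole file name, without the pattern applied; every attempt to include the pattern somehow has failed.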

static string namePattern = @"(\d{4})(?!.*\d)";
static string pattern = @"[A-Z]{1}\d{7}\s\d{1,5}";

Answer 1

If you want to append the file name to each line:

private static IEnumerable<string> Search(string file)
{
    return File
        .ReadLines(file)
        .Select(line => Regex.Match(line, pattern))
        .Where(match => match.Success)
        .Select(match => match.Value + " " + file)
        .ToArray();
}

If you want to append the file name as one extra line after the matches:

private static IEnumerable<string> Search(string file)
{
    return (File
            .ReadLines(file)
            .Select(line => Regex.Match(line, pattern))
            .Where(match => match.Success)
            .Select(match => match.Value)
           )
           .Concat(Enumerable.Repeat(file, 1))
           .ToArray();
}

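Since the method returns IEnumerable<string>, you can also remove ToArray(). Evaluation of the expression is then deferred until it is enumerated, i.e. inside File.WriteAllLines. This can be an advantage if the files are large or numerous, because the results are not buffered: each match is written to the output as soon as its line has been read.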
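A minimal sketch of that lazy variant, assuming the pattern from the question. The class name LazySearch is made up for illustration; everything else mirrors the code above with ToArray() dropped.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class LazySearch
{
    // Pattern copied from the question.
    private const string Pattern = @"[A-Z]{1}\d{7}\s\d{1,5}";

    // No ToArray(): nothing is read until the caller enumerates the result,
    // for example inside File.WriteAllLines.
    public static IEnumerable<string> Search(string file)
    {
        return File
            .ReadLines(file)
            .Select(line => Regex.Match(line, Pattern))
            .Where(match => match.Success)
            .Select(match => match.Value + " " + file);
    }
}
```

It would be used the same way as before, e.g. File.WriteAllLines("results.txt", files.SelectMany(file => LazySearch.Search(file))).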

Answer 2

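You can use this snippet. It removes everything before the last backslash, including the backslash itself.

// file is the full path; the capture group keeps only what follows the last backslash
string filename = Regex.Match(file, @".*\\([^\\]+$)").Groups[1].Value;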
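As a possible alternative to a regular expression, the framework's own Path.GetFileName and Path.GetFileNameWithoutExtension (the latter is already used in the question) strip the directory part as well. The sketch below only wraps them; the class and method names are hypothetical.

```csharp
using System.IO;

public static class FileNames
{
    // Returns the last path segment, i.e. the file name with its extension.
    // Which characters count as directory separators depends on the
    // platform the code runs on.
    public static string WithoutDirectory(string fullPath)
    {
        return Path.GetFileName(fullPath);
    }

    // Same, but with the extension removed as well.
    public static string WithoutDirectoryOrExtension(string fullPath)
    {
        return Path.GetFileNameWithoutExtension(fullPath);
    }
}
```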

Answer 3

Try the following:

string[] filenames = Directory.GetFiles(filePath, "*.txt")
    .Where(x => Regex.IsMatch(Path.GetFileNameWithoutExtension(x), namePattern))
    .Select(x => string.Format("***{0}***", x)).ToArray();

Answer 4

Use this method:

// static, so that it can be called from the static Search method
private static string GetFileName(string file)
{
    return file.Split('\\').Last();
}

And then this:

private static IEnumerable<string> Search(string file)
{
    return File
        .ReadLines(file)
        .Select(line => Regex.Match(line, pattern))
        .Where(match => match.Success)
        .Select(match => match.Value + " " + GetFileName(file))
        .ToArray();
}

You will get what you want.

Or directly like this:

private static IEnumerable<string> Search(string file)
{
    return File
        .ReadLines(file)
        .Select(line => Regex.Match(line, pattern))
        .Where(match => match.Success)
        .Select(match => match.Value + " " + file.Split('\\').Last())
        .ToArray();
}
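To append only the part of the file name selected by namePattern, as the question asks, the same approach can be combined with that pattern. The following is a sketch under assumptions: both patterns are copied from the question, the names TaggedSearch and ShortName are hypothetical, and falling back to the whole extension-less name when namePattern does not match is a choice made here, not something the question specifies.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class TaggedSearch
{
    // Both patterns are copied from the question.
    private const string NamePattern = @"(\d{4})(?!.*\d)";
    private const string Pattern = @"[A-Z]{1}\d{7}\s\d{1,5}";

    // Hypothetical helper: the part of the file name selected by NamePattern,
    // or the whole extension-less name when the pattern does not match
    // (that fallback is an assumption).
    public static string ShortName(string file)
    {
        string name = Path.GetFileNameWithoutExtension(file);
        Match m = Regex.Match(name, NamePattern);
        return m.Success ? m.Value : name;
    }

    public static IEnumerable<string> Search(string file)
    {
        string shortName = ShortName(file); // computed once per file
        return File
            .ReadLines(file)
            .Select(line => Regex.Match(line, Pattern))
            .Where(match => match.Success)
            .Select(match => match.Value + " " + shortName)
            .ToArray();
    }
}
```

Writing the results stays the same: File.WriteAllLines("results.txt", files.SelectMany(file => TaggedSearch.Search(file))).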

How to include a specific part of a name at the end of a string

4 answers: