How to read tab-delimited lines while skipping alternate lines

Time: 2014-04-22 01:51:32

Tags: c# asp.net split skip tab-delimited

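I can currently parse and extract data from a large tab-delimited file. I read the file line by line, parse each line, and add the split items to my DataTable (rows are added in chunks, limited to 3 rows at a time). I need to skip the even-numbered lines: read the first full-width tab-delimited line, skip the second, and go straight to the third.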

My tab-delimited source file format:

001Mean                   26.975                  1.1403                  910.45                   
001Stdev                  26.975                  1.1403                  910.45                   
002Mean                   26.975                  1.1403                  910.45                   
002Stdev                  26.975                  1.1403                  910.45                   

I need to skip (avoid reading) the Stdev tab-delimited lines.

C# code:

Get the maximum number of items in any tab-delimited line of the file by splitting the lines:

using (var reader = new StreamReader(sourceFileFullName))
        {
            string line = null;
            line = reader.ReadToEnd();

            if (!string.IsNullOrEmpty(line))
            {
                var list_with_max_cols = line.Split('\n').OrderByDescending(y => y.Split('\t').Count()).Take(1);
                foreach (var value in list_with_max_cols)
                {
                   var values = value.ToString().Split(new[] { '\t', '\n' }).ToArray();
                   MAX_NO_OF_COLUMNS = values.Length;
                }
            }
        }
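
The same maximum can be found without loading the whole file into memory, which matters for a large file. The following is a minimal sketch, assuming .NET 4 or later (for `File.ReadLines`); the class and method names `ColumnCounter` and `MaxColumnCount` are hypothetical and not part of the original code.

```csharp
using System.IO;
using System.Linq;

static class ColumnCounter
{
    // Returns the largest number of tab-separated fields found on any line,
    // or 0 when the file contains no lines.
    public static int MaxColumnCount(string path)
    {
        return File.ReadLines(path)                         // streams one line at a time
                   .Select(line => line.Split('\t').Length) // field count per line
                   .DefaultIfEmpty(0)                       // avoid an exception on an empty file
                   .Max();
    }
}
```

The result could be assigned to `MAX_NO_OF_COLUMNS` in place of the `ReadToEnd` block above.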

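Read the file line by line; the first line whose item count equals the maximum length is treated as the first line (the column list) for parsing and extraction: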

using (var reader = new StreamReader(sourceFileFullName))
        {
            string new_read_line = null;
            //Read and display lines from the file until the end of the file is reached.                
            while ((new_read_line = reader.ReadLine()) != null)
            {
                var items = new_read_line.Split(new[] { '\t', '\n' }).ToArray();
                if (items.Length != MAX_NO_OF_COLUMNS)
                    continue;
                //when reach first line it is column list need to create datatable based on that.
                if (firstLineOfFile)
                {

                    columnData = new_read_line;
                    firstLineOfFile = false;
                    continue;
                }
                if (firstLineOfChunk)
                {
                    firstLineOfChunk = false;
                    chunkDataTable = CreateEmptyDataTable(columnData);
                }
                AddRow(chunkDataTable, new_read_line);
                chunkRowCount++;

                if (chunkRowCount == _chunkRowLimit)
                {
                    firstLineOfChunk = true;
                    chunkRowCount = 0;
                    yield return chunkDataTable;
                    chunkDataTable = null;
                }
            }
        }

Creating the DataTable:

private DataTable CreateEmptyDataTable(string firstLine)
    {

        IList<string> columnList = Split(firstLine);
        var dataTable = new DataTable("TableName");
        for (int columnIndex = 0; columnIndex < columnList.Count; columnIndex++)
        {
            string c_string = columnList[columnIndex];
            if (Regex.Match(c_string, "\\s").Success)
            {
                string tmp = Regex.Replace(c_string, "\\s", "");
                string finaltmp = Regex.Replace(tmp, @" ?\[.*?\]", ""); // To strip strings inside [] and inclusive [] alone
                columnList[columnIndex] = finaltmp;

            }
        }
        dataTable.Columns.AddRange(columnList.Select(v => new DataColumn(v)).ToArray());
        dataTable.Columns.Add("ID");
        return dataTable;

    }
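
Note that in `CreateEmptyDataTable` the bracket-stripping step only runs for names that contain whitespace, so a name such as `ID[x]` would keep its brackets. If that is not intended, the clean-up could be pulled into a small helper that always applies both steps. This is only a sketch: `ColumnNames.Clean` is a hypothetical name, and the sample header strings in the comments are invented for illustration.

```csharp
using System.Text.RegularExpressions;

static class ColumnNames
{
    // Removes every "[...]" group and then all whitespace from a raw header,
    // e.g. "Mean Temp [C]" becomes "MeanTemp" and "ID[x]" becomes "ID".
    public static string Clean(string rawName)
    {
        string withoutBrackets = Regex.Replace(rawName, @"\[.*?\]", "");
        return Regex.Replace(withoutBrackets, @"\s", "");
    }
}
```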

How can I read only every other line, split it, and then add it to my DataTable?

AddRow function: I managed to meet my requirement by adding the following change:

private void AddRow(DataTable dataTable, string line)
    {

        if (line.Contains("Stdev"))
        {
            return;
        }
        else
        {
          //Rest of Code
        }

    }

2 answers:

Answer 0 (score: 2)

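Given that every line contains tab-separated values, here is how to read the odd-numbered lines and split each one into an array. This is just a sample; you can extend it.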

Test data (file.txt)

luck    is  when    opportunity meets   preparation
this    line    needs   to  be  skipped
microsoft   visual  studio
another line    to  be  skipped
let us  all code

Code

var oddLines = File.ReadLines(@"C:\projects\file.txt").Where((item, index) => index%2 == 0);
foreach (var line in oddLines)
{
     var words = line.Split('\t');
}
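
The snippet above can be wrapped into a reusable method that returns the split fields directly. A minimal sketch, assuming .NET 4 or later; `AlternateLines` and `ReadOddLines` are hypothetical names introduced here for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class AlternateLines
{
    // Yields the tab-separated fields of lines 1, 3, 5, ...
    // (zero-based index 0, 2, 4, ...); the lines in between are skipped.
    public static IEnumerable<string[]> ReadOddLines(string path)
    {
        return File.ReadLines(path)
                   .Where((line, index) => index % 2 == 0)
                   .Select(line => line.Split('\t'));
    }
}
```

Because `File.ReadLines` is lazy, the file is read only as the result is enumerated, so this also works for large files.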


Edit

To get the lines that do not contain 'Stdev':

var filteredLines = System.IO.File.ReadLines(@"C:\projects\file.txt").Where(item => !item.Contains("Stdev"));
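
`Contains("Stdev")` drops a line if the text appears anywhere in it. If the label only ever sits in the first field, as in the sample rows `001Mean` / `001Stdev`, the check could be narrowed to that field. A sketch under that assumption; `MeanRows.Read` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class MeanRows
{
    // Splits each line on tabs and keeps only the rows whose first field
    // does not end with "Stdev" (assumption: the label is in the first field).
    public static List<string[]> Read(string path)
    {
        return File.ReadLines(path)
                   .Select(line => line.Split('\t'))
                   .Where(fields => !fields[0].TrimEnd()
                                              .EndsWith("Stdev", StringComparison.Ordinal))
                   .ToList();
    }
}
```

Filtering on content rather than on line position keeps working even if a Mean or Stdev line is missing somewhere in the file.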

Answer 1 (score: 0)

Change this:

using (var reader = new StreamReader(sourceFileFullName))
    {
        string new_read_line = null;
        //Read and display lines from the file until the end of the file is reached.                
        while ((new_read_line = reader.ReadLine()) != null)
        {
            var items = new_read_line.Split(new[] { '\t', '\n' }).ToArray();
            if (items.Length != MAX_NO_OF_COLUMNS)
                continue;

to

using (var reader = new StreamReader(sourceFileFullName))
    {

        int cnt = 0;
        string new_read_line = null;
        //Read and display lines from the file until the end of the file is reached.                
        while ((new_read_line = reader.ReadLine()) != null)
        {
            cnt++;

            // Skip every even-numbered line (2nd, 4th, ...).
            if (cnt % 2 == 0)
                continue;
            var items = new_read_line.Split(new[] { '\t', '\n' }).ToArray();
            if (items.Length != MAX_NO_OF_COLUMNS)
                continue;
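
One thing to watch with a counter over physical lines: if the file begins with a header row or other preamble, the parity shifts and the wrong rows may be skipped. A possible variation counts only the data rows that pass the column check. This is a self-contained sketch, not the question's full iterator; `AlternateRowReader.Read` and its `expectedColumns` parameter are hypothetical, and it assumes the first full-width line is the header, as in the question.

```csharp
using System.Collections.Generic;
using System.IO;

static class AlternateRowReader
{
    // Returns every other data row, starting with the first row after the header.
    // Lines whose field count differs from expectedColumns are ignored entirely.
    public static List<string[]> Read(string path, int expectedColumns)
    {
        var rows = new List<string[]>();
        bool headerSeen = false;
        int dataRowIndex = 0;

        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] fields = line.Split('\t');
                if (fields.Length != expectedColumns)
                    continue;              // preamble or malformed line

                if (!headerSeen)
                {
                    headerSeen = true;     // first full-width line is the header
                    continue;
                }

                // Keep data rows 0, 2, 4, ... and skip rows 1, 3, 5, ...
                if (dataRowIndex++ % 2 == 0)
                    rows.Add(fields);
            }
        }
        return rows;
    }
}
```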