Splitting comma-separated values (CSV)

Time: 2009-06-26 05:40:59

Tags: c# csv

How do I split a CSV file in C#? And how do I display the result?

8 answers:

Answer 0 (score: 12)

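I have been using the TextFieldParser class from the Microsoft.VisualBasic.FileIO namespace in a C# project I'm working on. It handles complications such as embedded commas or fields enclosed in quotes. It returns a string[] and, besides CSV files, can be used to parse just about any kind of structured text file.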
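As a rough illustration of that approach, here is a minimal sketch. The helper name ParseCsv and the sample data are made up for this example, and it assumes the project references the Microsoft.VisualBasic assembly. A StringReader stands in for a real file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // requires a reference to Microsoft.VisualBasic

public static class TextFieldParserDemo
{
    // Hypothetical helper (not from the answer): parses CSV text into rows of fields.
    public static List<string[]> ParseCsv(TextReader input)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited; // delimited, not fixed-width
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;    // commas inside quotes stay in the field
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up sample data; the second row has a quoted field with an embedded comma.
        string sample = "name,comment\nAlice,\"likes commas, a lot\"\n";
        foreach (string[] fields in ParseCsv(new StringReader(sample)))
        {
            Console.WriteLine(string.Join(" | ", fields));
        }
    }
}
```

To read from disk instead, the same parser can be constructed from a file path, for example new TextFieldParser(path).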

Answer 1 (score: 5)

Display it where? As for splitting, the best approach is to use a good library built for that purpose.

This library is very good; I can recommend it wholeheartedly.

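The problem with the naive approach is that it usually fails. There are many things to consider, even before taking performance into account:

  • What if the text contains commas?
  • Supporting the many existing formats (semicolon-delimited, text enclosed in double or single quotes, and so on)
  • and many others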
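To make the first point concrete, here is a small sketch of how a plain String.Split goes wrong on a quoted field. The helper name NaiveSplit and the sample line are invented for this example.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Hypothetical helper: the "naive" approach, splitting on every comma.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }

    public static void Main()
    {
        // Three logical fields; the middle one is quoted and contains a comma.
        string line = "1,\"Smith, John\",42";
        string[] parts = NaiveSplit(line);

        Console.WriteLine(parts.Length); // prints 4, although the line has 3 fields
        foreach (string part in parts)
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

The quoted name is cut in two, and the quote characters are left attached to the pieces.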

Answer 2 (score: 3)

Add Microsoft.VisualBasic as a reference (I know, but it's not that bad) and use Microsoft.VisualBasic.FileIO.TextFieldParser. It handles CSV files well and can be used from any .NET language.

Answer 3 (score: 1)

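Read the file one line at a time, then...

// "line" holds the current line of the file; the loop variable needs its own name
foreach (string field in line.Split(new char[] { ',' }))
    Console.WriteLine(field);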
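The "read one line at a time" part is not shown in the answer. A minimal sketch of it might look like the following, assuming the data has no quoted fields or embedded commas. The helper name ReadSimpleCsv is invented here, and a StringReader stands in for the file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineSplitDemo
{
    // Hypothetical helper: reads every line from the reader and splits it on commas.
    // Only suitable for simple data without quoted fields or embedded commas.
    public static List<string[]> ReadSimpleCsv(TextReader reader)
    {
        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            rows.Add(line.Split(','));
        }
        return rows;
    }

    public static void Main()
    {
        // For a real file, pass new StreamReader(path) instead of the StringReader.
        using (var reader = new StringReader("a,b,c\n1,2,3"))
        {
            foreach (string[] fields in ReadSimpleCsv(reader))
            {
                foreach (string field in fields)
                {
                    Console.WriteLine(field);
                }
            }
        }
    }
}
```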

Answer 4 (score: 1)

Here is a CSV parser that I use occasionally.

Usage (dgvMyView is a DataGridView):

CSVReader reader = new CSVReader(@"C:\MyFile.txt");
reader.DisplayResults(dgvMyView);

Class:

using System.Collections.Generic; // needed for List<T>
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;
public class CSVReader
{
    private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
    private string[] FieldNames;
    private List<string[]> Records;
    private int ReadIndex;

    public CSVReader(string File)
    {
        Records = new List<string[]>();
        string[] Record = null;
        StreamReader Reader = new StreamReader(File);
        int Index = 0;
        bool BlankRecord = true;

        FieldNames = GetEscapedSVs(Reader.ReadLine());
        while (!Reader.EndOfStream)
        {
            Record = GetEscapedSVs(Reader.ReadLine());
            BlankRecord = true;
            for (Index = 0; Index <= Record.Length - 1; Index++)
            {
                if (!string.IsNullOrEmpty(Record[Index])) BlankRecord = false;
            }
            if (!BlankRecord) Records.Add(Record);
        }
        ReadIndex = -1;
        Reader.Close();
    }

    private string[] GetEscapedSVs(string Data)
    {
        return GetEscapedSVs(Data, ",", "\"");
    }
    private string[] GetEscapedSVs(string Data, string Separator, string Escape)
    {
        string[] Result = null;
        int Index = 0;
        int PriorMatchIndex = 0;
        MatchCollection Matches = Regex.Matches(Data, string.Format(ESCAPE_SPLIT_REGEX, Separator, Escape));

        Result = new string[Matches.Count + 1]; // N separators delimit N + 1 fields


        for (Index = 0; Index <= Result.Length - 2; Index++)
        {
            Result[Index] = Data.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
            PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
        }
        Result[Result.Length - 1] = Data.Substring(PriorMatchIndex);

        for (Index = 0; Index <= Result.Length - 1; Index++)
        {
            if (Regex.IsMatch(Result[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) Result[Index] = Result[Index].Substring(1, Result[Index].Length - 2);
            Result[Index] = Result[Index].Replace(Escape + Escape, Escape);
            if (Result[Index] == null) Result[Index] = "";
        }

        return Result;
    }

    public int FieldCount
    {
        get { return FieldNames.Length; }
    }

    public string GetString(int Index)
    {
        return Records[ReadIndex][Index];
    }

    public string GetName(int Index)
    {
        return FieldNames[Index];
    }

    public bool Read()
    {
        ReadIndex = ReadIndex + 1;
        return ReadIndex < Records.Count;
    }


    public void DisplayResults(DataGridView DataView)
    {
        DataGridViewColumn col = default(DataGridViewColumn);
        DataGridViewRow row = default(DataGridViewRow);
        DataGridViewCell cell = default(DataGridViewCell);
        DataGridViewColumnHeaderCell header = default(DataGridViewColumnHeaderCell);
        int Index = 0;
        ReadIndex = -1;

        DataView.Rows.Clear();
        DataView.Columns.Clear();

        for (Index = 0; Index <= FieldCount - 1; Index++)
        {
            col = new DataGridViewColumn();
            col.CellTemplate = new DataGridViewTextBoxCell();
            header = new DataGridViewColumnHeaderCell();
            header.Value = GetName(Index);
            col.HeaderCell = header;
            DataView.Columns.Add(col);
        }

        while (Read())
        {
            row = new DataGridViewRow();
            for (Index = 0; Index <= FieldCount - 1; Index++)
            {
                cell = new DataGridViewTextBoxCell();
                cell.Value = GetString(Index).ToString();
                row.Cells.Add(cell);
            }
            DataView.Rows.Add(row);
        }
    }
}

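Answer 5 (score: 1)

I found the answer to my question. It was as simple as reading the file with IO.File, so that all the text ends up in a single string, and then splitting that string on a separator. The code is shown below.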

using System;
using System.Collections.Generic;
using System.Text;

namespace CSV
{
    class Program
    {
        static void Main(string[] args)
        {

            string csv = "user1, user2, user3,user4,user5";

            string[] split = csv.Split(new char[] {',',' '});
            foreach(string s in split)
            {
                if (s.Trim() != "")
                    Console.WriteLine(s);
            }
            Console.ReadLine();
        }
    }
}
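As a possible variation on the code above, the empty-entry check can be handed to Split itself through StringSplitOptions.RemoveEmptyEntries. This is only a sketch; the helper name SplitUsers is invented for this example. Like the original, it also splits on spaces, so it only suits values that contain no spaces or commas.

```csharp
using System;

public static class SplitOptionsDemo
{
    // Hypothetical helper: splits on commas and spaces, dropping the empty
    // strings that appear where a comma is followed by a space.
    public static string[] SplitUsers(string csv)
    {
        return csv.Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string csv = "user1, user2, user3,user4,user5";
        foreach (string user in SplitUsers(csv))
        {
            Console.WriteLine(user);
        }
    }
}
```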

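Answer 6 (score: 0)

The following function takes one line from a CSV file and splits it into a List<string>.

Parameters:
string line = the line to split
string textQualifier = the text qualifier, if any (i.e. "" or "\"" or "'")
char delim = the field delimiter (i.e. ',' or ';' or '|' or '\t')
int colCount = the expected number of fields (0 means don't check)

Usage example: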

List<string> fields = SplitLine(line, "\"", ',', 5);
// or
List<string> fields = SplitLine(line, "'", '|', 10);
// or
List<string> fields = SplitLine(line, "", '\t', 0);

Function:

private List<string> SplitLine(string line, string textQualifier, char delim, int colCount)
{
    List<string> fields = new List<string>();
    string origLine = line;

    char textQual = '"';
    bool hasTextQual = false;
    if (!String.IsNullOrEmpty(textQualifier))
    {
        hasTextQual = true;
        textQual = textQualifier[0];            
    }

    if (hasTextQual)
    {
        while (!String.IsNullOrEmpty(line))
        {
            if (line[0] == textQual) // field is text qualified so look for next unqualified delimiter
            {
                int fieldLen = 1;
                while (true)
                {
                    if (line.Length == 2) // must be final field (zero length)
                    {
                        fieldLen = 2;
                        break;
                    }
                    else if (fieldLen + 1 >= line.Length) // must be final field
                    {
                        fieldLen += 1;
                        break;
                    }
                    else if (line[fieldLen] == textQual && line[fieldLen + 1] == textQual) // escaped text qualifier
                    {
                        fieldLen += 2;
                    }
                    else if (line[fieldLen] == textQual && line[fieldLen + 1] == delim) // must be end of field
                    {
                        fieldLen += 1;
                        break;
                    }
                    else // not a delimiter
                    {
                        fieldLen += 1;
                    }
                }
                string escapedQual = textQual.ToString() + textQual.ToString();
                fields.Add(line.Substring(1, fieldLen - 2).Replace(escapedQual, textQual.ToString())); // replace escaped qualifiers
                if (line.Length >= fieldLen + 1)
                {
                    line = line.Substring(fieldLen + 1);
                    if (line == "") // blank final field
                    {
                        fields.Add("");
                    }
                }
                else
                {
                    line = "";
                }
            }
            else // field is not text qualified
            {
                int fieldLen = line.IndexOf(delim);
                if (fieldLen != -1) // check next delimiter position
                {
                    fields.Add(line.Substring(0, fieldLen));
                    line = line.Substring(fieldLen + 1);
                    if (line == "") // final field must be blank 
                    {
                        fields.Add("");
                    }
                }
                else // must be last field
                {
                    fields.Add(line);
                    line = "";
                }
            }
        }
    }
    else // if there is no text qualifier, then use existing split function
    {
        fields.AddRange(line.Split(delim));
    }      

    if (colCount > 0 && colCount != fields.Count) // count doesn't match expected so throw exception
    {
        throw new Exception("Field count was:" + fields.Count.ToString() + ", expected:" + colCount.ToString() + ". Line:" + origLine);

    }
    return fields;
}

Answer 7 (score: 0)

Problem: Convert a comma-separated string into an array, where commas inside "quoted strings,," should not be treated as separators but as part of the entry.

Input: String: First,"Second","Even,With,Commas",,Normal,"Sentence,with ""different"" problems",3,4,5

Output: String-Array: ['First','Second','Even,With,Commas','','Normal','Sentence,with "different" problems','3','4','5']

Code:

string sLine;
sLine = "First,\"Second\",\"Even,With,Commas\",,Normal,\"Sentence,with \"\"different\"\" problems\",3,4,5";

// 1. Split line by separator; do not split if separator is within quotes
string Separator = ",";
string Escape = '"'.ToString();
MatchCollection Matches = Regex.Matches(sLine,
    string.Format("({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*", Separator, Escape));
string[] asColumns = new string[Matches.Count + 1];

int PriorMatchIndex = 0;
for (int Index = 0; Index <= asColumns.Length - 2; Index++)
{
    asColumns[Index] = sLine.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
    PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
}
asColumns[asColumns.Length - 1] = sLine.Substring(PriorMatchIndex);

// 2. Remove quotes
for (int Index = 0; Index <= asColumns.Length - 1; Index++)
{
    if (Regex.IsMatch(asColumns[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) // If "Text" is surrounded by quotes (but ignore double quotes => "Leave ""inside"" quotes")
    {
        asColumns[Index] = asColumns[Index].Substring(1, asColumns[Index].Length - 2); // "Text" => Text
    }
    asColumns[Index] = asColumns[Index].Replace(Escape + Escape, Escape); // Remove double quotes ('My ""special"" text' => 'My "special" text')
    if (asColumns[Index] == null) asColumns[Index] = "";
}

The output array is asColumns.