Simple CSV reader?

Asked: 2010-08-24 15:19:18

Tags: c# csv data-structures

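All,

I thought from the start that this would be a very simple task (converting a CSV to "wiki" format), but I have run into a few snags that I cannot work through.

I have 3 main problems:

1) Some cells contain \r\n (so when reading line by line, each new line is treated as a new cell).

2) Some rows contain "," (I tried switching to a \t-delimited file, but I still have a problem with escaping when the delimiter sits between two "").

3) Some rows are completely blank apart from the delimiter ("," or "\t"), and others are incomplete (which is fine; I just need to make sure each cell ends up in the right place).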
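
To make problems 1 and 2 concrete, here is a minimal sketch showing how a line-by-line read followed by a plain `Split` miscounts cells. The sample record and the names `NaiveSplitDemo` and `CountNaiveCells` are made up for illustration; they are not from the application above.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Hypothetical sample: ONE record with three cells. The second cell contains
    // a comma and the third contains a line break, both inside double quotes.
    public const string Sample = "id,\"Smith, John\",\"line one\r\nline two\"";

    // Naive approach: break the text into lines, then split each line on commas.
    public static int CountNaiveCells(string text)
    {
        int cells = 0;
        foreach (string line in text.Split(new[] { "\r\n" }, StringSplitOptions.None))
        {
            cells += line.Split(',').Length;
        }
        return cells;
    }

    public static void Main()
    {
        // The record has 3 cells, but this prints 5: the quoted comma and the
        // quoted line break are both treated as separators.
        Console.WriteLine(CountNaiveCells(Sample));
    }
}
```

Any fix has to track whether the current character is inside quotes before deciding that a delimiter or line break ends a cell.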

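I have tried a few CSV reader classes, but they trip over some of the problems listed above.

I am trying to keep this app as small as possible, so I am also trying to avoid DLLs and large classes of which only a small part does what I want.

So far I have two attempts that do not work.

Attempt 1 (does not handle \r\n inside a cell)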

OpenFileDialog openFileDialog1 = new OpenFileDialog();

            openFileDialog1.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
            openFileDialog1.Filter = "tab sep file (*.txt)|*.txt|All files (*.*)|*.*";
            openFileDialog1.FilterIndex = 1;
            openFileDialog1.RestoreDirectory = true;

            if (openFileDialog1.ShowDialog() == DialogResult.OK)
            {
                if (cb_sortable.Checked)
                {
                    header = "{| class=\"wikitable sortable\" border=\"1\" \r\n|+ Sortable table";
                }

                StringBuilder sb = new StringBuilder();
                string line;
                bool firstline = true;
                StreamReader sr = new StreamReader(openFileDialog1.FileName);

                sb.AppendLine(header);

                while ((line = sr.ReadLine()) != null)
                {

                    if (line.Replace("\t", "").Length > 1)
                    {
                        string[] hold;
                        string lead = "| ";

                        if (firstline && cb_header.Checked == true)
                        {
                            lead = "| align=\"center\" style=\"background:#f0f0f0;\"| ";
                        }

                        hold = line.Split('\t');
                        sb.AppendLine(table);
                        foreach (string row in hold)
                        {
                            sb.AppendLine(lead + row.Replace("\"", ""));
                        }


                        firstline = false;
                    }
                }
                sb.AppendLine(footer);
                Clipboard.SetText(sb.ToString());
                MessageBox.Show("Done!");
        }


        }
        string header = "{| class=\"wikitable\" border=\"1\" ";
        string footer = "|}";
        string table = "|-";

Attempt 2 (can handle \r\n, but shifts cells over when a cell is blank) (not finished yet)

OpenFileDialog openFileDialog1 = new OpenFileDialog();

        openFileDialog1.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
        openFileDialog1.Filter = "txt file (*.txt)|*.txt|All files (*.*)|*.*";
        openFileDialog1.FilterIndex = 1;
        openFileDialog1.RestoreDirectory = true;

        if (openFileDialog1.ShowDialog() == DialogResult.OK)
        {
            if (cb_sortable.Checked)
            {
                header = "{| class=\"wikitable sortable\" border=\"1\" \r\n|+ Sortable table";
            }


            using (StreamReader sr = new StreamReader(openFileDialog1.FileName))
            {


                string text = sr.ReadToEnd();
                string[] cells = text.Split('\t');
                int columnCount = 0;
                foreach (string cell in cells)
                {

                    if (cell.Contains("\r\n"))
                    {
                        break;
                    }
                    columnCount++;
                }          


            }

Basically, all I need is a "split unless the delimiter is between quotes", but I am at a loss right now.
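
As a rough illustration of what "split unless between quotes" could look like, here is a minimal sketch that walks the whole text once and only treats a delimiter or line break as a separator when it is outside double quotes. The names `QuoteAwareSplit` and `Parse` are hypothetical, and the sketch assumes the usual CSV convention that a doubled quote inside a quoted cell means a literal quote.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuoteAwareSplit
{
    // Splits text into rows of cells. Delimiters and line breaks inside
    // double quotes are kept as data; "" inside quotes is a literal quote.
    public static List<List<string>> Parse(string text, char delimiter)
    {
        var rows = new List<List<string>>();
        var row = new List<string>();
        var cell = new StringBuilder();
        bool inQuotes = false;
        bool started = false; // true once the current row has consumed any character

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            started = true;
            if (inQuotes)
            {
                if (c == '"' && i + 1 < text.Length && text[i + 1] == '"')
                {
                    cell.Append('"'); // escaped quote
                    i++;
                }
                else if (c == '"') inQuotes = false;
                else cell.Append(c);
            }
            else if (c == '"') inQuotes = true;
            else if (c == delimiter)
            {
                row.Add(cell.ToString());
                cell.Clear();
            }
            else if (c == '\r' || c == '\n')
            {
                // Treat \r\n as a single line break.
                if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n') i++;
                row.Add(cell.ToString());
                cell.Clear();
                rows.Add(row);
                row = new List<string>();
                started = false;
            }
            else cell.Append(c);
        }

        // Flush the last row when the text does not end with a line break.
        if (started)
        {
            row.Add(cell.ToString());
            rows.Add(row);
        }
        return rows;
    }

    public static void Main()
    {
        // Hypothetical tab-delimited sample: a quoted tab, a quoted line break,
        // then a sparse row with two empty leading cells.
        string sample = "a\t\"x\ty\"\t\"l1\r\nl2\"\r\n\t\tz";
        foreach (var r in Parse(sample, '\t'))
            Console.WriteLine(r.Count + " cells");
    }
}
```

With the sample in `Main`, both rows come out with three cells, so empty cells keep their position instead of shifting the cells after them.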

Any tips or tricks would be greatly appreciated.

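5 answers:

Answer 0 (score: 3)

Answer 1 (score: 2)

You could also take a look at http://www.filehelpers.com/ ...

If you can use a library, do not try to write this yourself!

Answer 2 (score: 1)

Take a look here. Your code will not be making a web request, but this effectively shows you how to parse a CSV returned from a web service.

Answer 3 (score: 1)

There is a decent implementation here ...

In this situation it makes more sense to use tried-and-tested code than to try to write it yourself.

Answer 4 (score: 1)

For a specification that is essentially two pages long, the CSV format is deceptively simple. Most of the short parser implementations that can be found on the internet are blatantly incorrect in one way or another. Even so, the format hardly seems to call for a 1k+ SLOC implementation.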

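using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class CsvImport {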
    /// <summary>
    /// Parse a Comma Separated Value (CSV) source into rows of strings. [1]
    /// 
    /// The header row (if present) is not treated specially. No checking is
    /// performed to ensure uniform column lengths among rows. If no input
    /// is available, a single row containing String.Empty is returned. No
    /// support is provided for debugging invalid CSV files. Callers who
    /// desire such assistance are encouraged to use a TextReader that can
    /// report the current line and column position.
    /// 
    /// [1] http://tools.ietf.org/html/rfc4180
    /// </summary>
    public static IEnumerable<string[]> Deserialize(TextReader input) {
        if (input.Peek() == Sentinel) yield return new [] { String.Empty };
        while (input.Peek() != Sentinel) {
            // must read in entire row *now* to see if we're at end of input
            yield return DeserializeRow(input).ToArray(); 
        }
    }

    const int Sentinel = -1;
    const char Quote = '"';
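    // ListSeparator is a culture-dependent string (";" in some locales), so it
    // cannot be a const char; take its first character instead.
    // For strict RFC 4180 input, replace this with ','.
    static readonly char Separator = System.Globalization.CultureInfo.CurrentCulture.TextInfo.ListSeparator[0];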

    static IEnumerable<string> DeserializeRow(TextReader input) {
        var field = new StringBuilder();
        while (true) {
            var c = input.Read();
            if (c == Separator) {
                yield return field.ToString();
                field = new StringBuilder();
            } else if (c == '\r') {
                if (input.Peek() == '\n') {
                    input.Read();
                }
                yield return field.ToString();
                yield break;
            } else if (new [] { '\n', Sentinel }.Contains(c)) {
                yield return field.ToString();
                yield break;
            } else if (c == Quote) {
                field.Append(DeserializeQuoted(input));
            } else {
                field.Append((char) c);
            }
        }
    }

    static string DeserializeQuoted(TextReader input) {
        var quoted = new StringBuilder();
        while (input.Peek() != Sentinel) {
            var c = input.Read();
            if (c == Quote) {
                if (input.Peek() == Quote) {
                    quoted.Append(Quote);
                    input.Read();
                } else {
                    return quoted.ToString();
                }
            } else {
                quoted.Append((char) c);
            }
        }
        throw new UnexpectedEof("End-of-file inside quoted section.");
    }

    public class UnexpectedEof : Exception {
        public UnexpectedEof(string message) : base(message) { }
    }
}
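
Once rows are parsed, turning them into wiki markup is a separate and much simpler step. The following is a minimal sketch, assuming the rows come from a parser such as `CsvImport.Deserialize` above. The names `WikiTable` and `FromRows` are hypothetical, and the sketch uses the standard MediaWiki `!` marker for header cells rather than the styled `|` cells from the question. Short rows are padded with empty cells so that each cell stays in its column, and rows that are blank apart from delimiters are skipped, as in the question's first attempt.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class WikiTable
{
    // Converts already-parsed rows into MediaWiki table markup.
    public static string FromRows(IEnumerable<string[]> rows, bool firstRowIsHeader)
    {
        var all = rows.ToList();
        // The widest row decides the column count; shorter rows are padded.
        int width = all.Count == 0 ? 0 : all.Max(r => r.Length);

        var sb = new StringBuilder();
        sb.Append("{| class=\"wikitable\" border=\"1\"\n");

        bool first = true;
        foreach (var r in all)
        {
            // Skip rows that contain nothing but empty cells.
            if (r.All(c => string.IsNullOrWhiteSpace(c))) continue;

            string lead = (first && firstRowIsHeader) ? "! " : "| ";
            sb.Append("|-\n");
            for (int col = 0; col < width; col++)
            {
                string cell = col < r.Length ? r[col] : "";
                sb.Append(lead).Append(cell).Append('\n');
            }
            first = false;
        }

        sb.Append("|}");
        return sb.ToString();
    }

    public static void Main()
    {
        // Hypothetical rows: a header, a blank row, and an incomplete row.
        var rows = new List<string[]>
        {
            new[] { "Name", "Age" },
            new[] { "", "" },
            new[] { "Bob" }
        };
        Console.WriteLine(FromRows(rows, true));
    }
}
```

Keeping parsing and formatting apart means the wiki output code never has to know about quotes or delimiters.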