Reading a complex tab-delimited file (multiple header rows and line breaks) into objects

Asked: 2013-08-19 12:36:23

Tags: vb.net

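I have a tab-delimited data file. The objects are separated from one another by two empty lines, and the first and third lines of each object are column names.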

My tab-delimited file

ID         [TAB] NAME     
001        [TAB] Croline            
DATE       [TAB] DOC
30/06/2010 [TAB] 101435

2 x EMPTY LINE                      

ID         [TAB] NAME     
002        [TAB] Grek            
DATE       [TAB] DOC   
30/06/2010 [TAB] 101437

2 x EMPTY LINE

...........
...........

My object class

Public Class MyObject
    Public Property Id As String
    Public Property Name As String
    Public Property Date As String
    Public Property Doc As String
End Class

How can I read this file into MyObject instances?

3 answers:

Answer 0 (score: 1)

The solution would look something like this (pseudocode):

Create an empty list of MyObjects
Open file for reading
While there are lines left to read:
    create a MyObject instance i
    read a line and ignore it (the ID/NAME header)
    read a line into s1
    split s1 at tab character into a1 and b1
    set i.Id to a1
    set i.Name to b1
    read a line and ignore it (the DATE/DOC header)
    read a line into s2
    split s2 at tab character into a2 and b2
    set i.Date to a2
    set i.Doc to b2
    add i to your list

    read a line and ignore it (first empty separator line)
    read a line and ignore it (second empty separator line)

Translating this into VB.NET is left as an exercise for the reader.
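For readers who want something to compare their attempt against, here is one possible translation. It is a minimal sketch, not the answerer's code: the module name `PseudocodeTranslation` and the function `ReadObjects` are made up for illustration, it assumes every record has exactly the four lines shown in the question followed by two empty lines, and it trims the values because the sample data appears to carry padding spaces.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic

Public Class MyObject
    Public Property Id As String
    Public Property Name As String
    Public Property [Date] As String   ' brackets because Date is a VB keyword
    Public Property Doc As String
End Class

Public Module PseudocodeTranslation
    ' Hypothetical helper (not from the original answer): reads every record
    ' from a TextReader, following the pseudocode step by step.
    Public Function ReadObjects(reader As TextReader) As List(Of MyObject)
        Dim result As New List(Of MyObject)()

        ' Peek returns -1 once there is nothing left to read
        While reader.Peek() <> -1
            reader.ReadLine()                      ' ignore the ID/NAME header
            Dim s1 As String = reader.ReadLine()
            reader.ReadLine()                      ' ignore the DATE/DOC header
            Dim s2 As String = reader.ReadLine()

            ' Assumption: stop quietly if the file ends in the middle of a record
            If s1 Is Nothing OrElse s2 Is Nothing Then Exit While

            Dim first() As String = s1.Split(ControlChars.Tab)
            Dim second() As String = s2.Split(ControlChars.Tab)
            If first.Length < 2 OrElse second.Length < 2 Then Exit While

            Dim item As New MyObject()
            item.Id = first(0).Trim()
            item.Name = first(1).Trim()
            item.Date = second(0).Trim()
            item.Doc = second(1).Trim()
            result.Add(item)

            reader.ReadLine()                      ' first empty separator line
            reader.ReadLine()                      ' second empty separator line
        End While

        Return result
    End Function
End Module
```

To read from disk, pass it a `StreamReader` opened on the file inside a `Using` block. Taking a `TextReader` rather than a path also lets the function be tried against a `StringReader` holding sample text.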

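Answer 1 (score: 1)

It's hard to help you understand how to do this without knowing more specifically which part of the task you are stuck on, but perhaps a simple working example will help you get started.

If you define your data class like this: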

Public Class MyObject
    Public Property Id As String
    Public Property Name As String
    Public Property [Date] As String  ' Note that "Date" must be surrounded with brackets since it is a keyword in VB
    Public Property Doc As String
End Class

Then you could load it like this:

' Create a list to hold the loaded objects
Dim objects As New List(Of MyObject)()

' Read all of the lines from the file into an array of strings
' (File is in System.IO, so this needs "Imports System.IO" at the top of the file)
Dim lines() As String = File.ReadAllLines("test.txt")

' Loop through the array of lines from the file.  Step by 6 each
' time (4 data lines + 2 empty separator lines, as in the sample
' file) so that the current value of "i", at each iteration, will
' be the index of the first line of each object
For i As Integer = 0 To lines.Length - 1 Step 6
    ' lines(i + 3) is read below, so that index must exist
    If i + 3 < lines.Length Then

        ' Create a new object to store the data for the current item in the file
        Dim o As New MyObject()

        ' Get the values from the second line
        Dim fields() As String = lines(i + 1).Split(ControlChars.Tab)
        o.Id = fields(0)
        o.Name = fields(1)

        ' Get the values from the fourth line
        fields = lines(i + 3).Split(ControlChars.Tab)
        o.Date = fields(0)
        o.Doc = fields(1)

        ' Add this item to the list
        objects.Add(o)
    End If
Next

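The code to load it is very basic. It has no extra validation to make sure that the data in the file is properly formatted but, given a valid file, it will load the data into a list of objects.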
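If validation is wanted, one way to add it is sketched below. This is an illustration under stated assumptions rather than part of the original answer: the names `ValidatingLoader`, `SplitPair`, `ExpectHeader` and `LoadValidated` are hypothetical, the header rows are assumed to read exactly ID/NAME and DATE/DOC once trimmed, and every row is assumed to hold exactly two tab-separated fields. Instead of relying on a fixed step, it skips however many empty lines sit between records.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic

Public Class MyObject
    Public Property Id As String
    Public Property Name As String
    Public Property [Date] As String
    Public Property Doc As String
End Class

Public Module ValidatingLoader
    ' Hypothetical helper: splits one line on tabs and insists on two fields
    Private Function SplitPair(lines() As String, index As Integer) As String()
        Dim fields() As String = lines(index).Split(ControlChars.Tab)
        If fields.Length <> 2 Then
            Throw New InvalidDataException("Line " & (index + 1) & ": expected 2 tab-separated fields")
        End If
        Return New String() {fields(0).Trim(), fields(1).Trim()}
    End Function

    ' Hypothetical helper: checks that a line is the expected header row
    Private Sub ExpectHeader(lines() As String, index As Integer,
                             expectedFirst As String, expectedSecond As String)
        Dim header() As String = SplitPair(lines, index)
        If header(0) <> expectedFirst OrElse header(1) <> expectedSecond Then
            Throw New InvalidDataException("Line " & (index + 1) & ": expected header " &
                                           expectedFirst & "/" & expectedSecond)
        End If
    End Sub

    Public Function LoadValidated(lines() As String) As List(Of MyObject)
        Dim result As New List(Of MyObject)()
        Dim i As Integer = 0

        While i < lines.Length
            ' Skip separator lines, however many there are
            If lines(i).Trim().Length = 0 Then
                i += 1
                Continue While
            End If

            ' A record needs four consecutive lines starting at i
            If i + 3 >= lines.Length Then
                Throw New InvalidDataException("Line " & (i + 1) & ": incomplete record")
            End If

            ExpectHeader(lines, i, "ID", "NAME")
            Dim idName() As String = SplitPair(lines, i + 1)
            ExpectHeader(lines, i + 2, "DATE", "DOC")
            Dim dateDoc() As String = SplitPair(lines, i + 3)

            Dim o As New MyObject()
            o.Id = idName(0)
            o.Name = idName(1)
            o.Date = dateDoc(0)
            o.Doc = dateDoc(1)
            result.Add(o)

            i += 4
        End While

        Return result
    End Function
End Module
```

It takes the array returned by `File.ReadAllLines`, so it can be dropped in where the loop above is used. Throwing `InvalidDataException` with a one-based line number is a design choice for the sketch; logging and skipping the bad record would be an equally reasonable policy.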

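Answer 2 (score: 0)

Rather than writing specific code for this data, I would first convert it to a simple CSV. That may make sense if this is just a one-off.

1) Load the file in Notepad++

2) Replace \t with ; (using Extended search mode), which gives you data like this: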

ID;NAME
001;Croline
DATE;DOC
30/06/2010;101435


ID;NAME
002;Grek
DATE;DOC
30/06/2010;101437

3) Remove all the DATE;DOC lines by searching for DATE;DOC\n and replacing it with ; (you may need to do Edit > EOL Conversion > UNIX Format for this to work)

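4) Do the same for the ID;NAME lines: replace ID;NAME\n with a placeholder symbol that is guaranteed not to occur in the data, perhaps £. Your data should now look like this: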

£001;Croline
;30/06/2010;101435


£002;Grek
;30/06/2010;101437

5) Edit > Blank Operations > Remove Unnecessary Blank and EOL, which should put all of your data on one line.

6) Search and replace your placeholder symbol £ with \n

7) Remove the extra blank line at the top. Voilà, you have a CSV file.
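If the conversion ever has to be repeated, the same transformation can be scripted. The sketch below is an assumption-laden illustration, not part of the answer: `CsvConversion` and `ToCsvRows` are hypothetical names, and it assumes that no data row ever reads ID;NAME or DATE;DOC after tabs are swapped for semicolons.

```vb
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Public Module CsvConversion
    ' Hypothetical helper mirroring the manual steps: drop the header rows,
    ' swap tabs for semicolons, and join each record's two data rows.
    Public Function ToCsvRows(lines() As String) As List(Of String)
        Dim rows As New List(Of String)()
        Dim pending As String = Nothing   ' ID/NAME data row waiting for its DATE/DOC row

        For Each rawLine As String In lines
            If rawLine.Trim().Length = 0 Then Continue For   ' separator line

            ' Trim each field so padding spaces do not end up in the CSV
            Dim fields() As String = rawLine.Split(ControlChars.Tab)
            For j As Integer = 0 To fields.Length - 1
                fields(j) = fields(j).Trim()
            Next
            Dim joined As String = String.Join(";", fields)

            ' Assumption: header rows are recognised purely by their text
            If joined = "ID;NAME" OrElse joined = "DATE;DOC" Then Continue For

            If pending Is Nothing Then
                pending = joined
            Else
                rows.Add(pending & ";" & joined)
                pending = Nothing
            End If
        Next

        Return rows
    End Function
End Module
```

The returned rows can be written out with `System.IO.File.WriteAllLines`. For the first sample record this produces the line 001;Croline;30/06/2010;101435.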
