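Parsing structured text files to CSV

Time: 2017-05-26 01:27:26

Tags: powershell csv parsing export

I'm new to PowerShell and have searched and searched, but I can't find a solution for what I'm trying to do. I want to monitor a set of folders, and if any text file in them changes, I want to insert that file's data into a CSV file. The text files will always have the structure below, but the values (after the colon) may vary...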

Type:   1 Red/1 Blue
SecondaryType:  
Keywords:   
Area:   150
Length: 28
Width:  22
System: 55.5cm
DateTime:   5/5/2017 10:06:38 PM
UserName:   bgates
Platform:   Major Platform 2017
CustomIdentifier:   1.11.0645.1330
Version:    14.116.65557.111

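After each : there is a tab, and then there may or may not be a value.

I cobbled some code together and it does export a CSV, but the data isn't parsed correctly. I suspect this is because of the tabs and the frequently missing values. Here is my code: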

$watcher = New-Object System.IO.FileSystemWatcher
$watcher.Path = "C:\Users\username\Desktop\03_InProgress"
$watcher.Filter = "*.txt"
$watcher.IncludeSubdirectories = $true
$watcher.EnableRaisingEvents = $true  


$action = { $path = $Event.SourceEventArgs.FullPath
            $changeType = $Event.SourceEventArgs.ChangeType
            $logline = "$(Get-Date), $changeType, $path"
            (Get-Content $path) -join "`r`n" -Split "(?m)^(?=\S)" |
                Where{$_} | 
                ForEach{
                    Clear-Variable PrimaryType,SecondaryType,Keywords,Area,Length,Width,System,DateTime,Username,Platform,CustomIdentifier,Version
                    Switch -regex ($_ -split "`r`n"){
                        "PrimaryType:" {$PrimaryType = ($_ -split ':',2)[-1].trim();Continue}
                        "SecondaryType:" {$SecondaryType = ($_ -split ':',2)[-1].trim();Continue}
                        "Keywords:" {$Keywords = ($_ -split ':',2)[-1].trim();Continue}
                        "Area:" {$Area = ($_ -split ':',2)[-1].trim();Continue}
                        "Length:" {$Length = ($_ -split ':',2)[-1].trim();Continue}
                        "Width:" {$Width = ($_ -split ':',2)[-1].trim();Continue}
                        "System:" {$System = ($_ -split ':',2)[-1].trim();Continue}
                        "DateTime:" {$DateTime = ($_ -split ':',2)[-1].trim();Continue}
                        "Username:" {$Username = ($_ -split ':',2)[-1].trim();Continue}
                        "Platform:" {$Platform = ($_ -split ':',2)[-1].trim();Continue}
                        "CustomIdentifier:" {$CustomIdentifier = ($_ -split ':',2)[-1].trim();Continue}
                        "Version:" {$Version = ($_ -split ':',2)[-1].trim();Continue}
                    }
                    [PSCustomObject]@{
                        'PrimaryType' = $PrimaryType
                        'SecondaryType' = $SecondaryType
                        'Keywords' = $Keywords
                        'Area' = $Area
                        'Length' = $Length
                        'Width' = $Width
                        'System' = $System
                        'DateTime' = $DateTime
                        'Username' = $Username
                        'Platform' = $Platform
                        'CustomIdentifier' = $CustomIdentifier
                        'Version' = $Version }

                    $Files | ForEach{ [PSCustomObject]@{'PrimaryType' = $PrimaryType; 'SecondaryType' = $SecondaryType; 'Keywords' = $Keywords; 'Area' = $Area; 'Length' = $Length; 'Width' = $Width; 'System' = $System; 'DateTime' = $DateTime; 'Username' = $Username; 'Platform' = $Platform; 'CustomIdentifier' = $CustomIdentifier; 'Version' = $Version}}
                } | Export-Csv -path "C:\Users\username\Desktop\Smart Scrape\test.csv" -NoTypeInformation
            ###Add-content "C:\Users\username\Desktop\Smart Scrape\log.txt" -value $logline
          }    
Register-ObjectEvent $watcher "Created" -Action $action
Register-ObjectEvent $watcher "Changed" -Action $action
Register-ObjectEvent $watcher "Deleted" -Action $action
Register-ObjectEvent $watcher "Renamed" -Action $action
while ($true) {sleep 5}

2 answers:

Answer 0 (score: 1)

A simple parser for one of these files might be:

$Entries = [ordered]@{}

Get-Content $Path | ForEach-Object {

    $Key, $Value = $_ -split ':', 2
    $Entries[$Key.Trim()] = $Value.Trim()

}

[PSCustomObject]$Entries | Export-Csv -Append -Path "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv" -NoTypeInformation

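This splits each line on the colon, and only into 2 parts so that the datetime, which contains colons of its own, isn't broken up. It then trims whitespace from both sides of the key and the value, stores them in a hashtable (dictionary), and casts that to a PSCustomObject for output.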
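
To illustrate why the limit of 2 matters, here is a small sketch using the DateTime line from the question's sample. The variable names are made up for this example only.

```powershell
# Sample line in the question's format: key, colon, tab, value.
$line = "DateTime:`t5/5/2017 10:06:38 PM"

# Unlimited split: the line is cut at every colon, including those in the time.
$allParts = $line -split ':'

# Limited split: at most 2 substrings, so everything after the first colon stays together.
$key, $value = $line -split ':', 2

$allParts.Count    # 4
$key.Trim()        # DateTime
$value.Trim()      # 5/5/2017 10:06:38 PM
```

Trim() also removes the tab that follows the colon, so the tab does not end up in the CSV value.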

It doesn't know what the field names are, and it doesn't look like it needs to.
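
If the files can contain blank lines or lines without a colon, the same idea can be wrapped in a small function that skips them. This is a minimal sketch, not a tested drop-in: the function name `ConvertFrom-KeyValueText` and the sample array are hypothetical and do not come from the question.

```powershell
function ConvertFrom-KeyValueText {
    # Hypothetical helper name; not a built-in cmdlet.
    param(
        [string[]]$Line
    )

    $entries = [ordered]@{}
    foreach ($text in $Line) {
        # Skip blank lines and anything without a colon, since there is no key to record.
        if ($text -notmatch ':') { continue }

        $key, $value = $text -split ':', 2
        # A key with only a tab after the colon becomes an empty string.
        $entries[$key.Trim()] = $value.Trim()
    }

    # One object per file; property names are whatever keys were found.
    [PSCustomObject]$entries
}

# A few lines in the question's format, including an empty value and a blank line.
$sample = @(
    "Type:`t1 Red/1 Blue"
    "SecondaryType:`t"
    ""
    "DateTime:`t5/5/2017 10:06:38 PM"
)

$record = ConvertFrom-KeyValueText -Line $sample
$record | ConvertTo-Csv -NoTypeInformation
```

With a real file, the call would be something like `ConvertFrom-KeyValueText -Line (Get-Content $Path)`, piped to `Export-Csv -Append` as in the answer above.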

Answer 1 (score: 0)

Try this

get-content $Path | ConvertFrom-Csv -Delimiter ":" -Header Name, Value | export-csv "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv"  -notype -append

Short version with aliases

gc $Path | ConvertFrom-Csv -D ":" -h Name, Value | epcsv "C:\Users\jrooker\Desktop\SmartPlan Scrape\test.csv" -not -a
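
One way to check what this pipeline produces before appending to a real file is to run just the parsing step on a couple of sample lines. This is only a sketch with lines copied from the question's format. With `-Header Name, Value`, columns beyond the second should be discarded, so values that themselves contain colons (such as the DateTime) are worth inspecting.

```powershell
# Two sample lines in the question's format: key, colon, tab, value.
$lines = @(
    "Area:`t150"
    "DateTime:`t5/5/2017 10:06:38 PM"
)

# Same parsing step as above, without the export.
$rows = $lines | ConvertFrom-Csv -Delimiter ':' -Header Name, Value

$rows.Count              # one Name/Value object per input line
$rows[1].Name            # DateTime
$rows[1].Value.Trim()    # only the text up to the next colon is kept
```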