PowerShell: Parse a structured text file and save it as .CSV

Time: 2012-07-18 20:07:30

Tags: parsing powershell text csv

I'm very new to PowerShell; I have only been using it for about 2 weeks.

I have a file that is structured like this:

Service name: WSDL 
Service ID: 14234321885 
Service resolution path: /gman/wsdlUpdte 
Serivce endpoints: 
-------------------------------------------------------------------------------- 
Service name: DataService 
Service ID: 419434324305 
Service resolution path: /widgetDate_serv/WidgetDateServ 
Serivce endpoints:  
http://servername.company.com:1012/widgetDate_serv/WidgetDateServ
-------------------------------------------------------------------------------- 
Service name: SearchService 
Service ID: 393234543546 
Service resolution path: /ProxyServices/SearchService 
Serivce endpoints:  
http://servername.company.com:13010/Services/SearchService_5_0
http://servername2.company.com:13010/Services/SearchService_5_0
-------------------------------------------------------------------------------- 
Service name: Worker 
Service ID: 14187898547 
Service resolution path: /ProxyServices/Worker 
Serivce endpoints:  
http://servername.company.com:131009/Services/Worker/v9
--------------------------------------------------------------------------------

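I would like to parse the file and get the service name, service ID, service resolution path and service endpoints (which sometimes contain several values, or none) into individual columns of a CSV.

Beyond using Get-Content and looping through the file, I don't even know where to start.

Any help will be appreciated. Thanks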

4 Answers:

Answer 0: (score: 2)

With PowerShell 5 you can use the fabulous 'ConvertFrom-String' command:

$template=@'
Service name: {ServiceName*:SearchService} 
Service ID: {serviceID:393234543546} 
Service resolution path: {ServicePath:/ProxyServices/SearchService} 
Serivce endpoints:
http://{ServiceEP*:servername.company.com:13010/Services/SearchService_5_0}
http://{ServiceEP*:servername2.tcompany.tcom:13011/testServices/SearchService_45_0}
--------------------------------------------------------------------------------
Service name: {ServiceName*:Worker} 
Service ID: {serviceID:14187898547} 
Service resolution path: {ServicePath:/ProxyServices/Worker} 
Serivce endpoints:
http://{ServiceEP*:servername3.company.com:13010/Services/SearchService}
--------------------------------------------------------------------------------
Service name: {ServiceName*:WSDL} 
Service ID: {serviceID:14234321885} 
Service resolution path: {ServicePath:/gman/wsdlUpdte} 
Serivce endpoints:
http://{ServiceEP*:servername4.company.com:13010/Services/SearchService_5_0}
--------------------------------------------------------------------------------
'@


#explode file with template
$listexploded=Get-Content -Path "c:\temp\file1.txt" | ConvertFrom-String -TemplateContent $template

#export csv 
$listexploded |select *, @{N="ServiceEP";E={$_.ServiceEP.Value -join ","}} -ExcludeProperty ServiceEP | Export-Csv -Path "C:\temp\res.csv" -NoTypeInformation

Answer 1: (score: 1)

Try this:

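# .\services.txt is a placeholder for the input file path
Get-Content .\services.txt | ? { $_ -match ': ' } | % { $name, $value = $_ -split ': ', 2; [pscustomobject]@{ Name = $name; Value = $value.Trim() } } | Export-Csv Test.csv -NoTypeInformation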

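It basically boils down to:

  1. Get all of the text content as an array of lines
  2. Filter for the lines that contain ': '
  3. For each remaining line, split it on ': '
  4. Export the array of objects to a CSV file named test.csv

    Hope this points you in the right direction.

    Note: the code is untested.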

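Answer 2: (score: 1)

Give this a try:

  1. Read the file content as one string
  2. Split it on the 81-hyphen separator
  3. Split each line of the resulting item on the colon character and take the last array element
  4. Create a new object for each item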

    # Separator between records: a run of hyphens
    $pattern = '-'*81
    # Read the whole file as a single string
    $content = Get-Content D:\Scripts\Temp\p.txt | Out-String
    $content.Split($pattern,[System.StringSplitOptions]::RemoveEmptyEntries) | Where-Object {$_ -match '\S'} | ForEach-Object {

        # Break the record into lines and drop the empty ones
        $item = $_ -split "\s+`n" | Where-Object {$_}

        # Items 0-2 are "label: value" lines, item 3 is the endpoints header, the rest are URLs
        New-Object PSobject -Property @{
            Name=$item[0].Split(':')[-1].Trim()
            Id = $item[1].Split(':')[-1].Trim()
            ResolutionPath=$item[2].Split(':')[-1].Trim()
            Endpoints=$item[4..($item.Count)]
        } | Select-Object Name,Id,ResolutionPath,Endpoints
    }
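
One thing to watch when exporting objects like these: Endpoints holds an array, and Export-Csv writes the type name (System.Object[]) for array-valued properties rather than the URLs. The sketch below shows one possible way to get a single CSV row per service. It is not the answer's code: the ';' delimiter, the shortened sample text and the variable names are assumptions for illustration, and it splits on whole separator lines with a regex instead of relying on a fixed hyphen count.

```powershell
# Sample input in the same shape as the question's file (shortened).
$text = @'
Service name: WSDL
Service ID: 14234321885
Service resolution path: /gman/wsdlUpdte
Serivce endpoints:
--------------------------------------------------------------------------------
Service name: SearchService
Service ID: 393234543546
Service resolution path: /ProxyServices/SearchService
Serivce endpoints:
http://servername.company.com:13010/Services/SearchService_5_0
http://servername2.company.com:13010/Services/SearchService_5_0
--------------------------------------------------------------------------------
'@

# Split into records on lines made up only of hyphens; skip blank chunks.
$records = $text -split '(?m)^-{10,}\s*$' | Where-Object { $_ -match '\S' }

$rows = foreach ($record in $records) {
    # Break the record into trimmed, non-empty lines.
    $lines = @($record -split '\r?\n' | ForEach-Object { $_.Trim() } | Where-Object { $_ })

    [pscustomobject]@{
        Name           = ($lines[0] -split ':', 2)[1].Trim()
        Id             = ($lines[1] -split ':', 2)[1].Trim()
        ResolutionPath = ($lines[2] -split ':', 2)[1].Trim()
        # Everything after the endpoints header line is a URL; join into one cell.
        Endpoints      = @($lines | Select-Object -Skip 4) -join ';'
    }
}

# Show the CSV text; use Export-Csv -Path <file> -NoTypeInformation to write a file instead.
$csv = $rows | ConvertTo-Csv -NoTypeInformation
$csv
```

Splitting each line with a limit of 2 keeps any later colons in the value, and a service with no endpoints simply gets an empty Endpoints cell.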
    

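Answer 3: (score: 0)

This is a general way to parse a file made of records, and records within records (and so on). It uses PowerShell's powerful switch statement with regular expressions, together with the begin, process, end function template.

Load it, debug it, correct it...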

function Parse-Text
{
  [CmdletBinding()]
  Param
  (
    [Parameter(mandatory=$true,ValueFromPipeline=$true)]
    [string]$ficIn,
    [Parameter(mandatory=$true,ValueFromPipeline=$false)]
    [string]$ficOut
  )

  begin
  {
    $svcNumber = 0
    $urlnum = 0
    $Service = @()
    $Service += @{}
  } 

  Process 
  {
    switch -regex -file $ficIn
    {
      # End of a service
      "^-+"
      {
        $svcNumber +=1
        $urlnum = 0
        $Service += @{}
      }
      # URL; a service can have any number of them
      "(http://.+)" 
      {
        $urlnum += 1
        $url = $matches[1]
        $Service[$svcNumber]["Url$urlnum"] = $url
      }
      # Fields
      "(.+) (.+): (.+)" 
      {
        $name,$value = $matches[2,3]
        $Service[$svcNumber][$name] = $value
      }
    }
  }

  end 
  {
    #$service[3..0] | % {New-Object -Property $_ -TypeName psobject} | Export-Csv c:\Temp\ws.csv
    # Get all the services except the last one (it is empty, because the file to parse is terminated by ----...----)
    $tmp = $service[0..($service.count-2)] | Sort-Object @{Expression={$_.keys.count };Descending=$true}
    $tmp | % {New-Object -Property $_ -TypeName psobject} | Export-Csv $ficOut
  }
}


Clear-Host
Parse-Text -ficIn "c:\Développements\Pgdvlp_Powershell\Apprentissage\data\Text2Parse.txt" -ficOut "c:\Temp\ws.csv"
cat "c:\Temp\ws.csv"
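
For comparison, the same switch -Regex idea can be reduced to a small state machine that emits one row per service and joins all URLs into a single Endpoints column. This is only a sketch under stated assumptions, not the function above: the name ConvertFrom-ServiceText, the ';' delimiter and the property names are hypothetical and chosen for illustration.

```powershell
function ConvertFrom-ServiceText {
    param([string[]]$Line)

    # Factory for an empty record; [ordered] keeps the column order stable.
    $new = { [ordered]@{ Name = ''; Id = ''; ResolutionPath = ''; Endpoints = @() } }
    $current = & $new
    $seen = $false   # becomes true once a "Service name" line was read

    switch -Regex ($Line) {
        '^Service name:\s*(.*?)\s*$'            { $current.Name = $Matches[1]; $seen = $true; continue }
        '^Service ID:\s*(.*?)\s*$'              { $current.Id = $Matches[1]; continue }
        '^Service resolution path:\s*(.*?)\s*$' { $current.ResolutionPath = $Matches[1]; continue }
        '^\s*(https?://\S+)'                    { $current.Endpoints += $Matches[1]; continue }
        '^-{10,}' {
            # Separator line: emit the finished record and start a new one.
            if ($seen) {
                $current.Endpoints = $current.Endpoints -join ';'
                [pscustomobject]$current
            }
            $current = & $new
            $seen = $false
        }
    }

    # Emit a trailing record if the input did not end with a separator.
    if ($seen) {
        $current.Endpoints = $current.Endpoints -join ';'
        [pscustomobject]$current
    }
}

# Usage with one record taken from the question's sample.
$sample = @(
    'Service name: Worker '
    'Service ID: 14187898547 '
    'Service resolution path: /ProxyServices/Worker '
    'Serivce endpoints:  '
    'http://servername.company.com:131009/Services/Worker/v9'
    '--------------------------------------------------------------------------------'
)
$rows = @(ConvertFrom-ServiceText -Line $sample)
$rows | ConvertTo-Csv -NoTypeInformation
```

Because every row has the same four properties, there is no need to sort the records by key count before exporting. For a real file, something like `ConvertFrom-ServiceText -Line (Get-Content <path>) | Export-Csv <out> -NoTypeInformation` should work the same way.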