Trouble converting nested XML to CSV with PowerShell

Asked: 2019-05-15 12:11:31

Tags: xml powershell csv

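I have a nested XML file that I need to convert to CSV using PowerShell. Unfortunately, I am still a beginner and could not solve this with the existing threads I found online.

I tried reading the XML file into PowerShell and creating a new object, but the CSV I export does not even contain that insufficient result... :(

The XML file I have looks like this: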

<?xml version="1.0" encoding="ISO-8859-1"?>
<Data source="Jhonny" datetime="2019-04-23T10:07:50+02:00" timezone="Europe">
    <dealerships>
        <location name="Germany">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="7.3"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="7.8"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="7.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="6.0"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="4.0"/>
            </series>
        </location>
        <location name="USA">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="5.1"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.1"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.6"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.1"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.0"/>
            </series>
        </location>
    </dealerships>
</Data>

The result I want looks like this:

Location;Date/Time;Sold Cars car;Sold Cars Auto
Germany; 2019-04-22T00:00:00+02:00; 7.3;4.0
Germany; 2019-04-22T00:00:00+02:00; 7.8;5.0
Germany; 2019-04-22T00:00:00+02:00; 7.0;3.0
Germany; 2019-04-22T00:00:00+02:00; 6.0;4.0
USA; 2019-04-22T00:00:00+02:00; 5.1;3.0
USA; 2019-04-22T00:00:00+02:00; 4.1;6.0
USA; 2019-04-22T00:00:00+02:00; 3.6;1.0
USA; 2019-04-22T00:00:00+02:00; 3.1;8.0

Since I have not really gotten anywhere, I do not think my code will help, but here is my failed attempt at solving it:

$xml = "C:\Users\[me]\Convert_XML_to_CSV\cars.xml"
$obj = New-Object System.XML.XMLDocument
$obj.Load("$xml")

foreach ($i in $_.Data.dealerships.location) {
    $o = New-Object Object
    Add-Member -InputObject $o -MemberType NoteProperty -Name location -Value $obj.Data.dealerships.Location $i $o
} | Export-Csv "result.csv" -Delimiter "," -NoType -Encoding UTF8

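2 Answers:

Answer 0 (score: 2)

Perhaps not exactly what your desired output shows, but this may help.

Note: I am using an XML string here. In your case, load the XML from the file with: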
[xml]$xml = Get-Content "C:\Users\[me]\Convert_XML_to_CSV\cars.xml"
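
One thing to watch: Get-Content decodes the file with its own default encoding and does not look at the ISO-8859-1 named in the XML declaration, so non-ASCII characters can come out wrong. A minimal sketch of an alternative is to let XmlDocument.Load read the file, since it honors the declaration. The temp file name and the "München" sample below are made up purely for illustration.

```powershell
# Hypothetical sample file, written as ISO-8859-1 so the declaration is truthful.
# [char]0xFC is the letter u-umlaut, built this way to avoid script-encoding issues.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'cars_sample.xml'
$text = '<?xml version="1.0" encoding="ISO-8859-1"?>' +
        '<Data><dealerships><location name="M' + [char]0xFC + 'nchen"/></dealerships></Data>'
[System.IO.File]::WriteAllText($path, $text, [System.Text.Encoding]::GetEncoding('ISO-8859-1'))

# XmlDocument.Load picks up the encoding from the XML declaration.
$doc = New-Object System.Xml.XmlDocument
$doc.Load($path)

# Read the attribute back; the umlaut should survive the round trip.
$name = $doc.Data.dealerships.location.name
$name
```

For a pure-ASCII file like the one in the question, both ways of loading should give the same document.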

Code:

[xml]$xml = @'
<?xml version="1.0" encoding="ISO-8859-1"?>
<Data source="Jhonny" datetime="2019-04-23T10:07:50+02:00" timezone="Europe">
    <dealerships>
        <location name="Germany">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="7.3"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="7.8"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="7.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="6.0"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="4.0"/>
            </series>
        </location>
        <location name="USA">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="5.1"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.1"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.6"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.1"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.0"/>
            </series>
        </location>
    </dealerships>
</Data>
'@ 

$result = foreach ($item in $xml.Data.dealerships.location) {
    $location = $item.Name

    # get the different column names
    $units = $item.series | ForEach-Object { '{0} {1}' -f $_.parameter, $_.unit}

    # loop through the series
    foreach ($series in $item.series) {
        # and the values
        foreach ($value in $series.value) {
            # since you are using PowerShell 2.0, create the output object like this
            $objOut = New-Object -TypeName PSObject
            $objOut | Add-Member -MemberType NoteProperty -Name 'Location' -Value $location
            $objOut | Add-Member -MemberType NoteProperty -Name 'DateTime' -Value $value.datetime

            $thisUnit = '{0} {1}' -f $series.parameter, $series.unit
            # add the different units as property.
            foreach ($unit in $units) { 
                $val = if ($unit -eq $thisUnit) { $value.value } else { '' }
                $objOut | Add-Member -MemberType NoteProperty -Name $unit -Value $val 
            }

            # output the object
            $objOut
        }
    }
}

# output on screen
$result | Format-Table -AutoSize
# output to CSV file
$result | Export-Csv -Path 'D:\test.csv' -Encoding UTF8 -NoTypeInformation

Result:

Location DateTime                  Sold Cars car Sold Cars Auto
-------- --------                  ------------- --------------
Germany  2019-04-22T00:00:00+02:00 7.3                         
Germany  2019-04-22T01:00:00+02:00 7.8                         
Germany  2019-04-22T02:00:00+02:00 7.0                         
Germany  2019-04-22T03:00:00+02:00 6.0                         
Germany  2019-04-22T00:00:00+02:00               4.0           
Germany  2019-04-22T01:00:00+02:00               4.0           
Germany  2019-04-22T02:00:00+02:00               4.0           
Germany  2019-04-22T03:00:00+02:00               4.0           
USA      2019-04-22T00:00:00+02:00 5.1                         
USA      2019-04-22T01:00:00+02:00 4.1                         
USA      2019-04-22T02:00:00+02:00 3.6                         
USA      2019-04-22T03:00:00+02:00 3.1                         
USA      2019-04-22T00:00:00+02:00               3.0           
USA      2019-04-22T01:00:00+02:00               3.0           
USA      2019-04-22T02:00:00+02:00               3.0           
USA      2019-04-22T03:00:00+02:00               3.0


Update

As requested in the comments, you can further combine/group the $result array produced by the code above like this:

$combined = $result | Group-Object -Property DateTime, Location | ForEach-Object {
    foreach ($location in ($_.Group | Group-Object Location)) {
        # create an output object and put in the Location property here
        $objOut = New-Object -TypeName PSObject
        $objOut | Add-Member -MemberType NoteProperty -Name 'Location' -Value ($location.Name)
        foreach ($date in ($location.Group | Group-Object DateTime)) {
            # add the DateTime property
            $objOut | Add-Member -MemberType NoteProperty -Name 'DateTime' -Value ($date.Name)
            foreach ($unit in $_.Group) {
                # join the other two properties to the $objOut object:
                # I do not want to hard-code the property names here, 
                # so use Select-Object to get the remaining props.
                $sold = $unit | Select-Object * -ExcludeProperty Location, DateTime
                foreach ($thing in $sold.psobject.properties | Where-Object { ($_.Value) }) {
                    # if you want the numbers as floating-point numbers, do this:
                    # $objOut | Add-Member -MemberType NoteProperty -Name $($thing.Name) -Value ([double]$thing.Value)
                    # like below, these values will be output as string
                    $objOut | Add-Member -MemberType NoteProperty -Name $($thing.Name) -Value ($thing.Value)
                }
            }
        }
        $objOut
    }
}

# output on screen
$combined | Format-Table -AutoSize
# output to CSV file
$combined | Export-Csv -Path 'D:\test_Grouped.csv' -Encoding UTF8 -NoTypeInformation

This results in:

Location DateTime                  Sold Cars car Sold Cars Auto
-------- --------                  ------------- --------------
Germany  2019-04-22T00:00:00+02:00 7.3           4.0           
Germany  2019-04-22T01:00:00+02:00 7.8           4.0           
Germany  2019-04-22T02:00:00+02:00 7.0           4.0           
Germany  2019-04-22T03:00:00+02:00 6.0           4.0           
USA      2019-04-22T00:00:00+02:00 5.1           3.0           
USA      2019-04-22T01:00:00+02:00 4.1           3.0           
USA      2019-04-22T02:00:00+02:00 3.6           3.0           
USA      2019-04-22T03:00:00+02:00 3.1           3.0
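
The desired output in the question uses semicolons as separators. Export-Csv and ConvertTo-Csv both accept a -Delimiter parameter, so a sketch along these lines may get closer to that layout. The two sample rows are stand-ins shaped like the grouped table above, not real program output.

```powershell
# Stand-in rows with the same columns as the grouped result (values copied from the table).
$rows = @'
Location,DateTime,Sold Cars car,Sold Cars Auto
Germany,2019-04-22T00:00:00+02:00,7.3,4.0
USA,2019-04-22T00:00:00+02:00,5.1,3.0
'@ | ConvertFrom-Csv

# ConvertTo-Csv returns the CSV lines as strings, which is handy for inspection.
$lines = $rows | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
$lines
```

To write a file instead, the same -Delimiter ';' can be passed to Export-Csv together with -Path. Each field is wrapped in double quotes by default.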

Answer 1 (score: 1)

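This is a bit tricky. I handled it by parsing the XML with PowerShell's native XML support and then stepping through the nodes via .location, which gives us a list split by location (so we have one for the USA, one for Germany, and so on).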

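Within the first loop, each location has two series, one whose unit is car and one whose unit is Auto. So next we find the series whose unit is car, to get all the cars sold. Then we step through those values with foreach.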

In the most deeply nested loop, over the values in $cars, we look up the one matching record from the Auto series, matching on datetime.

This gives us all the properties we need to build a custom object in the PowerShell 2.0 style. I tested it, and the output looks like exactly what you wanted.

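# $x is assumed to hold the XML text from the question (for example, read from the file)
$dealerships = ([xml]$x).Data.dealerships.location

foreach ($location in $dealerships){
    # the series that counts units of 'car' for this location
    $cars = $location.series | Where-Object {$_.unit -eq 'car'}
    foreach ($car in $cars.value){
        # the 'Auto' value recorded at the same datetime as this 'car' value
        $auto = $location.series | Where-Object {$_.unit -eq 'Auto'} | Select-Object -ExpandProperty value | Where-Object {$_.datetime -eq $car.datetime}

        $ObjectProperties = @{
            Location = $location.name
            DateTime = $car.datetime
            SoldCars = $car.value
            SoldAutos= $auto.value
        }
        # emit one output row per location and datetime
        New-Object PSObject -Property $ObjectProperties
    }
}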
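
The loop above only emits objects. As a rough sketch of the remaining step, the objects can be collected and turned into CSV. Because a plain hashtable does not guarantee property order, Select-Object is used here to pin the columns. The trimmed inline XML and the names $doc, $rows and $csv are assumptions for illustration, not part of the original answer.

```powershell
# Trimmed-down sample in the same shape as the question's XML (assumed for illustration).
[xml]$doc = @'
<Data>
  <dealerships>
    <location name="Germany">
      <series parameter="Sold Cars" unit="car">
        <value datetime="2019-04-22T00:00:00+02:00" value="7.3"/>
        <value datetime="2019-04-22T01:00:00+02:00" value="7.8"/>
      </series>
      <series parameter="Sold Cars" unit="Auto">
        <value datetime="2019-04-22T00:00:00+02:00" value="4.0"/>
        <value datetime="2019-04-22T01:00:00+02:00" value="4.0"/>
      </series>
    </location>
    <location name="USA">
      <series parameter="Sold Cars" unit="car">
        <value datetime="2019-04-22T00:00:00+02:00" value="5.1"/>
        <value datetime="2019-04-22T01:00:00+02:00" value="4.1"/>
      </series>
      <series parameter="Sold Cars" unit="Auto">
        <value datetime="2019-04-22T00:00:00+02:00" value="3.0"/>
        <value datetime="2019-04-22T01:00:00+02:00" value="3.0"/>
      </series>
    </location>
  </dealerships>
</Data>
'@

# Same matching logic as the answer: pair each 'car' value with the 'Auto' value of equal datetime.
$rows = foreach ($location in $doc.Data.dealerships.location) {
    $cars = $location.series | Where-Object { $_.unit -eq 'car' }
    foreach ($car in $cars.value) {
        $auto = $location.series | Where-Object { $_.unit -eq 'Auto' } |
            Select-Object -ExpandProperty value |
            Where-Object { $_.datetime -eq $car.datetime }
        New-Object PSObject -Property @{
            Location  = $location.name
            DateTime  = $car.datetime
            SoldCars  = $car.value
            SoldAutos = $auto.value
        }
    }
}

# Select-Object fixes the column order before the rows are converted to CSV text.
$csv = $rows | Select-Object Location, DateTime, SoldCars, SoldAutos |
    ConvertTo-Csv -Delimiter ';' -NoTypeInformation
$csv
```

Swapping ConvertTo-Csv for Export-Csv with a -Path argument would write the same rows to a file.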