How to convert XML to a structured CSV

Time: 2016-05-05 08:56:04

Tags: python xml csv

I have the following XML data, which I want to convert to CSV:

<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country2 name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country2>
    <country3 name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country3>
</data>

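How can I get each country to appear as one row in the CSV, containing all of the attributes within that country? I can't figure out how to read multiple countries.

I have the following code: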

import csv
import os
from xml.etree import ElementTree as ET



rootElement = ET.parse("/Users/testuser/Desktop/test.XML").getroot()


with open('/Users/testuser/Desktop/output.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile, lineterminator='\n')


    for subelement in rootElement:
        for subsub in subelement:
            print subsub.tag
            writer.writerows(subsub.items())
            for subsubsub in subsub:
                print subsubsub.items()
                writer.writerows(subsubsub.items())

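This does not give me the desired output. In my CSV, I want each row to represent one country, so row 1 would be country and row 2 would be country2. Each column should then give me the value of a child element. So in my CSV, A1 would be country/name, A2 would be country/rank, and so on. B1 would be country2/name, B2 would be country2/rank, and so on.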

Currently I get the following output:

[Image: screenshot of the current CSV output]
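
A note on why the attempt above behaves this way: `Element.items()` returns an element's attributes as (name, value) pairs, and `writer.writerows()` writes each pair as its own line, so element text such as the rank or year never reaches the file. Below is a minimal sketch in Python 3; the inline sample and the in-memory buffer are assumptions made for illustration and are not part of the original code.

```python
import csv
import io
from xml.etree import ElementTree as ET

# Inline sample standing in for the XML file on disk (illustrative only)
xml_text = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
    </country>
</data>"""

root = ET.fromstring(xml_text)
buffer = io.StringIO()  # in-memory stand-in for output.csv
writer = csv.writer(buffer, lineterminator='\n')

for country in root:
    for child in country:
        # items() yields attribute pairs; each pair becomes a separate row.
        # <year> has no attributes, so it contributes nothing at all.
        writer.writerows(child.items())

attribute_rows = buffer.getvalue().splitlines()
print(attribute_rows)
```

Each resulting line holds a single attribute pair (for example `updated,yes`), which is why the file ends up with many short rows instead of one row per country.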

1 Answer:

Answer 0: (score: 2)

import csv
import xml.etree.ElementTree as ET

tree = ET.parse('/tmp/test123.xml')
root = tree.getroot()

# Python 3: newline='' lets the csv module control line endings
with open('/tmp/output123.csv', 'w', newline='') as doc:
    writer = csv.writer(doc, lineterminator='\n')
    # Each direct child of <data> (country, country2, ...) becomes one row
    for country in root:
        rec = [country.tag, country.find('rank').text, country.find('gdppc').text]
        # Append the name and direction of every neighbor of this country
        for neighbor in country.findall('neighbor'):
            rec += [neighbor.attrib['name'], neighbor.attrib['direction']]
        writer.writerow(rec)

output.csv

country,2,141100,Austria,E,Switzerland,W
country2,5,59900,Malaysia,N
country3,69,13600,Costa Rica,W,Colombia,E
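
The output above leaves out the `name` attribute and the `year` element, which the question also asks for. Here is a sketch of one way to include them under a header row, assuming Python 3. The helper `countries_to_rows`, the column names, and the `name:direction` packing of neighbors are all hypothetical choices, and a trimmed inline sample stands in for the file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed inline sample in place of the XML file (illustrative only)
XML_TEXT = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country2 name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country2>
</data>"""

FIELDS = ['rank', 'year', 'gdppc']  # child elements copied into fixed columns


def countries_to_rows(root):
    """Hypothetical helper: a header row plus one row per child of root."""
    rows = [['tag', 'name'] + FIELDS + ['neighbors']]
    for country in root:
        row = [country.tag, country.get('name')]
        # findtext returns the child's text, or '' if the child is missing
        row += [country.findtext(field, default='') for field in FIELDS]
        # The neighbor count varies, so pack them into one cell
        row.append(';'.join(
            '%s:%s' % (n.get('name'), n.get('direction'))
            for n in country.findall('neighbor')))
        rows.append(row)
    return rows


rows = countries_to_rows(ET.fromstring(XML_TEXT))
out = io.StringIO()  # swap for open(path, 'w', newline='') to write a file
csv.writer(out, lineterminator='\n').writerows(rows)
print(out.getvalue())
```

Packing the neighbors into a single cell keeps the column count the same for every row. Appending them as extra columns, as the answer does, is the alternative when each neighbor should have its own cells.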