How do I parse several XML files into several CSV files?

Asked: 2019-07-05 03:25:36

Tags: python xml pandas csv elementtree

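I parse an XML file with the code below, which works for a single XML input and a single CSV output. I tried to use glob to handle multiple inputs and write multiple CSV outputs, but I know this is not correct.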

import glob
import xml.etree.ElementTree as et
import csv

for file in glob.glob('./*.xml'):
    with open(file) as f:
        tree = et.parse(f)
        nodes = tree.getroot()

        with open(f'{f[:-4]}edited.csv', 'w') as ff:
            cols = ['dateTime','x','y','z','motion','isMoving','stepCount','groupAreaId','commit']
            nodewriter = csv.writer(ff)
            nodewriter.writerow(cols)
            for node in nodes:
                values = [ node.attrib.get(kk, '') for kk in cols]
                nodewriter.writerow(values)

How should I change it to get multiple CSV outputs?

2 answers:

Answer 0 (score: 1)

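Your code currently uses the file handle f to build the output file name. f is a file object, not a string, so f[:-4] raises a TypeError. Use file, the path string returned by glob, instead of f: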

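import glob
import xml.etree.ElementTree as et
import csv

for file in glob.glob('./*.xml'):
    with open(file) as f:
        tree = et.parse(f)
        nodes = tree.getroot()

        # file is the path string, so it can be sliced; [:-4] drops the '.xml' suffix.
        # newline='' keeps the csv module from writing blank rows on Windows.
        with open(f'{file[:-4]}edited.csv', 'w', newline='') as ff:
            # attribute names to extract; they also serve as the CSV header row
            cols = ['dateTime','x','y','z','motion','isMoving','stepCount','groupAreaId','commit']
            nodewriter = csv.writer(ff)
            nodewriter.writerow(cols)
            for node in nodes:
                # a missing attribute becomes an empty cell
                values = [ node.attrib.get(kk, '') for kk in cols]
                nodewriter.writerow(values)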
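
The slice file[:-4] assumes the extension is exactly four characters long. As a variation, here is a minimal sketch that derives the output name with pathlib instead. The helper names convert_xml_to_csv and convert_directory are hypothetical, and the column list is simply the one from the question.

```python
import csv
import xml.etree.ElementTree as et
from pathlib import Path

# Column names copied from the question; adjust them to your own attributes.
COLS = ['dateTime', 'x', 'y', 'z', 'motion', 'isMoving',
        'stepCount', 'groupAreaId', 'commit']


def convert_xml_to_csv(xml_path, cols=COLS):
    """Write one CSV next to xml_path and return its path (hypothetical helper)."""
    xml_path = Path(xml_path)
    # 'data.xml' becomes 'dataedited.csv', the same naming scheme as above
    csv_path = xml_path.with_name(f'{xml_path.stem}edited.csv')
    root = et.parse(xml_path).getroot()
    with open(csv_path, 'w', newline='') as out:
        writer = csv.writer(out)
        writer.writerow(cols)
        for node in root:
            # a missing attribute becomes an empty cell
            writer.writerow([node.attrib.get(col, '') for col in cols])
    return csv_path


def convert_directory(directory='.'):
    """Convert every *.xml file in directory and return the CSV paths."""
    return [convert_xml_to_csv(p) for p in sorted(Path(directory).glob('*.xml'))]
```

Calling convert_directory('.') should then produce one CSV per XML file in the current directory.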

Answer 1 (score: 0)

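You can create a list of output file names and then write the XML data into those files. If the output files already exist on disk, you can use glob to get their names. If a file does not exist, the code below creates it with the given file name: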

import csv

csvFileNames = ['outputfile1.csv', 'outputfile2.csv']
for file in csvFileNames:
    # mode 'w' creates the file if it is missing and overwrites it otherwise
    with open(file, 'w', newline='') as f:
        wtr = csv.writer(f)
        wtr.writerows([[1, 2], [2, 3], [4, 5]])  # write what you want

To get the XML file names from a directory, you can try the following code:

from os import listdir

# '.' is used because the script and the XML files are in the same directory;
# if the XML files are in another directory, pass that path to listdir instead
filenames = listdir('.')
xmlFileNames = [filename for filename in filenames if filename.endswith(".xml")]

# xmlFileNames now looks like ["abc.xml", "ef.xml"]
resultCsvFileNameList = [fname.replace(".xml", ".csv") for fname in xmlFileNames]
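
One possible way to tie these pieces together is to pair each XML name with its CSV name and convert the pairs one by one. This is only a sketch: pair_xml_with_csv and convert_all are hypothetical names, and the one-attribute-per-column layout is assumed from the question. It uses os.path.splitext rather than str.replace, because replace changes every occurrence of ".xml" in a name, not only the final extension.

```python
import csv
import os
import xml.etree.ElementTree as et


def pair_xml_with_csv(directory='.'):
    """Return (xml_path, csv_path) pairs for every .xml file in directory."""
    pairs = []
    for name in sorted(os.listdir(directory)):
        if name.endswith('.xml'):
            # splitext drops only the last extension:
            # 'a.xml.old.xml' -> 'a.xml.old', whereas
            # 'a.xml.old.xml'.replace('.xml', '.csv') gives 'a.csv.old.csv'
            base, _ext = os.path.splitext(name)
            pairs.append((os.path.join(directory, name),
                          os.path.join(directory, base + '.csv')))
    return pairs


def convert_all(directory, cols):
    """Write one CSV per XML file and return the list of CSV paths."""
    written = []
    for xml_path, csv_path in pair_xml_with_csv(directory):
        root = et.parse(xml_path).getroot()
        with open(csv_path, 'w', newline='') as f:
            wtr = csv.writer(f)
            wtr.writerow(cols)  # header row
            for node in root:
                # a missing attribute becomes an empty cell
                wtr.writerow([node.attrib.get(col, '') for col in cols])
        written.append(csv_path)
    return written
```

For the data in the question, cols would be the list of attribute names given there, for example convert_all('.', ['dateTime', 'x', 'y', 'z']).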