Need help figuring out why my CSV output is blank

Time: 2019-07-17 21:33:09

Tags: python xml csv

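I wrote this code to create a .csv report from an .xml file, but when I open the resulting .csv, it is blank. Feel free to tear my code apart. I'm new to this and want to learn!

The XML contains multiple "SubjectKey" values, but only some of them have an "AuditRecord". I only want to extract the data that has an audit record, and for those entries I want to pull the information from "SubjectData", "FormData", and "AuditRecord".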

import csv
import xml.etree.cElementTree as ET

tree = ET.parse("response.xml")
root = tree.getroot()

xml_data_to_csv =open("query.csv", 'w')



AuditRecord_head = []
SubjectData_head = []
FormData_head = []

csvwriter=csv.writer(xml_data_to_csv)
count=0
for member in root.findall("AuditRecord"):
    AuditRecord = []
    Subjectdata = []
    FormData = []
    if count == 0:
        Subject = member.find("SubjectKey").tag
        Subjectdata_head.append(Subject)
        Form = member.find("p1Name").tag
        FormData_head.append(Form)
        Action = member.find("Action").tag
        AuditRecord_head.append(Action)
        csvwriter.writerow(Auditrecord_head)
        count = count + 1
    Subject = member.find('SubjectKey').text
    Subjectdata.append(Subject)
    Form = member.find('p1Name').text
    FormData.append(Form)
    Action = member.find("Action").text
    AuditRecord.append(Action)

    csvwriter.writerow(Subjectdata)
xml_data_to_csv.close()

I expect the output to be a table with the column headers: Subject, Form, Action.

Here is a sample of the .xml:

 </ClinicalData>
    <ClinicalData StudyOID="SMK-869-002" MetaDataVersionOID="2.0">
    <SubjectData SubjectKey="865-015">
</AuditRecord>
</FormData>
<FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="0"/>
<FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="1">
<p1:QueryAction InitialComment="Please enter start date for condition" UserType="User" UserOID="bailey@protocolfirst.com" Action="query" DateTimeStamp="2019-07-12T14:08:43.893Z"/>
</AuditRecord>

1 Answer:

Answer 0: (score: 0)

First of all, your XML file has many errors. For me to parse it, it has to look like this:

<?xml version="1.0"?>
<root xmlns:p1="http://some-url.com">
    <ClinicalData StudyOID="SMK-869-002" MetaDataVersionOID="2.0"></ClinicalData>
    <SubjectData SubjectKey="865-015"></SubjectData>
    <AuditRecord>
        <FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="0"/>
        <FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="1"/>
        <p1:QueryAction InitialComment="Please enter start date for condition" UserType="User" UserOID="bailey@protocolfirst.com" Action="query" DateTimeStamp="2019-07-12T14:08:43.893Z"/>
    </AuditRecord>
</root>

ElementTree always expects a single root node and a well-formed document.
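
As a minimal sketch of that point, assuming small hypothetical inline fragments rather than the real response.xml: ElementTree raises a ParseError on a stray closing tag, and `findall("AuditRecord")` only looks at direct children of the element it is called on. If the audit records are nested deeper, as the sample in the question suggests, that may be why the original loop never ran and the file stayed empty. The names `broken`, `nested`, and `parses_cleanly` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a closing tag with no matching opening tag.
broken = "<root><SubjectData SubjectKey='865-015'></AuditRecord></root>"


def parses_cleanly(text):
    """Return True if ElementTree accepts the text, False on ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical well-formed fragment with AuditRecord nested three levels down.
nested = ET.fromstring(
    "<root><SubjectData SubjectKey='865-015'>"
    "<FormData><AuditRecord/></FormData>"
    "</SubjectData></root>"
)

direct_children = nested.findall("AuditRecord")   # direct children of root only
any_depth = nested.findall(".//AuditRecord")      # descendants at any depth

print(parses_cleanly(broken), len(direct_children), len(any_depth))
```

With a path such as `.//AuditRecord`, the search is not limited to the first level, which is usually what is wanted for nested records.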

I don't fully understand what you are trying to do, but I hope this helps:

import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

NS = "{http://some-url.com}"  # namespace bound to the p1 prefix in the XML above

tree = ET.parse("response.xml")
root = tree.getroot()

# SubjectData is a direct child of the root in the corrected XML
Subjectdata = root.find('./SubjectData').attrib['SubjectKey']

# Collect the p1:Name attribute of every FormData under AuditRecord
FormData = [formData.attrib[NS + 'Name']
            for formData in root.findall('./AuditRecord/FormData')]

# The Action attribute lives on the p1:QueryAction element
AuditRecord = root.find('./AuditRecord/' + NS + 'QueryAction').attrib['Action']

# Write all values as one comma-separated line; "with" closes the file
with open("query.csv", 'w') as xml_data_to_csv:
    xml_data_to_csv.write(",".join([Subjectdata] + FormData + [AuditRecord]))

This produces a csv file with the following content:

865-015,Medical History,Medical History,query
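
To get closer to the table the question asks for (columns Subject, Form, Action), one possible sketch uses `csv.writer` with a header row. It assumes the corrected XML layout shown earlier in this answer and the placeholder namespace URI `http://some-url.com`. Emitting one row per FormData element is an assumption about the desired shape, and `rows_from_xml` and `to_csv` are hypothetical helper names. The XML is inlined and the CSV is built in memory so the example runs without any files.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Placeholder namespace URI, matching the corrected XML in this answer.
NS = {"p1": "http://some-url.com"}

XML_TEXT = """<root xmlns:p1="http://some-url.com">
    <ClinicalData StudyOID="SMK-869-002" MetaDataVersionOID="2.0"></ClinicalData>
    <SubjectData SubjectKey="865-015"></SubjectData>
    <AuditRecord>
        <FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="0"/>
        <FormData p1:Name="Medical History" p1:Started="Y" FormOID="mh" FormRepeatKey="1"/>
        <p1:QueryAction UserType="User" Action="query"/>
    </AuditRecord>
</root>"""


def rows_from_xml(xml_text):
    """Build one [subject, form, action] row per FormData in each AuditRecord."""
    root = ET.fromstring(xml_text)
    subject = root.find("SubjectData").get("SubjectKey")
    rows = []
    for record in root.findall("AuditRecord"):
        action = record.find("p1:QueryAction", NS).get("Action")
        for form in record.findall("FormData"):
            # Namespaced attributes are keyed as "{uri}local-name".
            name = form.get("{http://some-url.com}Name")
            rows.append([subject, name, action])
    return rows


def to_csv(rows):
    """Render the header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Subject", "Form", "Action"])
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(rows_from_xml(XML_TEXT))
print(csv_text)
```

To write a real file instead, the same writer can be pointed at `open("query.csv", "w", newline="")`, which is the mode the `csv` module documentation recommends for file output.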