Parsing an XML string in Python

Time: 2015-09-10 06:06:50

Tags: python xml

I have this XML result and need to get the values between the tags. The XML is held in a string, not in a file.

final = """  <Table><Claimable>false</Claimable><MinorRev>80601</MinorRev><Operation>530600 ION MILL</Operation><HTNum>162</HTNum><WaferEC>80318</WaferEC><HolderType>HACARR</HolderType><Job>167187008</Job></Table>

    <Table><Claimable>false</Claimable><MinorRev>71115</MinorRev><Operation>530600 ION MILL</Operation><Experiment>6794</Experiment><HTNum>162</HTNum><WaferEC>71105</WaferEC><HolderType>HACARR</HolderType><Job>16799006</Job></Table> """

Here is my code sample:

import xml.etree.ElementTree as ET

root = ET.fromstring(final)
print(root)

This is the error I get:

xml.parsers.expat.ExpatError: The markup in the document following the root element must be well-formed.

I tried using ET.fromstring, but had no luck.

1 Answer:

Answer 0: (score: 3)

Perhaps you tried node.attrib; use node.text instead to get the string value (see also "Parsing XML" in the Python documentation):

import xml.etree.ElementTree as ET
xml_string = "<Table><Claimable>false</Claimable><MinorRev>80601</MinorRev><Operation>530600 ION MILL</Operation><HTNum>162</HTNum><WaferEC>80318</WaferEC><HolderType>HACARR</HolderType><Job>167187008</Job></Table>"

# Parse the string; root is the single <Table> element
root = ET.fromstring(xml_string)

# Each child is one field: tag is the element name, text is its content
for child in root:
    print(child.tag, child.text)

This should give you:

Claimable false
MinorRev 80601
Operation 530600 ION MILL
HTNum 162
WaferEC 80318
HolderType HACARR
Job 167187008
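
Note that the string in the question holds two top-level <Table> elements, while a well-formed XML document has exactly one root element, which is probably why ET.fromstring rejects it. One possible workaround, sketched below, is to wrap the fragments in a synthetic root before parsing. The wrapper name "Tables" and the helper parse_tables are made up for illustration, and the sample data is shortened to a few fields. This assumes the string carries no XML declaration, since a declaration would end up inside the wrapper and break parsing.

```python
import xml.etree.ElementTree as ET

# Two sibling <Table> elements with no enclosing root, as in the question
# (shortened to a few fields for readability).
fragments = (
    "<Table><Claimable>false</Claimable><HTNum>162</HTNum>"
    "<Job>167187008</Job></Table>"
    "<Table><Claimable>false</Claimable><Experiment>6794</Experiment>"
    "<HTNum>162</HTNum><Job>16799006</Job></Table>"
)


def parse_tables(xml_fragments):
    """Wrap the fragments in a synthetic root and return one dict per <Table>."""
    # "Tables" is a hypothetical wrapper name, not part of the original data.
    root = ET.fromstring("<Tables>" + xml_fragments + "</Tables>")
    # Map each child's tag to its text content for every <Table> element.
    return [
        {child.tag: child.text for child in table}
        for table in root.findall("Table")
    ]


rows = parse_tables(fragments)
for row in rows:
    print(row)
```

Each dict then holds the tag names as keys and the text between the tags as values, so a single field can be read with something like rows[0]["Job"]. Fields that appear in only one <Table>, such as Experiment, are simply absent from the other dict.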