Python: Rewriting a pretty XML file

Asked: 2016-05-13 12:24:17

Tags: python xml elementtree minidom

I have an XML file that looks like this:

<Main>

    <Stuff author="Jojo" name="Thing 1">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description" />
        <Attr name="version" value="1.0.0" />
        <Attr name="software" value="Misrocoft Ociffe" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

    <Stuff author="Toto" name="Thing 2">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

</Main>

I update it and then rewrite it. The problem is that if I rewrite it with toprettyxml(), I get new blank lines between the old lines, like this:

<Main>


    <Stuff author="Jojo" name="Thing 1">


        <Attr name="annotation" value="Short description" />


        <Attr name="description" value="Long description" />


        <Attr name="version" value="1.0.0" />


        <Attr name="software" value="Misrocoft Ociffe" />


        <Attr name="language" value="Python" />


        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />


        <Attr name="command" value="doSomething()" />


    </Stuff>

    <Stuff author="Toto" name="Thing 2">


        <Attr name="annotation" value="Short description" />


        <Attr name="description" value="Long description"/>


        <Attr name="version" value="4.3.9" />


        <Attr name="software" value="Tophoshop" />


        <Attr name="language" value="Python" />


        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />


        <Attr name="command" value="doSomething()" />


    </Stuff>

    <Stuff author="Titi" name="New thing">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

</Main>
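
For context, the extra blank lines most likely come from the indentation that is already in the file: minidom keeps it as whitespace-only text nodes, and toprettyxml() then writes its own indent and newline around each of them. Below is a minimal sketch of one workaround, assuming it is acceptable to drop those nodes before printing. The names strip_whitespace_nodes and SAMPLE are hypothetical and not part of the original code.

```python
from xml.dom import minidom

# Shortened, hypothetical stand-in for the file in the question.
SAMPLE = """<Main>
    <Stuff author="Jojo" name="Thing 1">
        <Attr name="version" value="1.0.0" />
    </Stuff>
</Main>"""


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    for child in list(node.childNodes):  # copy: we mutate childNodes while looping
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.hasChildNodes():
            strip_whitespace_nodes(child)


doc = minidom.parseString(SAMPLE)

# Pretty-printing the parsed document as-is reproduces the blank-line problem.
noisy = doc.documentElement.toprettyxml(indent="    ")

# After removing the old indentation nodes, toprettyxml() indents from scratch.
strip_whitespace_nodes(doc.documentElement)
clean = doc.documentElement.toprettyxml(indent="    ")
print(clean)
```

This reformats the whole document rather than preserving its layout, so blank lines that were deliberately placed between elements are lost.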

If I rewrite it with toxml(), the new element gets no indentation or line breaks at all:

<Main>

    <Stuff author="Jojo" name="Thing 1">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description" />
        <Attr name="version" value="1.0.0" />
        <Attr name="software" value="Misrocoft Ociffe" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

    <Stuff author="Toto" name="Thing 2">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>

<Stuff author="Titi" name="New thing"><Attr name="annotation" value="Short description" /><Attr name="description" value="Long description"/><Attr name="version" value="4.3.9" /><Attr name="software" value="Tophoshop" /><Attr name="language" value="Python" /><Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" /><Attr name="command" value="doSomething()" /></Stuff></Main>

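Is there a way to output a new pretty XML that does not modify the file's existing formatting? I was thinking of collapsing the XML into a single-line string and then rewriting it with toprettyxml(), but I don't know how to do that or whether it is even possible (I use etree and minidom to get the information).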
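
One way to leave the existing formatting alone might be to indent only the newly added element and copy the surrounding whitespace by hand. The sketch below assumes Python 3.9 or later (for ET.indent), a parent that already has at least one child, and a blank line between top-level elements as in the file above. The names append_pretty and SAMPLE are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for the file in the question.
SAMPLE = """<Main>

    <Stuff author="Jojo" name="Thing 1">
        <Attr name="version" value="1.0.0" />
    </Stuff>

</Main>"""


def append_pretty(parent, child, space="    ", level=1):
    """Append child to parent, indenting only the new subtree.

    Assumes parent already has at least one child element.
    """
    closing = parent[-1].tail                 # whitespace before </parent>
    ET.indent(child, space=space, level=level)  # touches only the new subtree
    parent[-1].tail = "\n\n" + space * level  # blank line, then indent for child
    child.tail = closing                      # keep the original closing layout
    parent.append(child)


root = ET.fromstring(SAMPLE)
new_stuff = ET.Element("Stuff", {"author": "Titi", "name": "New thing"})
ET.SubElement(new_stuff, "Attr", {"name": "version", "value": "4.3.9"})
append_pretty(root, new_stuff)

result = ET.tostring(root, encoding="unicode")
print(result)
```

ElementTree keeps the whitespace it parsed, so the old elements come back out as they went in. It does still normalize some details on output, and its default parser discards comments, so this is not a byte-for-byte guarantee.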

Update (answer):

Here is the code I ended up with. Note that my rootXml comes from ElementTree.

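from xml.dom import minidom
import xml.etree.ElementTree as ET

def writeXml(rootXml, xmlFile):
    # Serialize the ElementTree root to text (encoding="unicode" returns str on Python 3).
    roughString = ET.tostring(rootXml, encoding="unicode")
    # Collapse to a single line so no whitespace-only nodes remain between elements.
    oneLineString = ''.join(s.strip() for s in roughString.splitlines())

    # Re-parse with minidom and let toprettyxml() rebuild the indentation from scratch.
    minidomXml = minidom.parseString(oneLineString)
    rootMinidom = minidomXml.firstChild

    prettyXmlString = rootMinidom.toprettyxml()
    # Round-trip through ElementTree so the file is written in ElementTree's style.
    prettyXml = ET.fromstring(prettyXmlString)

    with open(xmlFile, "w") as f:
        f.write(ET.tostring(prettyXml, encoding="unicode"))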

This produces the following XML:

<Main>
    <Stuff author="Jojo" name="Thing 1">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description" />
        <Attr name="version" value="1.0.0" />
        <Attr name="software" value="Misrocoft Ociffe" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>
    <Stuff author="Toto" name="Thing 2">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>
    <Stuff author="Titi" name="New thing">
        <Attr name="annotation" value="Short description" />
        <Attr name="description" value="Long description"/>
        <Attr name="version" value="4.3.9" />
        <Attr name="software" value="Tophoshop" />
        <Attr name="language" value="Python" />
        <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" />
        <Attr name="command" value="doSomething()" />
    </Stuff>
</Main>
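
On Python 3.9 and later, a similar result may be possible without the minidom round trip, since ElementTree gained ET.indent(). This is a different technique from the code above, sketched here on a shortened, hypothetical SAMPLE document:

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for the file in the question.
SAMPLE = """<Main>

    <Stuff author="Jojo" name="Thing 1">
        <Attr name="version" value="1.0.0" />
    </Stuff>

</Main>"""

root = ET.fromstring(SAMPLE)

# Add a new element; at this point it has no indentation of its own.
new_stuff = ET.SubElement(root, "Stuff", {"author": "Titi", "name": "New thing"})
ET.SubElement(new_stuff, "Attr", {"name": "version", "value": "4.3.9"})

# Re-indent the whole tree in place (Python 3.9+). Whitespace-only text
# between elements is overwritten, so old and new elements end up uniform.
ET.indent(root, space="    ")

result = ET.tostring(root, encoding="unicode")
print(result)
```

Like the minidom version, this re-indents everything, so the blank lines between the original Stuff elements are removed.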

1 Answer:

Answer 0 (score: 1)

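As far as I know, there is no clean way to fix minidom's toprettyxml() *. One of the simplest alternatives is probably BeautifulSoup's prettify(). For example, prettify() correctly splits your single-line Stuff element onto new, indented lines: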

>>> from bs4 import BeautifulSoup
>>> raw = '''<Stuff author="Titi" name="New thing"><Attr name="annotation" value="Short description" /><Attr name="description" value="Long description"/><Attr name="version" value="4.3.9" /><Attr name="software" value="Tophoshop" /><Attr name="language" value="Python" /><Attr name="path" value="/here/there/aroundHere/somewhere/file.ext" /><Attr name="command" value="doSomething()" /></Stuff>'''
>>> soup = BeautifulSoup(raw, "xml")
>>> print(soup.prettify())
<?xml version="1.0" encoding="utf-8"?>
<Stuff author="Titi" name="New thing">
 <Attr name="annotation" value="Short description"/>
 <Attr name="description" value="Long description"/>
 <Attr name="version" value="4.3.9"/>
 <Attr name="software" value="Tophoshop"/>
 <Attr name="language" value="Python"/>
 <Attr name="path" value="/here/there/aroundHere/somewhere/file.ext"/>
 <Attr name="command" value="doSomething()"/>
</Stuff>

*) References: