Getting an XML line with xml.etree.ElementTree

Time: 2019-04-30 09:24:39

Tags: python xml

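I am looking for a function that takes an integer (a line number) as its argument and returns the XML line at that position.

I have a large XML file that I want to split into many smaller files. Each output file should have a start tag and an end tag.

For example

Input file: Test.xml

Output files: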

Test1.xml Test2.xml Test3.xml Test4.xml

import xml.etree.ElementTree as etree

tree = etree.parse(file_name)
root = tree.getroot()

# Here I count the number of XML lines in my file
xml_lines = 0

for child in root:
    xml_lines += 1

# Here I want to get the string of my XML line by giving its number
for i in range(counter, counter + number_of_each_file):
    d.write(FUNCTION)  # FUNCTION is the missing piece I am asking about

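1 answer:

Answer 0: (score: 0)

I think you should change your approach to splitting the large XML file into smaller ones. XML does not care about lines; it cares about elements. Your function should take the root of the large XML, a dest_file_name_prefix, and a number giving how many elements each small XML file should contain.

Something like: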

def split_xml(root, dest_file_name_prefix, num_of_elements):
    """Loop over the elements under root and save each group of 'num_of_elements' to a file with a unique name."""
    # './/element' matches every <element> at any depth below root
    elements = root.findall('.//element')
    counter = 0
    temp = []
    for element in elements:
        temp.append(element)
        if len(temp) == num_of_elements:
            # save the elements to a 'small' file
            counter += 1
            file_name = '{}_{}'.format(dest_file_name_prefix, counter)
            # TODO I assume you know how to save the elements from temp to a file
            temp = []
    if temp:
        # the last group may hold fewer than num_of_elements elements
        counter += 1
        file_name = '{}_{}'.format(dest_file_name_prefix, counter)
        # TODO save the remaining elements the same way
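
The saving step left as a TODO could be filled in roughly as follows. This is only a sketch under a few assumptions: the function name `split_xml_to_files` and the `tag` parameter are made up for illustration, only direct children of the root are split (not nested descendants), and each small file reuses the tag and attributes of the original root so that it has its own start and end tag.

```python
import xml.etree.ElementTree as ET


def split_xml_to_files(root, dest_file_name_prefix, num_of_elements, tag="element"):
    """Write the matching children of root to numbered files, num_of_elements per file.

    Hypothetical helper: returns the list of file names that were written.
    """
    if num_of_elements < 1:
        raise ValueError("num_of_elements must be at least 1")
    elements = root.findall(tag)  # direct children only (assumption)
    written = []
    for start in range(0, len(elements), num_of_elements):
        # Every small file gets its own root with the same tag and attributes,
        # so the output is a well-formed document with a start and an end tag.
        small_root = ET.Element(root.tag, root.attrib)
        small_root.extend(elements[start:start + num_of_elements])
        file_name = "{}{}.xml".format(dest_file_name_prefix, len(written) + 1)
        ET.ElementTree(small_root).write(
            file_name, encoding="utf-8", xml_declaration=True
        )
        written.append(file_name)
    return written
```

With the file from the question, a call such as `split_xml_to_files(ET.parse("Test.xml").getroot(), "Test", 2500)` would write Test1.xml, Test2.xml and so on; the last file simply holds whatever elements are left over.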

Example of the large XML

<root>
   <element id="0"></element>
   <element id="1"></element>
   <element id="2"></element>
   ...
   <element id="10000"></element>
</root>
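
To make the "elements, not lines" point concrete, here is a small sketch using a shortened version of the sample above. The helper name `element_ids` and the two sample strings are illustrative only. It shows that the same document parses to the same elements whether it is written on several lines or on one, and that the "n-th line" the question asks for can be obtained by indexing the root and serializing that element with `ET.tostring`.

```python
import xml.etree.ElementTree as ET

# The same document, once spread over several lines and once on a single line.
multi_line = """<root>
   <element id="0"></element>
   <element id="1"></element>
   <element id="2"></element>
</root>"""
single_line = (
    '<root><element id="0"></element><element id="1"></element>'
    '<element id="2"></element></root>'
)


def element_ids(xml_text):
    """Return the id attribute of every direct child of the root."""
    return [child.get("id") for child in ET.fromstring(xml_text)]


# Both layouts yield the same three elements, regardless of line breaks.
same_elements = element_ids(multi_line) == element_ids(single_line)

# Look up the "n-th element" by index instead of the "n-th line",
# then turn it back into text (this is what FUNCTION in the question could do).
second = ET.fromstring(single_line)[1]
second_as_text = ET.tostring(second, encoding="unicode")
```

In the question's loop, `d.write(ET.tostring(root[i], encoding="unicode"))` would play the role of FUNCTION, assuming `d` is an open text file.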