Convert a dataframe to XML in Python without iterating?
Input dataframe:
A    B    C    D
aa   ab   ac   ad
aaa  abb  acc  add
Desired XML output:
<A>aa</A>
<B>ab</B>
<C>ac</C>
<D>ad</D>
<A>aaa</A>
<B>abb</B>
<C>acc</C>
<D>add</D>
Answer 0 (score: 2)
Given the dataframe x:
>>> import pandas as pd
>>> x = pd.DataFrame([['aa','ab','ac','ad'],['aaa','abb','acc','add']],columns=['A','B','C','D'])
>>> x
     A    B    C    D
0   aa   ab   ac   ad
1  aaa  abb  acc  add
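You can use the following function. Note, however, that there is no guarantee the pandas and numpy functions used here do not loop internally.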
>>> import numpy as np
>>> def to_xml(df):
...     # Repeat the column names once per row so they line up with the flattened values
...     cols = df.columns.tolist() * len(df.index)
...     # Convert df to a NumPy array and flatten it, row by row, into one vector
...     df_numpy = np.array(df).reshape(-1)
...     # Stack values (row 0) over tags (row 1), then format each column as <tag>value</tag>
...     listlike = pd.DataFrame([df_numpy, cols]).apply(lambda x: '<{0}>{1}</{0}>'.format(x[1], x[0])).tolist()
...     # Join the formatted elements with newline characters
...     return '\n'.join(listlike)
Output:
>>> print(to_xml(x))
<A>aa</A>
<B>ab</B>
<C>ac</C>
<D>ad</D>
<A>aaa</A>
<B>abb</B>
<C>acc</C>
<D>add</D>
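As a possible variant, the same idea can be written with element-wise string concatenation instead of `apply` with a lambda. This is only a sketch under a few assumptions: `to_xml_vectorized` is a hypothetical name that is not part of the answer above, every value is converted with `astype(str)`, and the column names are assumed to be valid XML tag names. As with the function above, nothing guarantees that pandas and numpy avoid loops internally, and `'\n'.join` itself walks over the elements.

```python
import numpy as np
import pandas as pd


def to_xml_vectorized(df):
    """Hypothetical helper: format every cell as <column>value</column>."""
    # Flatten the values row by row into a single 1-D Series of strings.
    values = pd.Series(df.astype(str).to_numpy().ravel())
    # Repeat the column names once per row so they line up with the values.
    tags = pd.Series(np.tile(df.columns.astype(str).to_numpy(), len(df)))
    # Element-wise string concatenation, no explicit Python-level loop here.
    lines = "<" + tags + ">" + values + "</" + tags + ">"
    # Join the formatted elements with newline characters.
    return "\n".join(lines)


x = pd.DataFrame(
    [["aa", "ab", "ac", "ad"], ["aaa", "abb", "acc", "add"]],
    columns=["A", "B", "C", "D"],
)
result = to_xml_vectorized(x)
print(result)
```

Neither version escapes special characters, so values containing `<`, `>` or `&` would need to be escaped first to produce well-formed XML.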