Converting XML to Dict / JSON

Date: 2018-07-08 18:43:42

Tags: python json xml list dictionary

I need to convert an XML file to JSON.

A sample of the XML looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.55.3 9da5e7ae">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2018-06-17T15:31:02Z"/>
  <node id="330268305" lat="52.5475000" lon="13.3850775">
    <tag k="direction" v="240-60"/>
    <tag k="tourism" v="viewpoint"/>
    <tag k="wheelchair" v="no"/>
  </node>
  <node id="330269757" lat="52.5473115" lon="13.3843131">
    <tag k="direction" v="240-60"/>
    <tag k="tourism" v="viewpoint"/>
    <tag k="wheelchair" v="limited"/>
  </node>
  <way id="281307598">
    <center lat="52.4934004" lon="13.4843019"/>
    <nd ref="2852755795"/>
    <nd ref="3772363803"/>
    <nd ref="3772363802"/>
    <nd ref="2852755796"/>
    <nd ref="2852755797"/>
    <nd ref="2852755798"/>
    <nd ref="2852755795"/>
    <tag k="man_made" v="tower"/>
    <tag k="tourism" v="viewpoint"/>
    <tag k="tower:type" v="observation"/>
    <tag k="wheelchair" v="yes"/>
  </way>
</osm>

The code I have so far:

import xml.etree.ElementTree as ET
import json

input_file = r"D:\berlin\trial_xml\berlin_viewpoint_locations.xml"

tree = ET.parse(input_file)
root = tree.getroot()

result_list = [{k: (item.get(k) if k != 'extra' else
                    {i.get('k'): i.get('v') for i in item.iter('tag')})
                for k in ('id', 'lat', 'lon', 'extra')}
               for item in tree.findall("./node") + tree.findall('./way')]

print(result_list)

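With the help of some Stack Overflow experts I have reached a half-finished result. However, I still need to understand how to:

  1. Append the coordinates, which are hidden in <center lat="52.4934004" lon="13.4843019"/>, to result_list for each way 'id', the same way it already works for nodes.
  2. Append all the references, such as <nd ref="2852755795"/> <nd ref="3772363803"/>, in the same way as extra, e.g. as a nested list.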

1 Answer:

Answer 0 (score: 1)

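The current code does not work for you because the two element types are structured differently: a node carries lat and lon as its own attributes, while a way keeps them on a child center element and also holds a list of nd references. I suggest using a separate parser for each of the node and way types. You are already parsing the node type, so to parse way you can write a fairly simple loop like this: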

way_list = []
for item in tree.findall("./way"):
    # get the center node
    center = item.find('center')

    # get the refs for the nd nodes
    nds = [nd.get('ref') for nd in item.iter('nd')]

    # construct a dict and append to result list
    way_list.append(dict(
        id=item.get('id'),
        lat=center.get('lat'),
        lon=center.get('lon'),
        nds=nds,
        extra={i.get('k'): i.get('v') for i in item.iter('tag')},
    ))
print(way_list)

Result:

[{
    'id': '281307598', 
    'lat': '52.4934004', 
    'lon': '13.4843019', 
    'nds': ['2852755795', '3772363803', '3772363802', '2852755796',
            '2852755797', '2852755798', '2852755795'],
    'extra': {    
        'man_made': 'tower', 
        'tourism': 'viewpoint', 
        'tower:type': 'observation', 
        'wheelchair': 'yes'    
    }
}]
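
To get from these lists to the JSON asked for in the question, the node and way parsers can be combined and the result passed to json.dumps. The following is only a minimal sketch under a few assumptions: the helper names (parse_tags, parse_node, parse_way, osm_to_records) are made up for illustration, the XML is a trimmed inline copy of the sample above rather than the real file, and a way without a center element is assumed to be possible, so its lat and lon fall back to None instead of raising an AttributeError.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed inline copy of the sample from the question (assumption: the real
# file has the same shape; use ET.parse(input_file).getroot() for a file).
SAMPLE_XML = """<osm version="0.6">
  <node id="330268305" lat="52.5475000" lon="13.3850775">
    <tag k="tourism" v="viewpoint"/>
    <tag k="wheelchair" v="no"/>
  </node>
  <way id="281307598">
    <center lat="52.4934004" lon="13.4843019"/>
    <nd ref="2852755795"/>
    <nd ref="3772363803"/>
    <tag k="man_made" v="tower"/>
    <tag k="tourism" v="viewpoint"/>
  </way>
</osm>"""


def parse_tags(element):
    """Collect the <tag k=... v=.../> children into a plain dict."""
    return {tag.get('k'): tag.get('v') for tag in element.iter('tag')}


def parse_node(node):
    """A node stores its coordinates as its own attributes."""
    return {
        'id': node.get('id'),
        'lat': node.get('lat'),
        'lon': node.get('lon'),
        'extra': parse_tags(node),
    }


def parse_way(way):
    """A way stores its coordinates on a <center> child and lists <nd> refs."""
    center = way.find('center')
    return {
        'id': way.get('id'),
        # Assumption: fall back to None if a way has no <center> child.
        'lat': center.get('lat') if center is not None else None,
        'lon': center.get('lon') if center is not None else None,
        'nds': [nd.get('ref') for nd in way.iter('nd')],
        'extra': parse_tags(way),
    }


def osm_to_records(root):
    """Parse all top-level nodes, then all top-level ways, into one list."""
    nodes = [parse_node(n) for n in root.findall('./node')]
    ways = [parse_way(w) for w in root.findall('./way')]
    return nodes + ways


root = ET.fromstring(SAMPLE_XML)
records = osm_to_records(root)
json_text = json.dumps(records, indent=2)
print(json_text)
```

All values stay strings here, exactly as ElementTree returns them. If numeric coordinates are needed in the JSON, lat and lon would have to be converted with float() before dumping.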