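Python: convert one JSON structure into a nested structure

Date: 2016-11-08 01:34:22

Tags: python json converter

How can I convert the following JSON format into the target format shown below? I have 50,000 entries. Basically, I want to take each unique country from the array and collect all entries for that country into a single array.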

Original JSON:

[
    {
        "unilist": [
                {
                    "country": "United States",
                    "name": "The College of New Jersey",
                    "web_page": "http://www.tcnj.edu"
                },
                {
                    "country": "United States",
                    "name": "Abilene Christian University",
                    "web_page": "http://www.acu.edu/"
                },
                {
                    "country": "United States",
                    "name": "Adelphi University",
                    "web_page": "http://www.adelphi.edu/"
                },
                {
                    "country": "China",
                    "name": "Harbin Medical University",
                    "web_page": "http://www.hrbmu.edu.cn/"
                },
                {
                    "country": "China",
                    "name": "Harbin Normal University",
                    "web_page": "http://www.hrbnu.edu.cn/"
                }
                ...
                ]
    }
]

Target format:

{
"unilist" : {
        "United States" : [
          {"name" : "The College of New Jersey", "web_page" : "http://www.tcnj.edu"},
          {"name" : "Abilene Christian University", "web_page" : "http://www.acu.edu/"},
          {"name" : "Adelphi University", "web_page" : "http://www.adelphi.edu/"}
        ],
        "China" : [
          {"name" : "Harbin Medical University", "web_page" : "http://www.hrbmu.edu.cn/"}
        ],
        ...
    }
}

Update

My attempt (in Python 2.7.11), based on the answer provided by downshift, does not work as expected; it raises a TypeError:

from collections import defaultdict
import json
from pprint import pprint

with open('old_list.json') as orig_json:    
    newlist = defaultdict(list)

for country in orig_json[0]['unilist']:
    newlist[country['country']].append({'name': country['name'], 'web_page': country['web_page']})

with open('new_list.json', 'w') as fp:
            json.dump(newlist,fp)


pprint.pprint(dict(newlist))

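1 Answer:

Answer 0 (score: 3)

This produces almost the same output as the target, except that it lacks the "unilist" key. But at least it groups the entries by country: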

import json
from collections import defaultdict

# Parse the whole file as JSON; the top level is a list holding one object.
with open('original.json', 'r') as original:
    oj = json.load(original)[0]

# Group the universities by country, dropping the redundant 'country' field.
newlist = defaultdict(list)

for uni in oj['unilist']:
    newlist[uni['country']].append({'name': uni['name'],
                                    'web_page': uni['web_page']})

# Write the grouped data to a new JSON file.
with open('new.json', 'w') as outfile:
    json.dump(newlist, outfile)

This saves newlist to the JSON file 'new.json'.

Output:

{'China': [{'name': 'Harbin Medical University',
            'web_page': 'http://www.hrbmu.edu.cn/'},
           {'name': 'Harbin Normal University',
            'web_page': 'http://www.hrbnu.edu.cn/'}],
 'United States': [{'name': 'The College of New Jersey',
                    'web_page': 'http://www.tcnj.edu'},
                   {'name': 'Abilene Christian University',
                    'web_page': 'http://www.acu.edu/'},
                   {'name': 'Adelphi University',
                    'web_page': 'http://www.adelphi.edu/'}]}
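
As for the TypeError in the question: the traceback is not shown, so this is only a likely explanation. The attempt indexes the opened file object directly (orig_json[0]) without parsing it first. A minimal sketch of the difference, assuming Python 3 and using io.StringIO as a hypothetical stand-in for the real file:

```python
import io
import json

# Hypothetical stand-in for the opened file, holding one entry from the question's data.
raw = ('[{"unilist": [{"country": "China", '
       '"name": "Harbin Medical University", '
       '"web_page": "http://www.hrbmu.edu.cn/"}]}]')
fake_file = io.StringIO(raw)

# Indexing the file-like object itself fails: nothing has been parsed yet.
try:
    fake_file[0]
    indexing_failed = False
except TypeError:
    indexing_failed = True

# Parsing first yields ordinary lists and dicts, which can be indexed.
data = json.load(fake_file)
first_entry = data[0]["unilist"][0]
print(indexing_failed, first_entry["country"])
```

With a real file the same idea applies: call json.load on the file object inside the with block, then index the result.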

If I find a better way to get the exact target output, I will update this answer. In the meantime, I hope this helps.
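
One possible way to get the exact target shape is to wrap the grouped dictionary under a "unilist" key before writing it out. The sketch below is not part of the original answer: the helper name group_by_country and the inline sample data are assumptions for illustration, and it assumes every record has "country", "name" and "web_page" keys.

```python
import json
from collections import defaultdict


def group_by_country(records):
    """Group university records by country and wrap them under 'unilist'."""
    grouped = defaultdict(list)
    for uni in records:
        grouped[uni["country"]].append(
            {"name": uni["name"], "web_page": uni["web_page"]}
        )
    # Convert to a plain dict so the result is ordinary JSON-style data.
    return {"unilist": dict(grouped)}


# Sample input in the shape of the original JSON (a list holding one object).
original = [{"unilist": [
    {"country": "United States", "name": "The College of New Jersey",
     "web_page": "http://www.tcnj.edu"},
    {"country": "China", "name": "Harbin Medical University",
     "web_page": "http://www.hrbmu.edu.cn/"},
    {"country": "United States", "name": "Adelphi University",
     "web_page": "http://www.adelphi.edu/"},
]}]

result = group_by_country(original[0]["unilist"])
print(json.dumps(result, indent=2))
```

For the real data, the sample list would be replaced by the parsed contents of the input file, and json.dump(result, outfile) would write the wrapped structure.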