Creating a JSON file in a specific format from a CSV

Time: 2015-05-15 15:04:13

Tags: python json csv

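Apologies in advance if I'm going about creating the JSON file the wrong way; I'm just piecing together what I can. If you have a better approach, please suggest it. Here is my problem:

I'm trying to create a JSON file from a CSV with 3 columns, like this: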

000024F14CF24E42A5F36D7CB7A07C26,Name One,action-1
000024F14CF24E42A5F36D7CB7A07C26,Name One Variant,action-1
000042F8F69C4A048DDD4770DB7966C8,Name Two,action-2

The JSON format I need to end up with is:

{
    "topics": [
        {
            "id": "000024f14cf24e42a5f36d7cb7a07c26",
            "label": [
                "Name One",
                "Name One Variant"
            ],
            "meta": {
                "action": "action-1"
            }
        },
        {
            "id": "000042f8f69c4a048ddd4770db7966c8",
            "label": [
                "Name Two"
            ],
            "meta": {
                "action": "action-2"
            }
        }
    ]
}

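So basically I need to combine the names into a list, keeping every variant that shares the same ID, and keep only one action per ID, since the action is always the same for a given ID.

The script I've pasted below gets close, but I'm stuck. It outputs JSON like the following, where you can see that the actions have been added to the label array. How do I keep the action separate?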

{
    "topics": [
        {
            "id": "000024f14cf24e42a5f36d7cb7a07c26", 
            "label": [
                "Name One", 
                "action-1", 
                "Name One Variant", 
                "action-1"
            ]
        }
    ]
}

The script:

import csv
import json
from collections import defaultdict

def convert2json():
    # open the CSV file and loop through each row and append to the uniques list
    uniques = []
    with open('uploads/test.csv','rb') as data_file:
        reader = csv.reader(data_file)
        for row in reader:
            itemids = row[0]
            values = row[1]
            actions = row[2]
            uniques.append((itemids, values, actions))

    # using defaultdict create a list, then loop through uniques and append
    output = defaultdict(list)
    for itemid, value, action in uniques:
        output[itemid].append(value)
        output[itemid].append(action)


    # loop through the defaultdict list and append values to a dictionary
    # then add values with labels to the done list

    done = []
    for out in output.items():
        jsonout = {}
        ids = out[0]
        jsonout['id'] = ids.lower()
        vals = out[1]
        jsonout['label'] = vals
        done.append(jsonout)

    # create a dictionary and add the "done" list to it so it outputs
    # an object with a JSON array named 'topics'
    dones = {}
    dones['topics'] = done

    print json.dumps(dones, indent=4, encoding='latin1')                               

if __name__ == "__main__":
    convert2json()

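2 Answers:

Answer 0 (score: 3)

You're really close. I would build the final structure right away: the first time an itemid is seen, prepare its entry and remember it; every time after that, just append the value to its label list.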

import csv

summary = {}
with open('test.csv', 'rb') as data_file:
    reader = csv.reader(data_file)
    for itemid, value, action in reader:
        if itemid not in summary:
            summary[itemid] = dict(id=itemid, label=[value], meta={'action': action})
        else:
            summary[itemid]['label'].append(value)

data = {"topics": list(summary.values())}
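
To carry this through to the JSON output, here is a minimal Python 3 sketch of the same first-seen approach. It is not part of the original answer: the helper name build_topics and the in-memory sample standing in for test.csv are assumptions made for illustration, and the id is lowercased only because the question's script does so.

```python
import csv
import io
import json

# Sample rows from the question, held in memory in place of test.csv.
SAMPLE = """\
000024F14CF24E42A5F36D7CB7A07C26,Name One,action-1
000024F14CF24E42A5F36D7CB7A07C26,Name One Variant,action-1
000042F8F69C4A048DDD4770DB7966C8,Name Two,action-2
"""


def build_topics(lines):
    """Group (id, label, action) rows into one entry per id (hypothetical helper)."""
    summary = {}
    for itemid, value, action in csv.reader(lines):
        key = itemid.lower()
        if key not in summary:
            # First time this id is seen: create its entry, action included.
            summary[key] = {"id": key, "label": [value], "meta": {"action": action}}
        else:
            # Later rows for the same id only contribute another label.
            summary[key]["label"].append(value)
    return {"topics": list(summary.values())}


data = build_topics(io.StringIO(SAMPLE))
print(json.dumps(data, indent=4))
```

When reading a real file in Python 3, open it in text mode with newline='' instead of 'rb', as the csv module documentation recommends.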

Answer 1 (score: 2)

I changed a few things:

import csv
import json

def convert2json2():
    # open the CSV file and loop through each row and append to the uniques list
    # uniques = []

    topics = dict()

    # new_entry = dict(id)

    with open('uploads/test.csv','rb') as data_file:
        reader = csv.reader(data_file)

        #000024F14CF24E42A5F36D7CB7A07C26,Name One,action-1
        for row in reader:
            #can't use id as a name, that's a builtin function, but use all
            #your other final json attribute names.
            id_ = row[0].lower()
            #you might have had the columns wrong before
            label = row[1]
            action = row[2]
            # uniques.append((itemids, values, actions))


            #skip the unique, a dictionary is already unique
            #populate it with a dictionary made out of your final desired json 
            #field names.  action is always same so populated on first pass
            #ditto id_
            topic = topics.setdefault(id_, dict(
                                                id=id_, 
                                                label=[],
                                                meta=dict(action=action)
                                                ) 
            )


            #after the first insert above, you have an empty label list
            #add to it on each pass...
            topic["label"].append(label)


    # create a dictionary and add the "done" list to it so it outputs
    # an object with a JSON array named 'topics'
    dones = {}

    #nope...
    #dones['topics'] = topics
    dones['topics'] = topics.values()

    print json.dumps(dones, indent=4, encoding='latin1')                               

Output:

{
    "topics": [
        {
            "meta": {
                "action": "action-1"
            }, 
            "id": "000024f14cf24e42a5f36d7cb7a07c26", 
            "label": [
                "Name One", 
                "Name One Variant"
            ]
        }, 
        {
            "meta": {
                "action": "action-2"
            }, 
            "id": "000042f8f69c4a048ddd4770db7966c8", 
            "label": [
                "Name Two"
            ]
        }
    ]
}
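
Note that the keys in the output above come out as meta, id, label. Plain dicts in Python 2 have no defined order, so this is expected, and JSON consumers should not depend on key order. If the output should still match the layout in the question, one option is collections.OrderedDict. A minimal sketch, written in Python 3 syntax; the function name group_rows and the inline sample rows are hypothetical and used only for illustration:

```python
import json
from collections import OrderedDict

# Sample rows from the question, already split into columns.
rows = [
    ("000024F14CF24E42A5F36D7CB7A07C26", "Name One", "action-1"),
    ("000024F14CF24E42A5F36D7CB7A07C26", "Name One Variant", "action-1"),
    ("000042F8F69C4A048DDD4770DB7966C8", "Name Two", "action-2"),
]


def group_rows(rows):
    """Group rows by lowercased id, keeping first-seen order (hypothetical helper)."""
    topics = OrderedDict()
    for raw_id, label, action in rows:
        id_ = raw_id.lower()
        if id_ not in topics:
            # Keys are inserted in the order the question's target JSON uses.
            topics[id_] = OrderedDict(
                [("id", id_), ("label", []), ("meta", {"action": action})]
            )
        topics[id_]["label"].append(label)
    return {"topics": list(topics.values())}


result = group_rows(rows)
text = json.dumps(result, indent=4)
print(text)
```

On Python 3.7 and later, plain dicts preserve insertion order, so OrderedDict is only needed on older interpreters.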