Merging multiple rows of a CSV file into one using Python

Time: 2017-08-06 03:45:19

Tags: python pandas csv dataframe

I have the following dictionary:

    details = {
        "Primary_key": [
            {'key_1': 'val', 'key_3': 'val', 'key_5': ['val_1', 'val_2', 'val_3'], 'key_6': 'val'},
            {'key_2': 'val', 'key_3': 'val', 'key_5': ['val_1', 'val_2'], 'key_6': 'val'},
            {'key_1': 'val', 'key_2': 'val', 'key_3': 'val', 'key_4': 'val', 'key_5': ['val_1', 'val_2'], 'key_6': 'val'},
        ]
    }

I have the following code to convert it to a CSV file.

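    import pandas as pd

    for name, val in details.items():
        df = pd.DataFrame.from_dict(details[name])
        df.index = [name] * len(df)  # label every row with the primary key
        print(df.index)
        # Append mode: each iteration writes its own header row
        with open("my_file.csv", 'a') as f:
            df.to_csv(f)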

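The key_x entries are the headers, Primary_key is the name, and val is the text. I get the following output (sample output): (image: Output of the code)

Is there a way to get the CSV file in the following format? (image: desired format)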

2 answers:

Answer 0: (score: 3)

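IIUC, you can create a list of DataFrames and then concatenate them vertically with pd.concat. Pandas aligns the data on the column labels internally, so the columns will match up the way you want.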
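The alignment behaviour described here can be seen with two small frames whose columns only partly overlap. This is a minimal sketch: the frame contents and the names left, right and combined are made up for illustration and do not come from the question.

```python
import pandas as pd

# Two hypothetical one-row frames with partly overlapping columns.
left = pd.DataFrame([{"key_1": "a", "key_3": "b"}], index=["Primary_key"])
right = pd.DataFrame([{"key_2": "c", "key_3": "d"}], index=["Primary_key"])

# Vertical concatenation (axis=0 is the default). Columns are matched by
# label; a cell with no value in its source frame becomes NaN.
combined = pd.concat([left, right], sort=False)

print(combined)
```

The shared column key_3 ends up as a single column, while key_1 and key_2 each contain one missing value.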

Answer 1: (score: 3)

This is a good use case for pd.concat (vertical concatenation):

    import pandas as pd

    df_list = []
    for name, val in details.items():
        df = pd.DataFrame.from_dict(details[name])
        df.index = [name] * len(df)
        df_list.append(df)

    pd.concat(df_list).fillna('').to_csv('my_file.csv')

This also uses df.fillna('') to replace NaN with empty strings, so the result looks cleaner.
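To see what fillna('') changes, the sketch below applies the same steps to a shortened, hypothetical version of the dictionary and writes the CSV to an in-memory buffer instead of my_file.csv. The names sample, frames, merged and csv_text are illustrative assumptions, not part of the original answer.

```python
import io

import pandas as pd

# Shortened, made-up stand-in for the `details` dictionary.
sample = {
    "Primary_key": [
        {"key_1": "val", "key_3": "val"},
        {"key_2": "val", "key_3": "val"},
    ]
}

frames = []
for name, rows in sample.items():
    df = pd.DataFrame(rows)
    df.index = [name] * len(df)  # repeat the primary key on every row
    frames.append(df)

# Missing cells become NaN during construction; fillna('') blanks them out.
merged = pd.concat(frames).fillna("")

# Write to an in-memory buffer so the example does not touch the disk.
buffer = io.StringIO()
merged.to_csv(buffer)
csv_text = buffer.getvalue()

print(merged)
print(csv_text)
```

Note that to_csv already writes missing values as empty fields by default (na_rep=''), so the effect of fillna('') is mostly visible when the frame is printed or inspected rather than in the file itself.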