Question

I am trying to reformat my nested dictionary before writing it out to CSV. My nested dictionary:

review = {'Q1': {'Question': 'question wording','Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'} ,'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
      'Q2': {'Question': 'question wording','Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'} ,'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}}}

So far I have tried:

import pandas as pd

my_df = pd.DataFrame(review)
my_df = my_df.unstack()

and got:

Q1  Answer      {'Part 1': 'Answer part one', 'Part 2': 'Answe...
    Proof       {'Part 1': 'The proof part one', 'Part 2': 'Th...
    Question                                     question wording
Q2  Answer      {'Part 1': 'Answer part one', 'Part 2': 'Answe...
    Proof       {'Part 1': 'The proof part one', 'Part 2': 'Th...
    Question                                     question wording

But that is not how I want it to end up looking, so I need to melt/unstack/pivot/expand/other_manipulation_word the nested dictionaries within the DataFrame.

I have used this as guidance, but could not apply it to my own case: Expand pandas dataframe column of dict into dataframe columns

Answer 1

Here is one possible solution:

1) Create the initial DataFrame using orient='index', so that each question (Q1, Q2) becomes a row

df = pd.DataFrame.from_dict(review, orient='index')
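To see what orient='index' changes, here is a minimal sketch that builds the frame both ways from the review dictionary in the question. The name wide is only an illustration and does not appear in the original post.

```python
import pandas as pd

review = {
    'Q1': {'Question': 'question wording',
           'Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'},
           'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
    'Q2': {'Question': 'question wording',
           'Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'},
           'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
}

# Default constructor: the outer keys (Q1, Q2) become columns.
wide = pd.DataFrame(review)

# orient='index': the outer keys become the row labels instead.
df = pd.DataFrame.from_dict(review, orient='index')

print(wide.columns.tolist())
print(df.index.tolist())

# The nested dictionaries are still stored as-is in each cell.
print(type(df.loc['Q1', 'Answer']))
```

At this point the Answer and Proof cells still hold whole dictionaries; the following steps are what expand them.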

2) Use Index.repeat, Series.str.len and DataFrame.loc to create the shape of the final DataFrame, repeating each row once per part in its Answer dictionary

df_new = df.loc[df.index.repeat(df.Answer.str.len())]
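The one-liner above combines three calls, so it may help to look at them separately. This is a small sketch on a cut-down frame with made-up values; the names counts and repeated are hypothetical and only used here for illustration.

```python
import pandas as pd

# Cut-down stand-in for the frame built in step 1 (values are placeholders).
df = pd.DataFrame(
    {
        'Question': ['question wording', 'question wording'],
        'Answer': [{'Part 1': 'a', 'Part 2': 'b'},
                   {'Part 1': 'c', 'Part 2': 'd'}],
    },
    index=['Q1', 'Q2'],
)

# Series.str.len also works on dict cells: it returns the number of keys.
counts = df['Answer'].str.len()

# Index.repeat duplicates each label as many times as its count.
repeated = df.index.repeat(counts)

# DataFrame.loc with the repeated labels returns one row per part.
df_new = df.loc[repeated]

print(counts.tolist())
print(repeated.tolist())
print(df_new.shape)
```

Each row is still a full copy of the original at this stage, so the Answer column of df_new continues to hold dictionaries until it is overwritten in the next step.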

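3) Fix the 'Answer' and 'Proof' columns by passing their dictionaries to the DataFrame constructor and taking the values from stack (this assumes each question's Answer and Proof have the same parts, so that stack yields exactly one value per repeated row)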

df_new['Answer'] = pd.DataFrame(df.Answer.tolist()).stack().values
df_new['Proof'] = pd.DataFrame(df.Proof.tolist()).stack().values
print(df_new)

            Question           Answer               Proof
Q1  question wording  Answer part one  The proof part one
Q1  question wording    Answer part 2    The proof part 2
Q2  question wording  Answer part one  The proof part one
Q2  question wording    Answer part 2    The proof part 2
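Since the end goal in the question is a CSV file, it may also be worth keeping the part labels, which the result above drops. The following is a sketch of a different technique (building the rows in plain Python rather than with repeat and stack), not the method from the answer. The column names ID and Part are assumptions chosen for illustration, and the CSV is written to an in-memory buffer rather than a real file.

```python
import io

import pandas as pd

review = {
    'Q1': {'Question': 'question wording',
           'Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'},
           'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
    'Q2': {'Question': 'question wording',
           'Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'},
           'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
}

# One output row per (question, part) pair.
# 'ID' and 'Part' are hypothetical column names, not from the original post.
rows = [
    {
        'ID': qid,
        'Question': entry['Question'],
        'Part': part,
        'Answer': entry['Answer'][part],
        # .get() leaves the cell empty if a proof is missing for this part.
        'Proof': entry['Proof'].get(part),
    }
    for qid, entry in review.items()
    for part in entry['Answer']
]
flat = pd.DataFrame(rows)

# Write to an in-memory buffer; pass a file path to to_csv for a real file.
buffer = io.StringIO()
flat.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Looking up each proof by its part key means a question with mismatched Answer and Proof parts produces an empty cell instead of a length error.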
