Pivoting a pandas df - turning column values into column names

Time: 2017-05-29 22:11:21

Tags: pandas pivot-table

I have a df:

    pd.DataFrame({
        'time_period': {0: pd.Timestamp('2017-04-01 00:00:00'),
                        1: pd.Timestamp('2017-04-01 00:00:00'),
                        2: pd.Timestamp('2017-03-01 00:00:00'),
                        3: pd.Timestamp('2017-03-01 00:00:00')},
        'cost1': {0: 142.62999999999994,
                  1: 131.97000000000003,
                  2: 142.62999999999994,
                  3: 131.97000000000003},
        'revenue1': {0: 56,
                     1: 113.14999999999998,
                     2: 177,
                     3: 99},
        'cost2': {0: 309.85000000000002,
                  1: 258.25,
                  2: 309.85000000000002,
                  3: 258.25},
        'revenue2': {0: 4.5,
                     1: 299.63,
                     2: 309.85,
                     3: 258.25},
        'City': {0: 'Boston',
                 1: 'New York',
                 2: 'Boston',
                 3: 'New York'}})

I want to restructure this df so that revenue and cost end up in separate frames. For revenue:

    pd.DataFrame({
        'City': {0: 'Boston', 1: 'New York'},
        'Apr-17 revenue1': {0: 56.0, 1: 113.15000000000001},
        'Apr-17 revenue2': {0: 4.5, 1: 299.63},
        'Mar-17 revenue1': {0: 177, 1: 99},
        'Mar-17 revenue2': {0: 309.85000000000002, 1: 258.25}})

And a similar df for cost.

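Basically, I want to turn the time_period column values into column names such as Apr-17 and Mar-17, each combined with the appropriate revenue/cost string, with the revenue1/revenue2 and cost1/cost2 values underneath.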

I have been playing with pd.pivot_table and had some success, but I can't get exactly what I want.

1 Answer:

Answer 0: (score: 2)

Use set_index and unstack:

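    import datetime as dt

    # Format each timestamp as e.g. 'Apr-2017' so it can serve as a column label.
    df['time_period'] = df['time_period'].apply(lambda x: dt.datetime.strftime(x, '%b-%Y'))
    # 'A' and 'B' are the key columns of the frame shown in the output below;
    # the question's frame has a single key column, 'City'.
    df = df.set_index(['A', 'B', 'time_period'])[['revenue1', 'revenue2']].unstack().reset_index()
    # Flatten the two-level columns; strip the trailing space left on 'A ' and 'B '.
    df.columns = df.columns.map(' '.join).str.strip()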


       A         B        revenue1 Apr-2017  revenue1 Mar-2017  revenue2 Apr-2017  revenue2 Mar-2017
    0  Boston    Orlando  56.00              177.0              4.50               309.85
    1  New York  Dallas   113.15             99.0               299.63             258.25
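
For reference, here is a minimal sketch of the same set_index/unstack idea applied to the frame from the question, which is keyed by City rather than A and B. The helper name `widen` is hypothetical and not part of pandas, the values are rounded to two decimals for readability, and the '%b-%y' format and "period first" column naming are assumptions chosen to match the layout asked for in the question. It also assumes each (City, time_period) pair occurs only once, since unstack raises on duplicate index entries.

```python
import pandas as pd

# The question's data, rounded to two decimals for readability.
df = pd.DataFrame({
    'time_period': pd.to_datetime(['2017-04-01', '2017-04-01',
                                   '2017-03-01', '2017-03-01']),
    'City': ['Boston', 'New York', 'Boston', 'New York'],
    'cost1': [142.63, 131.97, 142.63, 131.97],
    'revenue1': [56, 113.15, 177, 99],
    'cost2': [309.85, 258.25, 309.85, 258.25],
    'revenue2': [4.5, 299.63, 309.85, 258.25],
})


def widen(frame, value_cols):
    """Hypothetical helper: one row per City, one column per period/value pair."""
    # Turn each timestamp into a label such as 'Apr-17'.
    labelled = frame.assign(time_period=frame['time_period'].dt.strftime('%b-%y'))
    # Move the period labels from the row index into the columns.
    wide = labelled.set_index(['City', 'time_period'])[value_cols].unstack()
    # Columns are now (value, period) tuples; flatten to 'Apr-17 revenue1' style.
    wide.columns = [f'{period} {value}' for value, period in wide.columns]
    # Alphabetical column order (not chronological), then City back as a column.
    return wide.sort_index(axis=1).reset_index()


revenue = widen(df, ['revenue1', 'revenue2'])
cost = widen(df, ['cost1', 'cost2'])
print(revenue)
print(cost)
```

Calling the helper once per group of value columns yields the two separate frames the question describes. If a City/period pair can repeat, pd.pivot_table with an explicit aggfunc may be a better fit than unstack.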