Changing the structure of a table

Time: 2018-12-26 20:37:27

Tags: python pandas

Current structure of the table in Redshift:

year    month   scenario    sourcecountry   destinationcountry  ags_marketplaceid   corestore   category    pg  valuetype   main_country_name   region_rollup   segment_rollup  retail_x    ba_x    fn_x    total_x retail_y    ba_y    fn_y    total_y
2017    FEB A   1   INDIA   234 Store   None    A   Forecast    India   Asia-Pacific    H   935.826 0   0   935.826 21.423  0   0   21.423
2017    MAR A   3   OMAN    3   Core    None    Books   Forecast    ROW Africa / Middle East & Other    C   164.111 0   0   164.111 4   0   0   4

I tried to write Python code to do this.

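df.set_index('year')  # Setting the index on one column like this does not work: I need to index on more than 10 fields and split the channel into four parts (retail, ba, fn, total), and that is not all.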

df.columns=pd.MultiIndex.from_tuples([tuple(c.split('_',1)) for c in df.columns])
df.stack(0).reset_index(1)

year    month   scenario    sourcecountry   destinationcountry  ags_marketplaceid   corestore   category    pg  valuetype   main_country_name   region_rollup   segment_rollup  Channel x   y
2017    FEB A   1   INDIA   234 Store   None    A   Forecast    India   Asia-Pacific    H   Retail  935.826 21.423
2017    MAR A   3   OMAN    3   Core    None    Books   Forecast    ROW Asia-Pacific    H   BA  164.111 0
2017    MAR A   3   OMAN    3   Core    None    Books   Forecast    ROW Africa / Middle East & Other    H   FN  164.111 0
2017    MAR A   3   OMAN    3   Core    None    Books   Forecast    ROW Africa / Middle East & Other    H   Total   1264.048    21.423
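One possible way to get from the wide layout to a layout like the one above is sketched below. It builds on the MultiIndex attempt, but first moves every identifier column into the index so that only the `<channel>_<x|y>` columns are split and stacked. The sample frame uses only four of the identifier columns for brevity, the names `value_cols`, `id_cols`, `wide` and `long_df` are made up for this illustration, and it assumes each combination of identifier values appears only once.

```python
import pandas as pd

# Two illustrative rows with a reduced set of identifier columns
# (the real table has 13 of them; the approach is the same).
df = pd.DataFrame({
    "year": [2017, 2017],
    "month": ["FEB", "MAR"],
    "sourcecountry": [1, 3],
    "destinationcountry": ["INDIA", "OMAN"],
    "retail_x": [935.826, 164.111],
    "ba_x": [0.0, 0.0],
    "fn_x": [0.0, 0.0],
    "total_x": [935.826, 164.111],
    "retail_y": [21.423, 4.0],
    "ba_y": [0.0, 0.0],
    "fn_y": [0.0, 0.0],
    "total_y": [21.423, 4.0],
})

# Every column that does not end in _x or _y is treated as an identifier.
value_cols = [c for c in df.columns if c.endswith(("_x", "_y"))]
id_cols = [c for c in df.columns if c not in value_cols]

# Move all identifiers into the index so only the measure columns remain.
wide = df.set_index(id_cols)

# Split "retail_x" into ("retail", "x"): level 0 is the channel, level 1 is x/y.
wide.columns = pd.MultiIndex.from_tuples(
    [tuple(c.rsplit("_", 1)) for c in wide.columns],
    names=["Channel", None],
)

# Stack the channel level into the rows, then turn all index levels
# (identifiers plus Channel) back into ordinary columns.
long_df = wide.stack(level=0).reset_index()

# Optional: relabel the channels to match the desired output.
long_df["Channel"] = long_df["Channel"].map(
    {"retail": "Retail", "ba": "BA", "fn": "FN", "total": "Total"}
)

print(long_df)
```

Each input row becomes four rows, one per channel, with `x` and `y` as the two remaining value columns. Depending on the pandas version, the rows within each group may come out sorted by channel name rather than in the original column order.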

0 answers:

No answers yet