Pandas histogram by date, stacked by category

Time: 2016-01-20 17:47:36

Tags: python pandas histogram

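I am having trouble understanding grouping in pandas, and also with producing a histogram stacked by category.

Here is a working example of what I am trying to do. In practice I loop over many files, build a dictionary for each one, and append it to a list holding all the dictionaries. I then convert that list to a DataFrame and turn the date strings into datetime objects.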

import pandas as pd

# Stand in for dictionaries created by looping over some files
d1={'fruit':'banana','vege':'spinach','date':'August 1, 2014'}
d2={'fruit':'banana','vege':'carrots','date':'August 1, 2014'}
d3={'fruit':'banana','vege':'peas','date':'August 1, 2015'}
d4={'fruit':'orange','vege':'spinach','date':'August 1, 2014'}
d5={'fruit':'orange','vege':'carrots','date':'August 1, 2015'}
data=[d1,d2,d3,d4,d5]

# Create the dataframe, turn the date strings into datetime objects
df=pd.DataFrame(data)
df.date2=pd.to_datetime(df.date) 

# This attempt at plotting gets me a histogram by year, but not divided how it should be.
df.groupby(df.date2.dt.year).count().plot(kind="bar")

The resulting plot looks like this:

Histogram by year, but unsure why 3 bars for each with category labels
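
A likely explanation for the three bars, sketched on the same sample data: count() returns the number of non-null values in every remaining column (fruit, vege, date), so each year gets one bar per column rather than one bar segment per fruit. The variable names years and counts are only illustrative.

```python
import pandas as pd

# Same sample rows as in the question
data = [
    {'fruit': 'banana', 'vege': 'spinach', 'date': 'August 1, 2014'},
    {'fruit': 'banana', 'vege': 'carrots', 'date': 'August 1, 2014'},
    {'fruit': 'banana', 'vege': 'peas', 'date': 'August 1, 2015'},
    {'fruit': 'orange', 'vege': 'spinach', 'date': 'August 1, 2014'},
    {'fruit': 'orange', 'vege': 'carrots', 'date': 'August 1, 2015'},
]
df = pd.DataFrame(data)

# Year of each row, kept as a separate Series (illustrative name)
years = pd.to_datetime(df['date']).dt.year.rename('year')

# count() tallies non-null entries per column within each year,
# so the result has one column for each of fruit, vege and date
counts = df.groupby(years).count()
print(counts)
```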

What I would really like is something like this:

Histogram by year, stacked by the text within the category of "fruit"

I have tried various other things, such as

fr=df.groupby('fruit')

but fr.plot fails with

TypeError: Empty 'DataFrame': no numeric data to plot
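
The error presumably arises because every column in each group holds strings, so there is nothing numeric to draw. A minimal sketch of one way around it, assuming the goal is a count per fruit: reduce the groups to numbers first (here with size()) and plot that result. The name per_fruit is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['banana', 'banana', 'banana', 'orange', 'orange'],
    'vege': ['spinach', 'carrots', 'peas', 'spinach', 'carrots'],
})

# size() turns each group into a single integer (its row count),
# which gives the plotting code numeric data to work with
per_fruit = df.groupby('fruit').size()
print(per_fruit)

# With matplotlib installed, this Series can then be plotted:
# per_fruit.plot(kind='bar')
```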

Thanks in advance for your help!

2 Answers:

Answer 0 (score: 3)

How about:

df.groupby(df.date2.dt.year)['fruit']\
    .value_counts()\
    .unstack(1)\
    .plot(kind='bar', stacked=True)

This yields one bar per year, stacked by fruit.
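
To see why the chain works, it may help to look at the intermediate results. The sketch below rebuilds the sample data, stores date2 as a regular column (a small change from the question's attribute assignment), and leaves the plot call as a comment so it runs without matplotlib. The names counts and table are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['banana', 'banana', 'banana', 'orange', 'orange'],
    'date': ['August 1, 2014', 'August 1, 2014', 'August 1, 2015',
             'August 1, 2014', 'August 1, 2015'],
})
df['date2'] = pd.to_datetime(df['date'])

# Step 1: count each fruit within each year.
# The result is a Series indexed by (year, fruit).
counts = df.groupby(df['date2'].dt.year)['fruit'].value_counts()
print(counts)

# Step 2: move the fruit level (level 1) into the columns,
# giving one row per year and one column per fruit.
table = counts.unstack(1)
print(table)

# Step 3 (needs matplotlib): each column becomes one stacked segment.
# table.plot(kind='bar', stacked=True)
```

If some year had no rows for a given fruit, unstack would leave a NaN in that cell; passing fill_value=0 to unstack is one way to get a zero there instead.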

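Answer 1 (score: 0)

I suggest using date as the index (reduced here to the year, so it holds integers rather than a true DatetimeIndex). For pandas 0.17: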

df['date'] = pd.to_datetime(df.date).dt.year
df.set_index('date', inplace=True)
df.groupby(level='date').fruit.value_counts().unstack('fruit').plot.bar(stacked=True)

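
As an alternative that neither answer uses, pd.crosstab can build the same year-by-fruit table in one call. This is only a sketch on the question's sample data; the names years and table are illustrative, and the plot call is left as a comment so the snippet runs without matplotlib.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['banana', 'banana', 'banana', 'orange', 'orange'],
    'date': ['August 1, 2014', 'August 1, 2014', 'August 1, 2015',
             'August 1, 2014', 'August 1, 2015'],
})

# Year of each row, as a standalone Series
years = pd.to_datetime(df['date']).dt.year.rename('year')

# crosstab counts how often each (year, fruit) pair occurs;
# pairs that never occur are reported as 0
table = pd.crosstab(years, df['fruit'])
print(table)

# With matplotlib installed:
# table.plot(kind='bar', stacked=True)
```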