Plotting one categorical variable against another categorical variable in Python

Time: 2016-09-23 14:06:26

Tags: python matplotlib plot categorical-data

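What is the best way to plot one categorical variable against another categorical variable in Python? Suppose we have "Male" and "Female" on one side, and "Paid" and "Unpaid" on the other. How can I draw a meaningful, easy-to-read figure in Python that shows, for men and for women, whether they paid off their loans?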

1 answer:

Answer 0 (score: 0)

A stacked bar chart can be used for this, with one bar per gender and one colored segment per payment status.

Code for the stacked bar chart:

import pandas as pd
import matplotlib.pyplot as plt

raw_data = {'genders': ['Male', 'Female'],
        'Paid': [40, 60],
        'Unpaid': [60, 40]}

df = pd.DataFrame(raw_data, columns = ['genders', 'Paid', 'Unpaid'])


# Create the figure and the axes that will hold the bars

f, ax1 = plt.subplots(1, figsize=(12,8))

# Set the bar width
bar_width = 0.75

# x positions of the bar centers
bar_l = [i+1 for i in range(len(df['Paid']))]

# positions of the x-axis ticks (centers of the bars, used for the bar labels)
tick_pos = bar_l

# Create the bottom segment of each bar: the 'Paid' values
ax1.bar(bar_l,
        # using the 'Paid' data
        df['Paid'],
        # center each bar on its x position (the default since matplotlib 2.0)
        align='center',
        # set the width
        width=bar_width,
        # with the label 'Paid'
        label='Paid',
        # with alpha 0.5
        alpha=0.5,
        # with color
        color='#F4561D')

# Stack the 'Unpaid' values on top of the 'Paid' segment
ax1.bar(bar_l,
        # using the 'Unpaid' data
        df['Unpaid'],
        # center each bar on its x position
        align='center',
        # set the width
        width=bar_width,
        # start each segment where the 'Paid' segment ends
        bottom=df['Paid'],
        # with the label 'Unpaid'
        label='Unpaid',
        # with alpha 0.5
        alpha=0.5,
        # with color
        color='#F1911E')


# set the x ticks with names
plt.xticks(tick_pos, df['genders'])

# Set the label and legends
ax1.set_ylabel("Proportion")
ax1.set_xlabel("Genders")
plt.legend(loc='upper left')

# Set a buffer around the edge
plt.xlim([min(tick_pos)-bar_width, max(tick_pos)+bar_width])

# Display the figure
plt.show()
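The code above starts from totals that are already aggregated. If the data is instead one row per person, the counts can be derived first. The following is a minimal sketch of that approach, not part of the original answer: the `loans` DataFrame, its `gender` and `status` column names, its six rows, and the output file name are all made up for illustration. It uses `pd.crosstab` to count each combination and pandas' built-in stacked bar plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical raw data: one row per borrower (values invented for illustration)
loans = pd.DataFrame({
    "gender": ["Male", "Male", "Female", "Female", "Female", "Male"],
    "status": ["Paid", "Unpaid", "Paid", "Paid", "Unpaid", "Unpaid"],
})

# Count how often each (gender, status) combination occurs
counts = pd.crosstab(loans["gender"], loans["status"])

# Convert counts to row-wise percentages so every bar sums to 100
percentages = counts.div(counts.sum(axis=1), axis=0) * 100

# One bar per gender, one stacked segment per payment status
ax = percentages.plot(kind="bar", stacked=True, figsize=(8, 6), alpha=0.5, rot=0)
ax.set_xlabel("Gender")
ax.set_ylabel("Share of borrowers (%)")
ax.legend(title="Loan status", loc="upper left")

# Write the chart to a file (hypothetical file name)
plt.savefig("gender_vs_status.png")
```

Plotting percentages rather than raw counts keeps the two bars comparable when the groups differ in size; plot `counts` directly if absolute numbers matter more.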