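Answer 0 (score: 1)
I think this can be done with an ordinary box plot. Depending on how polished you want it to look, you can get close to what you are after. I usually use seaborn, but it is built on matplotlib, so not much should need to change if you prefer to stay in pure mpl.
I mocked up your dataset: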
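import pandas as pd
import seaborn as sns

# Mock data: two rows per group, each with a lower bound, average, and upper bound.
# Building from a dict keeps the numeric columns as floats (the transposed
# list-of-lists version leaves every column with object dtype).
df = pd.DataFrame({
    "Group": ["A", "A", "B", "B", "C", "C"],
    "Lower": [1, 1.1, 0.7, 1.2, 0.8, 1.1],
    "Average": [1.5, 1.4, 1.4, 1.7, 1.6, 1.4],
    "Upper": [1.8, 1.8, 2.1, 2.2, 2.4, 1.7],
})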
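Then I melted the columns back into rows and dropped the averages, which leaves only the lower and upper bounds. I had to add a new column to use as the hue so that the items within each group are drawn separately.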
df = df.melt(id_vars=["Group"],value_vars=["Lower","Average","Upper"])
df = df[df["variable"] != "Average"]
df["hue"] = df.groupby(["Group","variable"]).cumcount()
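To make the reshaping step concrete, here is a minimal sketch on the same mock data. The helper name `to_long_bounds` is made up for illustration and is not part of pandas or seaborn. It selects only Lower and Upper in `value_vars`, which should give the same rows as melting all three columns and then filtering out Average.

```python
import pandas as pd


def to_long_bounds(wide):
    """Melt Lower/Upper into rows and number the repeated rows in each group.

    Hypothetical helper for illustration only.
    """
    long = wide.melt(id_vars=["Group"], value_vars=["Lower", "Upper"])
    # cumcount numbers the 1st, 2nd, ... row of each (Group, variable) pair,
    # so the two intervals inside a group end up with hue 0 and hue 1.
    long["hue"] = long.groupby(["Group", "variable"]).cumcount()
    return long


wide = pd.DataFrame({
    "Group": ["A", "A", "B", "B", "C", "C"],
    "Lower": [1.0, 1.1, 0.7, 1.2, 0.8, 1.1],
    "Average": [1.5, 1.4, 1.4, 1.7, 1.6, 1.4],
    "Upper": [1.8, 1.8, 2.1, 2.2, 2.4, 1.7],
})
long = to_long_bounds(wide)
print(long.sort_values(["Group", "hue", "variable"]).to_string(index=False))
```

Each (Group, hue) pair now holds exactly two values, one Lower and one Upper, which is what lets a box plot treat every interval as its own two-point distribution.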
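If you draw a box-and-whisker plot with whis set to (0, 100), the whiskers run to the minimum and maximum of each pair, and the center line sits midway between them. Because each box has only two data points, that midpoint is the average of Lower and Upper, which coincides with the Average column only when the interval is symmetric.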
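g = sns.boxplot(x="Group",
                y="value",
                hue="hue",
                data=df,
                whis=(0, 100),   # whiskers span the full min-max range
                showmeans=True,  # meanline only takes effect when showmeans is on
                meanline=True,
                showbox=False,   # hide the box, keeping whiskers and center lines
                width=0.5)
g.get_legend().remove()  # the helper hue column does not need a legend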
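I couldn't find a way to force the rest of the vertical line to be drawn. The gap is where the box would be, but you can hide the box, especially since it carries no meaning here. Either way, the code above produces the final result.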
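If the gap in the vertical line matters, one alternative is to skip the box plot and draw the intervals with matplotlib's `errorbar`, which draws one continuous line from the lower to the upper bound. This is a different technique from the box plot above, offered only as a sketch. The `offsets` values and the assumption of exactly two rows per group are mine, chosen to mimic the side-by-side layout.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df_wide = pd.DataFrame({
    "Group": ["A", "A", "B", "B", "C", "C"],
    "Lower": [1.0, 1.1, 0.7, 1.2, 0.8, 1.1],
    "Average": [1.5, 1.4, 1.4, 1.7, 1.6, 1.4],
    "Upper": [1.8, 1.8, 2.1, 2.2, 2.4, 1.7],
})

groups = list(df_wide["Group"].unique())
# Assumed layout: two intervals per group, shifted left and right of the tick.
offsets = {0: -0.125, 1: 0.125}
rank = df_wide.groupby("Group").cumcount()
x = df_wide["Group"].map(groups.index) + rank.map(offsets)

# errorbar expects distances from the center point, not absolute bounds:
# row 0 is the distance down to Lower, row 1 the distance up to Upper.
yerr = np.vstack([df_wide["Average"] - df_wide["Lower"],
                  df_wide["Upper"] - df_wide["Average"]])

fig, ax = plt.subplots()
container = ax.errorbar(x, df_wide["Average"], yerr=yerr, fmt="o", capsize=6)
ax.set_xticks(range(len(groups)))
ax.set_xticklabels(groups)
# Call plt.show() or fig.savefig(...) to view the figure.
```

Unlike the two-point box plot, this marks the actual Average value, so asymmetric intervals are shown correctly.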
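Answer 1 (score: 0)
When the upper and lower bounds of the CI have already been computed, matplotlib is easier to work with, because it gives you more tools to manipulate the plot.
My approach is to use fill_between with an alpha of 0.1. It is a method that shades the area between two lines. Here is some code to show how it should work: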
from matplotlib import pyplot as plt
import seaborn as sns

# df is the asker's long-format DataFrame
orgs = list(df.first_org.unique())
# matplotlib needs numeric x values, so encode the text categories as positions
x = list(range(len(orgs)))
y = df[df['ci_value_type'] == 'mean']['event_occurrence_frequency'].values
ucb = df[df['ci_value_type'] == 'mean-high']['event_occurrence_frequency'].values
lcb = df[df['ci_value_type'] == 'mean-low']['event_occurrence_frequency'].values

# Optional: skip this if you don't want to install seaborn. Style calls only
# affect figures created afterwards; pass a style name such as 'darkgrid' to restyle.
sns.set_style()

fig, ax = plt.subplots()
fig.set_figheight(15)
fig.set_figwidth(17)
ax.plot(x, y, marker='o')

# The CI band: lcb and ucb already are the bounds, so shade directly between them
ax.fill_between(x, lcb, ucb, color='b', alpha=.1)

ax.set_title('Event occurrence with regards to organization mentioned')
ax.grid(True)
plt.xticks(x, orgs)  # show the text labels in place of the numeric positions
plt.xlabel('companies')
plt.ylabel('event frequency')

# Italic, 12-point axis labels
for label in (ax.xaxis.get_label(), ax.yaxis.get_label()):
    label.set_style('italic')
    label.set_size(12)

plt.show()
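Since the snippet above relies on the asker's DataFrame, here is a self-contained sketch of the same fill_between idea. Only the column names come from the answer. The organization names and all numbers are made up. It also swaps the three boolean filters for a pivot, which is my own choice rather than the answer's approach.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up long-format data; only the column names come from the answer above.
df_ci = pd.DataFrame({
    "first_org": ["org_a", "org_b", "org_c"] * 3,
    "ci_value_type": ["mean"] * 3 + ["mean-low"] * 3 + ["mean-high"] * 3,
    "event_occurrence_frequency": [0.40, 0.55, 0.30,
                                   0.30, 0.45, 0.22,
                                   0.50, 0.70, 0.41],
})

# One row per organization and one column per value type, so the mean and
# both bounds are aligned on the same organization by construction.
wide = df_ci.pivot(index="first_org", columns="ci_value_type",
                   values="event_occurrence_frequency")

x = range(len(wide))  # numeric positions standing in for the text labels
fig, ax = plt.subplots()
ax.plot(x, wide["mean"], marker="o")
# Shade the band between the lower and upper bounds.
band = ax.fill_between(x, wide["mean-low"], wide["mean-high"],
                       color="b", alpha=0.1)
ax.set_xticks(list(x))
ax.set_xticklabels(wide.index)
# Call plt.show() or fig.savefig(...) to view the figure.
```

Pivoting first avoids depending on the three filtered arrays happening to share the same row order, which the filter-based version silently assumes.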
Gist link: https://gist.github.com/theDestI/7bddf368e7e20829cf8f59bccadc073f#file-matplotlib_ci-py
I wrote a post on Medium about plotting confidence intervals; take a look if you need more detail:
https://destiq.medium.com/trickycases-2-ci-plot-with-seaborn-and-matplotlib-11cd74a3fc5d