Histogram

Time: 2019-01-07 08:38:42

Tags: python histogram

I want to plot a histogram from a Monte Carlo simulation, but I get the error: too many values to unpack.

from scipy.stats import lognorm, norm, poisson
from matplotlib  import pyplot as plt
import numpy as np
import pandas as pd

df = pd.read_excel('M3KMU.xlsx')
schaden = df["Schaden"]

# count the number of loss events in given year
fre = df.groupby("Jahr").size()
print(fre)

# estimate lambda parameter
lam = np.sum(fre.values) / 13
print(lam)


# draw random variables from a Poisson distribution with lambda=lam
prvs = poisson.rvs(lam, size=(10000))

# plot the pmf (loss frequency distribution)
h = plt.hist(prvs, bins=range(0, 11))
plt.close("all")
y = h[0]/np.sum(h[0])
x = h[1]

plt.figure(figsize=(10, 6))
plt.bar(x[:-1], y, width=0.7, align='center', color="#2c97f1")
plt.xlim([-1, 11])
plt.ylim([0, 0.25])
plt.ylabel("Probability", fontsize=12)
plt.title("Loss Frequency Distribution", fontsize=14)
plt.savefig("f01.png")

c = .7, .7, .7  # define grey color

plt.figure(figsize=(10, 6))
plt.hist(df["Schaden"], bins=25, color=c, density=True)  # density replaces the removed normed argument
plt.xlabel("Incurred Loss ($M)", fontsize=12)
plt.ylabel("Density", fontsize=12)
plt.title("Loss Severity Distribution", fontsize=14)

x = np.arange(0, 5, 0.01)
sig, loc, scale = lognorm.fit(df["Schaden"])
pdf = lognorm.pdf(x, sig, loc=loc, scale=scale)
plt.plot(x, pdf, 'r')
plt.savefig("f02.png")

print(sig, loc, scale)  # lognormal pdf's parameters

def loss(r, loc, sig, scale, lam):
    X = []
    for x in range(280):  # up to 280 loss events considered
        if r < poisson.cdf(x, lam):  # x denotes a loss number
            out = 0
        else:
            out = lognorm.rvs(s=sig, loc=loc, scale=scale)
        X.append(out)
    return np.sum(X)  # = L_1 + L_2 + ... + L_n



losses = []
for _ in range(100):
    r = np.random.random()
    losses.append(loss(r, loc, sig, scale, lam))


h = plt.hist(losses, bins=range(0, 16))
_ = plt.close("all")
y = h[0]/np.sum(h[0])
x = h[1]

plt.figure(figsize=(10, 6))
plt.bar(x[:-1], y, width=0.7, align='center', color="#ff5a19")
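plt.autoscale(enable=True, axis="x")  # let matplotlib pick the x-range; plt.xlim('auto') would try to unpack the string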
plt.ylim([0, 0.20])
plt.title("Modelled Loss Distribution", fontsize=14)
plt.xlabel("Loss ($M)", fontsize=12)
plt.ylabel("Probability of Loss", fontsize=12)
plt.savefig("f03.png")
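
The "too many values to unpack" message is consistent with passing the string 'auto' to plt.xlim: a single positional argument is treated as a (left, right) pair, so a four-character string cannot be unpacked into two values. The sketch below reproduces that behaviour in isolation and shows one possible alternative, letting matplotlib autoscale the x-axis. The helper names xlim_auto_error and bar_with_autoscaled_x and the sample bar heights are made up for illustration, and the Agg backend is only chosen so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
from matplotlib import pyplot as plt


def xlim_auto_error():
    """Call plt.xlim('auto') and return the error message, or None if no error."""
    plt.figure()
    try:
        plt.xlim("auto")  # the string is unpacked as if it were (left, right)
    except ValueError as err:
        return str(err)
    finally:
        plt.close("all")
    return None


def bar_with_autoscaled_x(positions, heights):
    """Draw a bar chart and let matplotlib choose the x-limits itself."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.bar(positions, heights, width=0.7, align="center")
    ax.autoscale(enable=True, axis="x")  # the default behaviour, made explicit
    limits = ax.get_xlim()
    plt.close(fig)
    return limits


print(xlim_auto_error())
print(bar_with_autoscaled_x([0, 1, 2, 3], [0.1, 0.4, 0.3, 0.2]))
```

Autoscaling is already the default, so simply leaving out the xlim call should behave the same way; an explicit pair such as plt.xlim([-1, 16]) is another option if a fixed range is wanted.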

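The parameters of the lognormal distribution come from a fit to the dataset. I don't know why the plot cannot be displayed. A possibly tangential question: is there a way to find the best-fitting distribution for a given dataset? (The dataset contains years and the loss events that occurred in those years. I want to estimate the loss for the coming year.)

Thank you very much for your help.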
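
On the best-fit question, one common approach is to fit several candidate distributions by maximum likelihood and compare them with an information criterion such as AIC. The sketch below is only an illustration under stated assumptions: the candidate list (lognorm, gamma, weibull_min), fixing loc at 0 because losses are assumed positive, the helper name rank_fits, and the synthetic sample that stands in for the "Schaden" column are all made up here, not taken from the dataset.

```python
import numpy as np
from scipy import stats


def rank_fits(data, candidates=("lognorm", "gamma", "weibull_min")):
    """Fit each candidate by maximum likelihood and rank by AIC (lower is better)."""
    data = np.asarray(data, dtype=float)
    results = []
    for name in candidates:
        dist = getattr(stats, name)
        params = dist.fit(data, floc=0)  # assumption: losses are positive, so loc is fixed at 0
        loglik = np.sum(dist.logpdf(data, *params))
        k = len(params) - 1  # loc was fixed, so it is not counted as a free parameter
        aic = 2 * k - 2 * loglik
        results.append((name, aic, params))
    return sorted(results, key=lambda item: item[1])


# Synthetic stand-in for the loss data (hypothetical values, not the real column)
rng = np.random.default_rng(0)
sample = stats.lognorm.rvs(s=0.8, scale=1.5, size=500, random_state=rng)

ranking = rank_fits(sample)
for name, aic, params in ranking:
    print(f"{name:12s} AIC = {aic:.1f}")
```

With the real data, rank_fits(df["Schaden"]) would be the corresponding call. AIC only ranks the candidates against each other, so a histogram overlay or a Q-Q plot of the top candidate is still worth checking.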

0 answers:

No answers