Question

I created a truncated exponential distribution:

from scipy.stats import truncexpon
truncexp = truncexpon(b = 8)

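Now I want to sample 8 points from this distribution so that their mean is approximately 4. What is the best way to do this without a huge loop that samples at random until the mean is close enough?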

Answer 1

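The mean is a property of your distribution. As you draw more samples, the empirical mean gets closer and closer to the analytic mean.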

SciPy can tell you the mean of the truncated exponential:

b = 8
truncexp = truncexpon(b)
truncexp.mean() # 0.99731539839326999

You can sample from the distribution and compute the empirical mean:

import numpy as np
num_samples = 100000
np.mean(truncexp.rvs(num_samples)) # 0.99465816346645264

A closed-form expression for the mean is given by the second line below:

b = np.linspace(0.1, 20, 100)
m = (1 - (b + 1)*np.exp(-b)) / (1 - np.exp(-b))  # mean for loc=0, scale=1

If you plot this, you can see how the mean behaves for different values of b.

As b -> inf, the mean approaches 1 (with the default scale of 1), so you will not find a b for which the mean is 4.
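As a quick check of the claim above, the following sketch compares the closed-form mean with the value SciPy reports for a few shapes, and shows that it stays below 1. The helper name truncexpon_mean and the chosen b values are introduced here for illustration only; they are not part of SciPy or of the original answer.

```python
import numpy as np
from scipy.stats import truncexpon


def truncexpon_mean(b):
    """Closed-form mean of truncexpon with shape b (loc=0, scale=1)."""
    b = np.asarray(b, dtype=float)
    return (1 - (b + 1) * np.exp(-b)) / (1 - np.exp(-b))


# A handful of shapes, from strongly truncated to almost untruncated.
b_values = np.array([0.5, 1.0, 2.0, 8.0, 20.0])
closed_form = truncexpon_mean(b_values)
scipy_means = np.array([truncexpon(b).mean() for b in b_values])

for b, m_formula, m_scipy in zip(b_values, closed_form, scipy_means):
    print("b = %5.1f  formula = %.6f  scipy = %.6f" % (b, m_formula, m_scipy))
```

The printed means grow with b but never reach 1, which is why a mean of 4 is out of reach without changing the scale.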

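If you want to sample from a truncated exponential with mean 4, you can simply rescale the samples. This does not give you samples from the original distribution, but then again, samples from the original distribution will never give you a mean of 4.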

truncexp.rvs(num_samples) * 4 / truncexp.mean()
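One way to read this rescaling: multiplying the samples by a constant c is equivalent to sampling from truncexpon with scale=c, so the support stretches from [0, b] to [0, b*c]. The sketch below illustrates that interpretation; the names c, scaled and target_mean are introduced here, and the seed is arbitrary.

```python
import numpy as np
from scipy.stats import truncexpon

b = 8
target_mean = 4.0

# Scale factor that maps the unit-scale mean onto the target mean.
c = target_mean / truncexpon(b).mean()

# Multiplying samples by c corresponds to the scale parameter of truncexpon.
scaled = truncexpon(b, scale=c)

rng = np.random.default_rng(0)
sample = scaled.rvs(size=100000, random_state=rng)

print("scale factor:", c)
print("analytic mean:", scaled.mean())
print("upper end of support:", b * c)
print("largest sampled value:", sample.max())
```

Note that the upper end of the support is now b*c rather than 8, so this only helps if the support is allowed to change.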

Answer 2

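The truncexpon distribution has three parameters: a shape b, a location loc, and a scale scale. The support of the distribution is [x1, x2], where x1 = loc and x2 = shape*scale + loc. Solving the latter equation for shape gives shape = (x2 - x1)/scale. We will choose the scale parameter so that the mean of the distribution is 4. To do that, we can apply scipy.optimize.fsolve to a function of scale that is zero when truncexpon.mean((x2 - x1)/scale, loc, scale) is 4.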

Here is a short script to demonstrate:

import numpy as np
from scipy.optimize import fsolve
from scipy.stats import truncexpon


def func(scale, desired_mean, x1, x2):
    return truncexpon.mean((x2 - x1)/scale, loc=x1, scale=scale) - desired_mean


x1 = 1
x2 = 9

desired_mean = 4.0

# Numerically solve for the scale parameter of the truncexpon distribution
# with support [x1, x2] for which the expected mean is desired_mean.
scale_guess = 2.0
scale = fsolve(func, scale_guess, args=(desired_mean, x1, x2))[0]

# This is the shape parameter of the desired truncexpon distribution.
shape = (x2 - x1)/scale

print("Expected mean of the distribution is %6.3f" %
      (truncexpon.mean(shape, loc=x1, scale=scale),))
print("Expected standard deviation of the distribution is %6.3f" %
      (truncexpon.std(shape, loc=x1, scale=scale),))

# Generate a sample of size 8, and compute its mean.
sample = truncexpon.rvs(shape, loc=x1, scale=scale, size=8)
print("Mean of the sample of size %d is %6.3f" %
      (len(sample), sample.mean(),))

bigsample = truncexpon.rvs(shape, loc=x1, scale=scale, size=100000)
print("Mean of the sample of size %d is %6.3f" %
      (len(bigsample), bigsample.mean(),))

Typical output:

Expected mean of the distribution is  4.000
Expected standard deviation of the distribution is  2.178
Mean of the sample of size 8 is  4.694
Mean of the sample of size 100000 is  4.002
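The gap between 4.694 and 4.000 in the output above is expected for such a small sample. As a rough guide, the standard error of the mean of n independent draws is std/sqrt(n). The sketch below computes it for the two sample sizes, taking the standard deviation printed above (2.178) as input; the variable names are introduced here for illustration.

```python
import math

dist_std = 2.178  # standard deviation reported in the output above

for n in (8, 100000):
    # Standard error of the mean of n independent draws.
    standard_error = dist_std / math.sqrt(n)
    print("n = %6d  standard error of the sample mean = %.3f" % (n, standard_error))
```

With n = 8 the standard error is about 0.77, so individual samples of 8 points will often miss a mean of 4 by several tenths even though the distribution's mean is exactly 4.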

Sample vector of points from a truncated exponential distribution with a specific mean
