Question

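I have a question about the lognormal distribution. I want to create and merge objects whose "masses" range from 10 to 10**5 and are normally distributed. I figured this would be a lognormal distribution, so I started trying to do it in Python like this: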

import numpy as np
import matplotlib.pyplot as plt

mu, sigma = 3., 1. # mean and standard deviation
s = np.random.lognormal(mu, sigma, 1000)
count, bins, ignored = plt.hist(s, 1000, density=True, align='mid')
x = np.linspace(min(bins), max(bins), 1000)
pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi)))
plt.plot(x, pdf, linewidth=2, color='r')
plt.xscale('log')
plt.show()

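This follows the numpy example, but after changing mu and sigma and looking at the plots, I really can't tell whether setting m and v to 10**5 and 1000 (following the Wikipedia article linked below) gives me what I want.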

I looked at https://en.wikipedia.org/wiki/Log-normal_distribution to figure out how to calculate mu and sigma, but maybe I am doing something else wrong. Is this the right way to approach the problem?

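I have read previous questions and answers about the lognormal distribution, but I don't think they ask the same thing. Apologies if this kind of question has already been answered.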

mu, sigma = 3., 1. is what the example gives, and it works fine, but when I change mu and sigma to the following values:

m=10**3.5 #where I want the distribution to be centered
v=10000   #the "spread" that I want 
f=1.+(v/m**2)
mu=np.log(m/np.sqrt(f))
sigma=np.sqrt(np.log(f))

I don't get what I expect, which is a distribution centered around 10**3.5 with a standard deviation of 10000.

Trying the suggested values:

mu=np.log(3000)
sigma=np.log(10)
s = np.random.lognormal(mu, sigma, 1000)
count, bins, ignored = plt.hist(s, 500, density=True, align='mid')
x = np.linspace(min(bins), max(bins), 1000)
pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi)))
plt.semilogx(x, pdf, linewidth=2, color='r')

This doesn't seem to work either, unless I am misreading the histogram.

Answer 1

I think you are having trouble interpreting the distribution's parameters.

The documentation for np.random.lognormal is here: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.lognormal.html

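In particular, the distribution is not centered at mu or 10**mu but at exp(mu). Strictly speaking, exp(mu) is the median, and the mean is exp(mu + sigma**2 / 2). The distribution in the example is therefore centered at e**3 ≈ 20.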

You seem to want the distribution centered at 1000, so setting mu and sigma to

mu, sigma = np.log(1000), np.log(10)

will generate the distribution you expect.
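A quick numerical check may make the roles of the two parameters clearer. This is only a sketch: the seed and the sample size are arbitrary choices of mine. It shows that exp(mu) is the median of the samples, and that sigma = log(10) means one standard deviation in log space spans a factor of 10.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
mu, sigma = np.log(1000), np.log(10)
s = rng.lognormal(mu, sigma, 200_000)

# Work in log10 space, where the samples are normally distributed.
log10_s = np.log10(s)
print("median of s:      ", np.median(s))      # close to exp(mu) = 1000
print("mean of log10(s): ", log10_s.mean())    # close to 3
print("std of log10(s):  ", log10_s.std())     # close to 1, i.e. one decade
# The theoretical mean is far above the median because of the long right tail.
print("theoretical mean: ", np.exp(mu + sigma**2 / 2))
```

If the goal is a bell shape on a logarithmic x-axis, it is usually easier to think about the median and the spread in decades than about the mean and variance of the raw values.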

Answer 2

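If you know you need 1000 lognormally distributed values (that is, log(x) is normally distributed) and you want the data to range from 10 to 10**5, then you have to do some calculation to get mu and sigma. The values you pass to np.random.lognormal are the mean and standard deviation of the underlying normal distribution, not those of the lognormal distribution. You can derive them from the mean and variance formulas on the Wikipedia page you looked at.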

import numpy as np
import matplotlib.pyplot as plt

# Parameters
xmax = 10**5
xmin = 10
n = 1000

# Get original mean and variance
# mu: We want normal distribution, so just take the average of the extremes.
# sigma: use the z = (x - mu)/sigma formula and approximation that 
#        the extremes are a deviation of z=3 away.
mu = (xmax + xmin)/2.0
sigma = (xmax - mu)/3.0
m = mu
v = sigma**2

# Get the mean and standard deviation of the underlying normal distribution
# (Wikipedia: mu = ln(m^2 / sqrt(v + m^2)), sigma^2 = ln(1 + v / m^2))
norm_mu = np.log(m**2 / np.sqrt(v + m**2))
norm_sigma = np.sqrt(np.log((v / m**2) + 1))

# Generate random data and an overlying smooth curve
# (This is the same as your code, except I replaced the parameters
# in the 'pdf =' formula.)
s = np.random.lognormal(norm_mu, norm_sigma, n)
count, bins, ignored = plt.hist(s, n, density=True, align='mid')
x = np.linspace(min(bins), max(bins), n)
pdf = (np.exp(-(np.log(x) - norm_mu)**2 / (2 * norm_sigma**2)) / (x * norm_sigma * np.sqrt(2 * np.pi)))
plt.plot(x, pdf, linewidth=2, color='r')
plt.xscale('log')
plt.show()

This is what I get. Note that the x-axis is scaled exponentially rather than linearly. Is this what you were looking for?

[Figure: lognormal distribution]
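As a sanity check on the conversion, here is a minimal sketch of two ways to pick the parameters. Both helper names are hypothetical (they are not part of NumPy), and the standard deviation of 10000 is taken from the question. The first helper applies the Wikipedia moment formulas and is verified by converting back. The second is an alternative that places the two range limits at ±3 sigma in log space rather than on the linear scale.

```python
import numpy as np

def lognormal_params_from_moments(m, v):
    """Hypothetical helper: mean m and variance v of X -> (mu, sigma) of log(X)."""
    sigma2 = np.log(1.0 + v / m**2)
    mu = np.log(m) - sigma2 / 2.0   # same as log(m**2 / sqrt(v + m**2))
    return mu, np.sqrt(sigma2)

def lognormal_params_from_range(xmin, xmax, z=3.0):
    """Hypothetical helper: put xmin and xmax at -z and +z sigma in log space."""
    lo, hi = np.log(xmin), np.log(xmax)
    return (lo + hi) / 2.0, (hi - lo) / (2.0 * z)

# Round trip: the forward formulas should give back the requested moments.
m, v = 10**3.5, 10000.0**2   # assumption: a standard deviation of 10000
mu, sigma = lognormal_params_from_moments(m, v)
mean_back = np.exp(mu + sigma**2 / 2)
std_back = np.sqrt((np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2))
print("moments:", mu, sigma, mean_back, std_back)

# Range-based choice: most samples should land between 10 and 10**5.
rng = np.random.default_rng(0)   # arbitrary seed
mu_r, sigma_r = lognormal_params_from_range(10, 10**5)
s = rng.lognormal(mu_r, sigma_r, 100_000)
inside = np.mean((s >= 10) & (s <= 10**5))
print("range:", mu_r, sigma_r, inside)
```

With the range-based choice the median is 10**3 and nearly all samples fall inside the interval. Moment matching on the linear scale instead gives a strongly skewed distribution whose median sits well below the requested mean.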
