Incorrect inference with NUTS in pymc3

Time: 2016-02-02 19:43:00

Tags: pymc3

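For a very simple continuous toy model, I am getting what looks like an incorrect posterior from pymc3 using NUTS. The posterior disagrees with both the analytical calculation and the Metropolis posterior.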

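In the code below, I generate synthetic data using a fixed random seed (so the results are reproducible). I then define the same generative model in pymc3, observing only the final data. Finally, I compare the marginal distribution of one of the latent variables (tau) against the true analytical posterior and the Metropolis posterior. The results do not agree.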

#!/usr/bin/env python2
from __future__ import division

import numpy as np
import pymc3 as mc
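import scipy as sci
import scipy.optimize  # makes sci.optimize available
import scipy.stats  # makes sci.stats available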
import theano.tensor as th

np.random.seed(13)

n = 10
tau_scale = 2
tau0 = sci.stats.expon.rvs() * tau_scale
mu0 = np.random.randn(n) / np.sqrt(tau0)
x0 = mu0 + np.random.randn(n)

# Model 1: centered parameterization, mu ~ Normal(0, 1 / tau) directly.
with mc.Model() as model1:
    tau = mc.Exponential('tau', lam=1 / tau_scale)
    mu = mc.Normal('mu', tau=tau, shape=(n,))
    mc.Normal('x', mu=mu, observed=x0)

# Model 2: the same model, with mu written as a scaled standard normal.
with mc.Model() as model2:
    tau = mc.Exponential('tau', lam=1 / tau_scale)
    mu_z = mc.Normal('mu_z', shape=(n,))
    mu = mc.Deterministic('mu', mu_z / th.sqrt(tau))
    mc.Normal('x', mu=mu, observed=x0)


def infer(model):
    with model:
        # Start from the MAP estimate and run a short warm-up chain.
        map_ = mc.find_MAP(fmin=sci.optimize.fmin_l_bfgs_b)
        step = mc.NUTS(scaling=map_)
        trace = mc.sample(100, step=step, start=map_, progressbar=False)
        # Rescale NUTS at the last warm-up sample, then draw the main chain.
        step = mc.NUTS(scaling=trace[-1])
        return mc.sample(11000, step=step, start=trace[-1], progressbar=False)

trace1 = infer(model1)
trace2 = infer(model2)

with model2:
    trace3 = mc.sample(100000, step=mc.Metropolis(), progressbar=False,
                       start=mc.find_MAP(fmin=sci.optimize.fmin_l_bfgs_b))

samples_tau1 = trace1['tau'][1000:]
samples_tau2 = trace2['tau'][1000:]
samples_tau3 = trace3['tau'][10000:]

print
print 'pymc3 version: ' + mc.__version__
print
print 'Model 1 NUTS tau'
print 'Mean: {0:3.1f}'.format(samples_tau1.mean())
print 'Standard Deviation: {0:3.1f}'.format(samples_tau1.std())
print 'Median {0:3.1f}'.format(np.percentile(samples_tau1, 50))
print
print 'Model 2 NUTS tau'
print 'Mean: {0:3.1f}'.format(samples_tau2.mean())
print 'Standard Deviation: {0:3.1f}'.format(samples_tau2.std())
print 'Median {0:3.1f}'.format(np.percentile(samples_tau2, 50))
print
print 'Model 2 Metropolis tau'
print 'Mean: {0:3.1f}'.format(samples_tau3.mean())
print 'Standard Deviation: {0:3.1f}'.format(samples_tau3.std())
print 'Median {0:3.1f}'.format(np.percentile(samples_tau3, 50))

I actually defined the same generative model in two slightly different ways. The output of the above program is as follows:

deepee@entropy:~$ ./test_inference.py 
Applied log-transform to tau and added transformed tau_log to model.
Applied log-transform to tau and added transformed tau_log to model.

pymc3 version 3.0

Model 1 tau
Mean: 2.5
Standard Deviation: 1.6
Median 2.1

Model 2 tau
Mean: 4.0
Standard Deviation: 2.5
Median 3.4

Model 2 Metropolis tau
Mean: 3.5
Standard Deviation: 2.3
Median 2.9

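The true posterior mean of tau is 3.5, with a standard deviation of 2.3 and a median of 3.0, which agrees with Metropolis. With Stan, the values also match more closely. I am using a relatively recent commit of pymc3 (ca40cd3b2).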
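
For reference, the analytical posterior can be evaluated numerically without any sampler. Integrating out mu gives x_i | tau ~ Normal(0, 1 + 1/tau), so the posterior of tau is proportional to exp(-tau / tau_scale) times that likelihood. The following is a minimal sketch of such a check on a uniform grid, not necessarily the exact calculation behind the numbers above; the function name `tau_posterior_summary` and the parameters `n_grid` and `tau_max` are illustrative assumptions.

```python
from __future__ import division

import numpy as np


def tau_posterior_summary(x, tau_scale=2.0, n_grid=200001, tau_max=200.0):
    """Grid approximation of the posterior of tau given observations x.

    Assumes tau ~ Exponential(mean=tau_scale), mu_i ~ Normal(0, 1 / tau)
    and x_i ~ Normal(mu_i, 1), so that x_i | tau ~ Normal(0, 1 + 1 / tau).
    n_grid and tau_max are hypothetical tuning knobs for the grid.
    """
    x = np.asarray(x, dtype=float)
    # Uniform grid over tau; start slightly above zero to avoid 1 / 0.
    tau = np.linspace(1e-8, tau_max, n_grid)
    var = 1.0 + 1.0 / tau
    # Log prior and marginal log likelihood, up to additive constants.
    log_prior = -tau / tau_scale
    log_lik = -0.5 * x.size * np.log(var) - 0.5 * np.sum(x ** 2) / var
    log_post = log_prior + log_lik
    # Normalize on the grid; subtracting the max avoids underflow.
    weights = np.exp(log_post - log_post.max())
    weights /= weights.sum()
    mean = np.sum(weights * tau)
    std = np.sqrt(np.sum(weights * (tau - mean) ** 2))
    # Median: first grid point where the cumulative weight reaches 0.5.
    median = tau[np.searchsorted(np.cumsum(weights), 0.5)]
    return mean, std, median
```

Calling `tau_posterior_summary(x0, tau_scale)` with the arrays from the script returns a (mean, standard deviation, median) tuple that can be set against the sampler summaries. The grid must extend far enough that the posterior mass beyond `tau_max` is negligible.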

0 answers:

No answers