为什么我的PyMC3层次模型中出现尺寸不匹配?

时间:2018-10-05 21:46:49

标签: pymc3 hierarchical-bayesian

这本质上是 Doing Bayesian Data Analysis,第二版(DBDA2)中的“来自多个铸币厂/棒球运动员的多种硬币”示例。我相信我有PyMC3代码,这些代码在功能上是等效的,但是一个有效,而另一个无效。这是PyMC 3.5版。更详细地,

假设我有以下数据。每行都是一个观察值:

observations_dict = {
    'mint': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    'coin': [0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7],
    'outcome': [1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1]
}
observations = pd.DataFrame(observations_dict)
observations

一枚薄荷,几枚硬币

以下实现了DBDA2图9.7的程序运行正常:

Fig 9.7

num_coins = observations['coin'].nunique()
coin_idx = observations['coin']

with pm.Model() as hierarchical_model:
    # mint is characterized by omega and kappa
    omega = pm.Beta('omega', 1., 1.)
    kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
    kappa = pm.Deterministic('kappa', kappa_minus2 + 2)

    # each coin is described by a theta
    theta = pm.Beta('theta', alpha=omega*(kappa-2)+1, beta=(1-omega)*(kappa-2)+1, shape=num_coins)

    # define the likelihood
    y = pm.Bernoulli('y', theta[coin_idx], observed=observations['outcome'])  

很多薄荷,很多硬币

但是,一旦将其转换为分层模型(如DBDA2图9.13所示):

Fig 9.13

num_mints = observations['mint'].nunique()
mint_idx = observations['mint']
num_coins = observations['coin'].nunique()
coin_idx = observations['coin']

with pm.Model() as hierarchical_model2:
    # Hyper parameters
    omega = pm.Beta('omega', 1, 1)
    kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
    kappa = pm.Deterministic('kappa', kappa_minus2 + 2)

    # Parameters for mints
    omega_c = pm.Beta('omega_c',
                       omega*(kappa-2)+1, (1-omega)*(kappa-2)+1,
                       shape = num_mints)    
    kappa_c_minus2 = pm.Gamma('kappa_c_minus2',
                              0.01, 0.01,
                              shape = num_mints)
    kappa_c = pm.Deterministic('kappa_c', kappa_c_minus2 + 2)

    # Parameters for coins
    theta = pm.Beta('theta',
                     omega_c[mint_idx]*(kappa_c[mint_idx]-2)+1,
                    (1-omega_c[mint_idx])*(kappa_c[mint_idx]-2)+1,
                     shape = num_coins)

    y2 = pm.Bernoulli('y2', p=theta[coin_idx], observed=observations['outcome'])

错误是:

ValueError: operands could not be broadcast together with shapes (8,) (20,) 

该模型有8个硬币的8个θ,但看到了20行数据。

但是,如果对数据进行分组,使得每一行代表单个硬币的最终统计信息,如下所示:

grouped = observations.groupby(['mint', 'coin']).agg({'outcome': [np.sum, np.size]}).reset_index()
grouped.columns = ['mint', 'coin', 'heads', 'total']

然后将最终似然变量更改为二项式,如下所示

num_mints = grouped['mint'].nunique()
mint_idx = grouped['mint']
num_coins = grouped['coin'].nunique()
coin_idx = grouped['coin']

with pm.Model() as hierarchical_model2:
    # Hyper parameters
    omega = pm.Beta('omega', 1, 1)
    kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
    kappa = pm.Deterministic('kappa', kappa_minus2 + 2)

    # Parameters for mints
    omega_c = pm.Beta('omega_c',
                       omega*(kappa-2)+1, (1-omega)*(kappa-2)+1,
                       shape = num_mints)    
    kappa_c_minus2 = pm.Gamma('kappa_c_minus2',
                              0.01, 0.01,
                              shape = num_mints)
    kappa_c = pm.Deterministic('kappa_c', kappa_c_minus2 + 2)

    # Parameter for coins
    theta = pm.Beta('theta',
                     omega_c[mint_idx]*(kappa_c[mint_idx]-2)+1,
                    (1-omega_c[mint_idx])*(kappa_c[mint_idx]-2)+1,
                     shape = num_coins)

    y2 = pm.Binomial('y2', n=grouped['total'], p=theta, observed=grouped['heads'])

一切正常。现在,后一种形式更有效,并且通常被首选,但是我相信前一种形式也应该起作用。因此,我认为这主要是PyMC3问题(甚至更有可能是用户错误)。

引用DBDA版本1,

  

“ BUGS模型使用二项式似然分布来求和   正确,而不是对个人使用伯努利分布   审判。使用二项式只是为了方便起见   该程序。如果将数据指定为逐项试验结果   而不是完全正确,则模型可以包含一个   试用循环并使用伯努利似然函数”

让我困扰的是,在第一个示例(一个薄荷糖,多个硬币)中,PyMC3看起来可以处理单个观察值,而不是汇总观察值就可以了。因此,我认为第一种形式应该可以,但不能。

代码

http://nbviewer.jupyter.org/github/JWarmenhoven/DBDA-python/blob/master/Notebooks/Chapter%209.ipynb

参考文献

PyMC3 - Differences in ways observations are passed to model -> difference in results?

https://discourse.pymc.io/t/pymc3-differences-in-ways-observations-are-passed-to-model-difference-in-results/501

http://www.databozo.com/deep-in-the-weeds-complex-hierarchical-models-in-pymc3

https://stats.stackexchange.com/questions/157521/is-this-correct-hierarchical-bernoulli-model

1 个答案:

答案 0 :(得分:1)

mint_idx的长度为20(每个观察值一个),但是应该为8(每个硬币一个)。

有效的答案,请注意mint_idx的重新计算(其余保持不变):

grouped = observations.groupby(['mint', 'coin']).agg({'outcome': [np.sum, np.size]}).reset_index()
grouped.columns = ['mint', 'coin', 'heads', 'total']

num_mints = grouped['mint'].nunique()
mint_idx = grouped['mint']
num_coins = observations['coin'].nunique()
coin_idx = observations['coin']

with pm.Model() as hierarchical_model2:
    # Hyper parameters
    omega = pm.Beta('omega', 1, 1)
    kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
    kappa = pm.Deterministic('kappa', kappa_minus2 + 2)

    # Parameters for mints
    omega_c = pm.Beta('omega_c',
                       omega*(kappa-2)+1, (1-omega)*(kappa-2)+1,
                       shape = num_mints)    
    kappa_c_minus2 = pm.Gamma('kappa_c_minus2',
                              0.01, 0.01,
                              shape = num_mints)
    kappa_c = pm.Deterministic('kappa_c', kappa_c_minus2 + 2)

    # Parameters for coins
    theta = pm.Beta('theta',
                     omega_c[mint_idx]*(kappa_c[mint_idx]-2)+1,
                    (1-omega_c[mint_idx])*(kappa_c[mint_idx]-2)+1,
                     shape = num_coins)

    y2 = pm.Bernoulli('y2', p=theta[coin_idx], observed=observations['outcome'])

非常感谢@junpenglao! https://discourse.pymc.io/t/why-cant-i-use-a-bernoulli-as-a-likelihood-variable-in-a-hierarchical-model-in-pymc3/2022/2