Can "if" statements be used in a PyMC deterministic function?

Asked: 2015-08-14 22:48:13

Tags: python statistics pymc pymc3

After reading Cam Davidson-Pilon's Probabilistic Programming & Bayesian Methods for Hackers, I've decided to try my hand at a Hidden Markov Model (HMM) learning problem with PyMC. So far, the code is not cooperating, but through troubleshooting, I feel that I have narrowed down the source of the issue.

Breaking down the code into smaller chunks and focusing on the initial probability and emission probabilities at t=0, I am able to learn the emission/observation parameters of a single state at time t=0. However, once I add another state (for a total of two states), the results of the parameter learning are identical (and incorrect) regardless of the data input. So, I feel that I must have done something wrong in the @pm.deterministic portion of the code, which is not allowing me to sample from the Init initial probability function.

With this portion of code, I am aiming to learn the initial probability p_bern and the emission probabilities p_0 and p_1 corresponding to states 0 and 1, respectively. The emission is conditional on the state, which is what I am trying to express with my @pm.deterministic function. Can I have the "if" statement in this deterministic function? It seems to be the root of the problem.

# This code is to test the ability to discern between two states with emissions

import numpy as np
import pymc as pm
from matplotlib import pyplot as plt

N = 1000
state = np.zeros(N)
data = np.zeros(shape=N)

# Generate data
for i in range(N):
    state[i] = pm.rbernoulli(p=0.3)
for i in range(N):
    if state[i]==0:
        data[i] = pm.rbernoulli(p=0.4)
    elif state[i]==1:
        data[i] = pm.rbernoulli(p=0.8)

# Prior on probabilities
p_bern = pm.Uniform("p_S", 0., 1.)
p_0 = pm.Uniform("p_0", 0., 1.)
p_1 = pm.Uniform("p_1", 0., 1.)

Init = pm.Bernoulli("Init", p=p_bern) # Bernoulli node

@pm.deterministic
def p_T(Init=Init, p_0=p_0, p_1=p_1, p_bern=p_bern):
    if Init==0:
        return p_0
    elif Init==1:
        return p_1

obs = pm.Bernoulli("obs", p=p_T, value=data, observed=True)
model = pm.Model([obs, p_bern, p_0, p_1])
mcmc = pm.MCMC(model)
mcmc.sample(20000, 10000)
pm.Matplot.plot(mcmc)

I have already attempted the following to no avail:

  1. Using a @pm.potential decorator to create a joint distribution (a sketch of what this might look like follows this list)
  2. Changing the placement of my Init node (you can see my comment in the code where I am unsure of where to put it)
  3. Using a @pm.stochastic similar to this
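
A minimal sketch of the @pm.potential idea from item 1, for reference (not my exact attempt; it assumes the nodes defined above, and uses PyMC 2's bernoulli_like to contribute the emission log-likelihood as a potential term):

@pm.potential
def emission(Init=Init, p_0=p_0, p_1=p_1):
    # A potential contributes a log-probability term rather than a variable;
    # pick the emission probability implied by the current state draw
    p = p_1 if Init else p_0
    return pm.bernoulli_like(data, p)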

Edit: As per Chris's suggestion, I've moved the Bernoulli node outside of the deterministic. I've also updated the code to a simpler model (Bernoulli observation instead of multinomial) for easier troubleshooting.

Thank you for your time and attention. Any feedback is warmly received. Also, if I am missing any information please let me know!

2 Answers:

Answer 0 (score: 2)

I would move this randomness out of the deterministic. A deterministic node's value should be completely determined by the values of its parents; hiding a random variable inside the node violates that.

Why not create a Bernoulli node and pass it as an argument to the deterministic?
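
A minimal sketch of what I mean (hypothetical names; PyMC 2.x): the randomness lives in its own Bernoulli stochastic, and the deterministic is then a pure function of its parents' values, so even an if statement on that value is fine:

p_state = pm.Uniform("p_state", 0., 1.)
p_0 = pm.Uniform("p_0", 0., 1.)
p_1 = pm.Uniform("p_1", 0., 1.)

# The randomness lives in its own stochastic node...
z = pm.Bernoulli("z", p=p_state)

# ...so the deterministic is fully determined by its parents' values
@pm.deterministic
def p_emit(z=z, p_0=p_0, p_1=p_1):
    return p_1 if z else p_0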

Answer 1 (score: 2)

Based on the updated information you provided, here is some code that works:

import numpy as np
import pymc as pm
from matplotlib import pyplot as plt

N = 1000
state = np.zeros(N)
data = np.zeros(shape=N)

# Generate data
state = pm.rbernoulli(p=0.3, size=N)
data = [int(pm.rbernoulli(0.8*s or 0.4)) for s in state]

# Prior on probabilities
p_S = pm.Uniform("p_S", 0., 1.)
p_0 = pm.Uniform("p_0", 0., 1.)
p_1 = pm.Uniform("p_1", 0., 1.)

# Use values of Init as indices to probabilities
Init = pm.Bernoulli("Init", p=p_S, size=N) # Bernoulli node
p_T = pm.Lambda('p_T', lambda p_0=p_0, p_1=p_1, i=Init: np.array([p_0, p_1])[i.astype(int)])

obs = pm.Bernoulli("obs", p=p_T, value=data, observed=True)

model = pm.MCMC(locals())
model.sample(20000, 10000)
model.summary()

Notice that in the data generation step I use the state to index into the appropriate true probability. I do essentially the same thing in the specification of p_T. It seems to work reasonably well, but note that, depending on where things are initialized, the two values p_0 and p_1 can end up corresponding to either true value (nothing constrains one to be larger than the other), so the estimated state probability can end up being the complement of the true one.
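
If that label switching matters, a minimal sketch of one way to break the symmetry (not part of the model above, and assuming the p_0 and p_1 nodes just defined) is an ordering potential that forbids p_0 >= p_1:

@pm.potential
def order(p_0=p_0, p_1=p_1):
    # Return -inf (zero probability) for orderings where p_0 >= p_1,
    # so the sampler rejects those states
    if p_0 >= p_1:
        return -np.inf
    return 0.0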