Question

I am trying to simulate a t-copula in Python, but my code produces strange results (it does not behave as expected):

I followed the approach suggested by Demarta & McNeil (2004) in "The t Copula and Related Copulas", which states:

[image: t copula simulation]

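Intuitively, I know that the higher the degrees-of-freedom parameter, the more the t copula should resemble the Gaussian one (and hence the lower the tail dependency). However, given that I sample from scipy.stats.invgamma.rvs or, alternatively, from scipy.stats.chi2.rvs, the higher my parameter df, the higher the values of my parameter s. This does not make any sense to me, as I found multiple papers stating that for df -> inf, t-copula -> Gaussian copula.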

Here is my code. What am I doing wrong? (I am a beginner in Python, FYI.)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import invgamma, chi2, t

#Define number of sampling points
n_samples = 1000
df = 10

calib_correl_matrix = np.array([[1,0.8,],[0.8,1]]) #I just took a bivariate correlation matrix here
mu = np.zeros(len(calib_correl_matrix))
s = chi2.rvs(df)
#s = invgamma.pdf(df/2,df/2) 
Z = np.random.multivariate_normal(mu, calib_correl_matrix,n_samples)
X = np.sqrt(df/s)*Z #chi-square method
#X = np.sqrt(s)*Z #inverse gamma method

U = t.cdf(X,df)

My results are the exact opposite of what I expect: the higher the df, the higher the tail dependency, which is also visible in the plots:

 U_pd = pd.DataFrame(U)
 fig = plt.gcf()
 fig.set_size_inches(14.5, 10.5)
 pd.plotting.scatter_matrix(U_pd, figsize=(14,10), diagonal = 'kde')
 plt.show()

df=4: [scatter plot image]

df=100: [scatter plot image]

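It gets especially bad when using invgamma.rvs directly, even though the two approaches should yield the same result. For dfs >= 30 I often get a ValueError ("ValueError: array must not contain infs or NaNs").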

Any help would be much appreciated, thank you!

Answer 1

There is one obvious problem in your code. Namely, this line:

s = chi2.rvs(df)

must be changed to something like this:

s = chi2.rvs(df, size=n_samples)[:, np.newaxis]

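Otherwise the variable s is just a single constant, and your X ends up being a sample from a multivariate normal (scaled by np.sqrt(df/s)) rather than from the t distribution you need.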
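For illustration, here is a minimal sketch of the whole simulation with one chi-square draw per sample. The function name sample_t_copula and its seed argument are hypothetical, introduced only for this example, and rng.chisquare is used in place of chi2.rvs purely for convenience; the idea is the same.

```python
import numpy as np
from scipy.stats import t


def sample_t_copula(corr, df, n_samples, seed=None):
    """Hypothetical helper: draw n_samples points from a t copula."""
    rng = np.random.default_rng(seed)
    corr = np.asarray(corr, dtype=float)
    dim = corr.shape[0]

    # Correlated standard normals, shape (n_samples, dim).
    z = rng.multivariate_normal(np.zeros(dim), corr, size=n_samples)

    # One chi-square draw per row, shape (n_samples, 1), so that it
    # broadcasts across the columns of z instead of acting as a constant.
    w = rng.chisquare(df, size=(n_samples, 1))

    # Multivariate t sample: each row of z gets its own random scaling.
    x = np.sqrt(df / w) * z

    # Map each margin to (0, 1) with the univariate t CDF.
    return t.cdf(x, df)


corr = np.array([[1.0, 0.8], [0.8, 1.0]])
u = sample_t_copula(corr, df=10, n_samples=1000, seed=0)
print(u.shape)
```

Each row shares a single scaling factor across both components; that shared factor is what produces the joint extremes of a t copula.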

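You most likely got the "tail-heavy" plots purely through bad luck, when your single sampled value of s happened to be too small. That has nothing to do with df as such, although it does seem easier to hit such "unlucky" values when df is small.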
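To see how much a single draw can vary, the rough sketch below compares the spread of the scaling factor np.sqrt(df/s) for a few values of df. The helper name scale_spread, the sample size, and the choice of the 1st and 99th percentiles are assumptions made only for this illustration.

```python
import numpy as np


def scale_spread(df, n_draws, rng):
    """Hypothetical helper: 1st and 99th percentile of sqrt(df / s)."""
    s = rng.chisquare(df, size=n_draws)
    scale = np.sqrt(df / s)
    lo, hi = np.percentile(scale, [1, 99])
    return lo, hi


rng = np.random.default_rng(0)
spreads = {}
for df in (4, 30, 100):
    lo, hi = scale_spread(df, 100_000, rng)
    spreads[df] = hi - lo
    print(f"df={df:>3}: 1st pct={lo:.2f}, 99th pct={hi:.2f}")
```

Although s itself grows with df (its mean is df), the ratio df/s concentrates around 1 as df increases. The scaling therefore matters less and less for large df, which is consistent with the t copula approaching the Gaussian copula.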

Simulating a t copula in Python
