Question

I have a random variable X with pdf f(x) = 4xe^-x, where x > 0.

How can I draw a random sample of size 1000 from this distribution?

Answer 1

Up to the constant factor 4, this is a Gamma distribution with shape a = 2 and scale 1 (whose density is x e^-x).

q <- rgamma(1000, shape = 2, scale = 1)
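As a quick sanity check (a sketch, not part of the original answer), the sample moments can be compared with the theoretical mean and variance of a Gamma distribution with shape 2 and scale 1, which are both equal to 2. The seed value is an arbitrary choice made only for reproducibility.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
q <- rgamma(1000, shape = 2, scale = 1)

# Theoretical mean = shape * scale = 2; variance = shape * scale^2 = 2
c(mean = mean(q), var = var(q))

# Kolmogorov-Smirnov test against the Gamma(shape = 2, scale = 1) CDF
ks.test(q, "pgamma", shape = 2, scale = 1)
```

A large p-value from `ks.test` is consistent with the sample following the stated Gamma distribution.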

Answer 2

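A simple solution is rejection sampling (though see my remarks on Severin Pappadeux's answer below). A generic rejection sampling algorithm is easy to implement in R (I would be surprised if it were not already implemented in several packages; I just have not looked up which ones provide it):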

#' Rejection sampling
#' 
#' @param f A function calculating the density of interest
#' @param g A function giving the proposal density
#' @param rg A function providing random samples from g
#' @param M A numeric vector of length one giving a bound on the ratio
#'   f(x) / g(x). M must be > 1 and greater than or equal to f(x) / g(x) over
#'   the whole support of X.
#' @param n An integer vector of length one giving the number of samples to
#'   draw; the default is ten thousand.
#' @param ... Further arguments to be passed to g and rg
rejection_sampling <- function(f, g, rg, M, n = 10000, ...) {
    result <- numeric(n)
    for ( i in seq_len(n) ) {  # seq_len() also handles n = 0 correctly
        reject <- TRUE
        while ( reject ) {
            y <- rg(n = 1, ...)
            u <- runif(1)
            if ( u < ( f(y) / (M * g(y, ...)) ) ) {
                result[i] <- y
                reject <- FALSE
            }
        }
    }
    return(result)
}

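We can then use it to draw a sample from your distribution and plot the sample's density alongside the true probability density to see how well it works: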

x <- seq(0.01, 15, 0.01)
f <- function(x) 4 * x * exp(-x)
y <- f(x)
set.seed(123)
z <- rejection_sampling(f = f, g = dexp, rg = rexp, M = 10, n = 1e3, rate = 1/4)
dens <- density(z, from = 0.01, to = 15)
scaling_constant <- max(y) / max(dens$y)
plot(x, y, type = "l", xlab = "x", ylab = "f(x)", lty = 2, col = "blue")
lines(dens$x, dens$y * scaling_constant, col = "red", lty = 3)
legend("topright", bty = "n", lty = 2:3, col = c("blue", "red"),
       legend = c("True f(x)", "(Re-scaled) density of sample"))

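Rejection sampling works by drawing candidates from a proposal distribution and rejecting a candidate if a random uniform deviate is greater than the ratio f(x) / (M g(x)), where g(x) is your proposal density and M is a bound on f(x) / g(x), as described in the Roxygen documentation above.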
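The accept/reject step can also be written in vectorized form, which makes the acceptance rate easy to inspect. The following is a minimal sketch using the same f, exponential proposal, and M as in the example above; the variable names and the seed are illustrative. Because this f integrates to 4 rather than 1, the expected acceptance rate is 4 / M = 0.4.

```r
set.seed(42)  # arbitrary seed
f <- function(x) 4 * x * exp(-x)   # target (unnormalized) density
M <- 10                            # bound on f(x) / g(x)
rate <- 1/4                        # rate of the exponential proposal

n_prop <- 1e5
y <- rexp(n_prop, rate = rate)     # candidates drawn from the proposal g
u <- runif(n_prop)                 # one uniform deviate per candidate
accept <- u < f(y) / (M * dexp(y, rate = rate))

mean(accept)          # empirical acceptance rate, expected to be near 0.4
samples <- y[accept]  # accepted candidates follow the target distribution
```

Note that this version returns a random number of samples, whereas the loop-based function keeps drawing until exactly n candidates have been accepted.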

Above, I used an exponential proposal distribution with rate 1/4. You could use others.
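Whatever proposal is chosen, M has to bound f(x) / g(x) over the whole support. One way to check this numerically is sketched below with `optimize`; the search interval (0, 50) is an assumption that the maximum lies well inside it, which holds here because the ratio has a single peak and decays quickly.

```r
f <- function(x) 4 * x * exp(-x)
g <- function(x) dexp(x, rate = 1/4)
ratio <- function(x) f(x) / g(x)   # equals 16 * x * exp(-3 * x / 4)

# Locate the maximum of the ratio numerically
opt <- optimize(ratio, interval = c(0, 50), maximum = TRUE)
opt$maximum    # location of the maximum, analytically 4/3
opt$objective  # smallest valid M, analytically 16 * (4/3) * exp(-1), about 7.85
```

Since the maximum is below 10, M = 10 is a valid bound for this proposal. A tighter M would raise the acceptance rate.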

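As noted in Severin Pappadeux's answer, this p.d.f. is proportional to that of a gamma distribution with shape 2 and scale 1. (That is, if you plug those parameters into the Gamma p.d.f., you will find that it differs from yours only by a scaling constant.) Depending on your purpose, that may or may not be the better approach. I am not sure whether your goal is to generate samples from arbitrary distributions in general or from this particular distribution. It is usually best to recognize when your arbitrary pdf is actually an instance of an already-implemented distribution, but if you are not in that situation, you will need something like rejection sampling.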
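The proportionality is easy to check numerically: dividing f by the Gamma density from `dgamma` should give the same constant at every point. A small sketch (the grid of x values is arbitrary):

```r
f <- function(x) 4 * x * exp(-x)
x <- seq(0.5, 10, by = 0.5)

# Gamma(shape = 2, scale = 1) has density x * exp(-x), so the ratio should be 4
r <- f(x) / dgamma(x, shape = 2, scale = 1)
range(r)
```

A constant ratio of 4 means f integrates to 4, so `rgamma(n, shape = 2, scale = 1)` samples from the normalized version of f.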

Generating random samples from a given probability distribution
