Question

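I've noticed that if another pseudo-random number generator is called while a pseudo-random sequence is being generated, the seed sequence gets disturbed. My question is: does this matter, and is there some way to make sure the original seed sequence continues? Let me give an example.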

A simple for loop that prints pseudo-random numbers drawn from a normal distribution:

set.seed(145)
for (i in 1:10){
  print(rnorm(1,0,1))
}

which gives the following output:

[1] 0.6869129
[1] 1.066363
[1] 0.5367006
[1] 1.906029
[1] 1.06316
[1] 1.370344
[1] 0.5277918
[1] 0.4030967
[1] 1.167752
[1] 0.7926794

Next, we introduce a pseudo-random draw from a uniform distribution when the iterator equals 5.

set.seed(145)
for (i in 1:10){
  print(rnorm(1,0,1))
  if (i == 5){
    print(runif(1,0,1))
  }
}

which gives the following output (the asterisk marks the pseudo-random draw from the uniform distribution):

[1] 0.6869129
[1] 1.066363
[1] 0.5367006
[1] 1.906029
[1] 1.06316
[1] 0.9147102*
[1] -1.508828
[1] -0.03101992 
[1] -1.091504
[1] 0.2442405
[1] -0.6103299

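What I'm trying to find out is whether it is possible to continue the original seed sequence introduced by set.seed(145), and thereby get the following output: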

[1] 0.6869129
[1] 1.066363
[1] 0.5367006
[1] 1.906029
[1] 1.06316
[1] 0.9147102*
[1] 1.370344
[1] 0.5277918
[1] 0.4030967
[1] 1.167752
[1] 0.7926794

Any input is highly appreciated, especially references to literature on this particular issue.

Edit

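Based on Rui Barradas's input, I tried to implement this in my own function, but without luck. Apart from the rnorm sampling in each iteration of the for loop, the only other randomness should be inside the if statement, and that should be handled by Rui's fix. Unfortunately, something still seems to disturb the seed sequence, because the two functions below do not return the same values, even though they are identical except for how the randomness (the ε in an AR(1)-type equation) is drawn.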

tt <- rnorm(500,0,1)*10 

test1 <- function(y, x0=1, n,qsigma = 3, alpha = 5, beta = 20, limit = 0.30){
  t <- length(y)
  gama <- (alpha + beta)/2
  x <- matrix(0,n,t)
  x[, 1] <- rep(x0,n)
  for(s in 2:t) {
    x[, s] <-pmax(alpha*(x[,s-1]<=gama) +beta*(x[,s-1]>gama)+rnorm(n,0,qsigma),1)
    if (s==250) {
      current <- .GlobalEnv$.Random.seed
      resamp <- sample(n, n, replace = TRUE)
      x[,s] <- x[resamp,s]
      .GlobalEnv$.Random.seed <- current
      }
  }
  list(x = x)
}

test3 <- function(y, x0=1, n,qsigma = 3, alpha = 5, beta = 20, limit = 0.30) {
  t <- length(y)
  gama <- (alpha + beta)/2
  x <- matrix(0,n,t)
  x[, 1] <- rep(x0,n)
  e_4 <- matrix(rnorm(n * (t), 0, qsigma),n, (t))

  for(s in 2:t) {
    x[, s] <-pmax(alpha*(x[,s-1]<=gama) +beta*(x[,s-1]>gama)+e_4[,(s-1)],1)
    if (s==250) {resamp <-sample(n, n, replace = TRUE)
      x[,s] <- x[resamp,s]
    }
  }
  list(x = x, pp = e_4)
}

set.seed(123)
dej11 <- test3(y = tt, n = 5000)$x
set.seed(123)
dej21 <- test1(y = tt, n = 5000)$x
all.equal(dej11,dej21)

I would expect the above to return TRUE at the end, rather than a message telling me that the mean relative difference is 1.186448.

Answer 1

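The system variable .Random.seed stores the state of the RNG. From help(".Random.seed"):

.Random.seed is an integer vector, containing the random number generator (RNG) state for random number generation in R. It can be saved and restored, but should not be altered by the user.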
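
To see what "saved and restored" means in practice, here is a small sketch (not taken from the help page; the variable names are only for illustration). It reads the state from the global environment, shows that a single draw changes it, and that putting the saved copy back replays the same draw.

```r
set.seed(145)

# Once the RNG has been used or seeded, .Random.seed exists in the global environment
stopifnot(exists(".Random.seed", envir = globalenv(), inherits = FALSE))

state_before <- get(".Random.seed", envir = globalenv())
x1 <- rnorm(1)
state_after <- get(".Random.seed", envir = globalenv())

# Drawing a number advances the state, so the two copies differ
print(identical(state_before, state_after))

# Putting the saved copy back makes the next draw repeat the previous one
assign(".Random.seed", state_before, envir = globalenv())
x2 <- rnorm(1)
print(identical(x1, x2))
```

The state is only copied and written back unchanged here, which stays within what the help page allows.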

The following works.

set.seed(145)
for (i in 1:10){
  print(rnorm(1,0,1))
  if (i == 5){
    current <- .Random.seed
    print(runif(1,0,1))
    .Random.seed <- current
  }
}

Note that you should read that help page carefully, especially the Note section.

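As for making this trick work inside a function, the problem seems to be that a function creates its own environment, whereas .Random.seed lives in .GlobalEnv. Assigning to .Random.seed inside the function therefore only creates a local variable and leaves the real RNG state untouched. The following change is needed: use .GlobalEnv$.Random.seed instead.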

set.seed(145)

f <- function() {
    for (i in 1:10) {
        print(rnorm(1, 0, 1))
        if (i == 5) {
            current <- .GlobalEnv$.Random.seed
            print(runif(1, 0, 1))
            .GlobalEnv$.Random.seed <- current
        }
    }
}

f()
#[1] 0.6869129
#[1] 1.066363
#[1] 0.5367006
#[1] 1.906029
#[1] 1.06316
#[1] 0.9147102
#[1] 1.370344
#[1] 0.5277918
#[1] 0.4030967
#[1] 1.167752
#[1] 0.7926794
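
If the same save-and-restore step is needed in several places, it can be wrapped in a small helper. The sketch below is only an illustration: `with_rng_preserved` is a made-up name, not a base R function, and it relies on `on.exit()` so that the state is put back even if the wrapped expression fails.

```r
# Hypothetical helper: evaluate `expr`, then put the global RNG state back as it was.
with_rng_preserved <- function(expr) {
  if (!exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
    runif(1)  # touch the RNG once so that there is a state to save
  }
  saved <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
  on.exit(assign(".Random.seed", saved, envir = globalenv()), add = TRUE)
  force(expr)  # the argument is lazy, so it is evaluated here, after the state was saved
}

# Reference: ten undisturbed single draws
set.seed(145)
reference <- vapply(1:10, function(i) rnorm(1), numeric(1))

# Same loop with an extra uniform draw at i == 5, wrapped in the helper
f <- function() {
  out <- numeric(10)
  for (i in 1:10) {
    out[i] <- rnorm(1)
    if (i == 5) {
      u <- with_rng_preserved(runif(1))  # should not disturb the normal sequence
    }
  }
  out
}

set.seed(145)
restored <- f()
print(identical(reference, restored))
```

Regarding the two functions in the question's edit: test1 calls sample() after 249 batches of rnorm draws, whereas test3 calls it only after all n * t normals have been drawn. The resampled rows at s == 250 would therefore differ between the two even though the noise terms agree, which looks like a likely source of the mismatch rather than the save and restore itself.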

Answer 2

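There may be a better way, but you could precompute the random values and then refer to that list whenever you need new values. The following puts this into function form. You need to specify a buffer that is larger than what you will eventually need. One downside of this approach is that you have to specify the random function and its parameters in advance. In theory you could get around that by using inverse transform sampling and generating values from the uniform distribution only, but I'll leave that as an exercise for the reader...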

random_seed_fixed <- function(rfun, seed, buffer = 1000000, ...){
  set.seed(seed)
  values <- rfun(buffer, ...)
  next_index <- 1

  out <- function(n){
    new_index <- next_index + n
    # Stop with an error if the request would run past the precomputed values
    stopifnot(new_index - 1 <= buffer)

    # seq_len() also handles n = 0, where seq(..., by = 1) would fail
    id <- seq_len(n) + next_index - 1
    next_index <<- new_index
    ans <- values[id]
    return(ans)
  }

  return(out)
}

And an example of how to use it...

> my_rnorm <- random_seed_fixed(rnorm, seed = 642, mean = 17, sd = 2.3)
> 
> my_rnorm(5)
[1] 18.53370 16.16721 15.43144 16.67967 18.27675
> my_rnorm(5)
[1] 19.26933 17.50994 18.90019 14.80153 18.18837
> 
> my_rnorm <- random_seed_fixed(rnorm, seed = 642, mean = 17, sd = 2.3)
> my_rnorm(5) # matches the previous first call of my_rnorm(5)
[1] 18.53370 16.16721 15.43144 16.67967 18.27675
> rnorm(1, 0, 1)
[1] 2.515765
> my_rnorm(5) # Still matches the previous second call of my_rnorm(5)
[1] 19.26933 17.50994 18.90019 14.80153 18.18837
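
The inverse transform idea mentioned above could look roughly like the sketch below. It is an assumption about how one might do it, not part of the original answer: `uniform_stream` and its arguments are made-up names, only uniforms are stored, and a quantile function such as qnorm or qexp is applied when values are requested.

```r
# Hypothetical variant of the buffered generator: precompute uniforms only and
# transform them on demand with a quantile function (inverse transform sampling).
uniform_stream <- function(seed, buffer = 1e5) {
  set.seed(seed)  # like the original, this resets the global RNG state
  u <- runif(buffer)
  next_index <- 1

  function(n, qfun = qunif, ...) {
    # Stop with an error if the request would run past the precomputed values
    stopifnot(next_index + n - 1 <= buffer)
    id <- next_index + seq_len(n) - 1
    next_index <<- next_index + n
    qfun(u[id], ...)
  }
}

draw <- uniform_stream(seed = 642)
a <- draw(3, qnorm, mean = 17, sd = 2.3)
b <- draw(2, qexp, rate = 0.5)

# A second stream with the same seed gives the same values,
# even if the global RNG is used in between
draw2 <- uniform_stream(seed = 642)
a2 <- draw2(3, qnorm, mean = 17, sd = 2.3)
runif(1)
b2 <- draw2(2, qexp, rate = 0.5)
print(identical(a, a2) && identical(b, b2))
```

The distribution no longer has to be fixed in advance, since each call picks its own quantile function. The values will not match what rnorm returns for the same seed, because rnorm does not map one uniform to one normal in this way.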

Pseudo-random sequence disturbed by another pseudo-random generator