Question

I have a sorted vector, say

v <- c(1, 1, 2, 3, 5, 8, 13, 21, 34)

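Now I want to find the index i of the first element that is larger than, for example, a <- 15.

I could do something like i <- which(v > a)[1].

But I would like to exploit the fact that v is sorted, and I don't think which takes that into account.

I could write it myself, recursively splitting the interval in half and searching the sub-intervals...

Is there a built-in solution? As usual, the main concern is speed, and my own function would certainly be slower.

Thanks.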
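A minimal sketch of the hand-written halving search described above, written as a loop rather than recursion. The name first_greater is hypothetical, and the sketch assumes v is sorted in non-decreasing order and contains no NA values:

```r
# Hypothetical helper: index of the first element of sorted v that is > a.
# Returns NA_integer_ when no element qualifies, like which(v > a)[1].
first_greater <- function(v, a) {
  lo <- 1L
  hi <- length(v) + 1L          # hi is one past the last candidate index
  while (lo < hi) {
    mid <- (lo + hi) %/% 2L     # mid is always a valid index because mid < hi
    if (v[mid] > a) {
      hi <- mid                 # v[mid] qualifies; look for an earlier match
    } else {
      lo <- mid + 1L            # everything up to mid is <= a
    }
  }
  if (lo > length(v)) NA_integer_ else lo
}

v <- c(1, 1, 2, 3, 5, 8, 13, 21, 34)
first_greater(v, 15)
# [1] 8
```

The loop needs about log2(length(v)) comparisons, but each one runs in interpreted R, so this is only meant to illustrate the idea.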

Answer 1

For speed gluttons:

a <- 10
v <- sort(runif(1e7,0,1000));
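Rcpp::cppFunction('int min_index(NumericVector v, double a) {
                  // upper_bound finds the first element strictly greater than a
                  NumericVector::iterator it = std::upper_bound(v.begin(), v.end(), a);
                  // convert the 0-based offset to a 1-based R index
                  // (yields length(v) + 1 if no element is greater than a)
                  return (it - v.begin()) + 1;
                  }')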
microbenchmark::microbenchmark(which(v > a)[1], min_index(v, a), unit="relative")

#Unit: relative
#            expr      min       lq     mean   median      uq      max neval
#which(v > a)[1] 61299.15 67211.58 14346.42 8797.526 8683.39 11163.27   100
#min_index(v, a)     1.00     1.00     1.00    1.000    1.00     1.00   100

Answer 2

There is uniroot. It performs a bisection-style root search and is faster on longer vectors.

v <- c(1,1,2,3,5,8,13,21,34)
a <- 15

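# v[x] truncates a fractional x, so f is a step function of x
root <- uniroot(f = function(x) v[x] - a, interval = c(1, length(v)))
# the root lies near the jump, so floor() can land one index before the first v[i] > a
my_index <- floor(root$root)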
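For comparison, base R's findInterval also relies on the vector being sorted and searches it without scanning every element. The following is a sketch rather than part of the answer above, and it assumes v is sorted in non-decreasing order:

```r
v <- c(1, 1, 2, 3, 5, 8, 13, 21, 34)
a <- 15

# findInterval(a, v) returns the number of elements of v that are <= a,
# so adding 1 gives the index of the first element strictly greater than a.
i <- findInterval(a, v) + 1L
i
# [1] 8

# If a >= max(v), the result is length(v) + 1, meaning no element qualifies.
no_match <- findInterval(100, v) + 1L
```

Because the result is always an exact integer index, there is no rounding step to get wrong, and ties in v are handled (for a <- 1 the result is 3).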

Answer 3

Just wondering whether the following might be useful.

Filter(function(x) x > 15, v)[1]
#[1] 21
Find(function(x) x > 15, v, right = FALSE, nomatch = NULL)
#[1] 21
Position(function(x) x > 15, v, right = FALSE, nomatch = NA_integer_)
#[1] 8

Answer 4

which isn't really that slow, so how about min(which()):

v <- c(1,1,2,3,5,8,13,21,34) 
system.time(
  print(min(which(v > 5)))
)
# [1] 6
#    user  system elapsed
#       0       0       0

R: get the index of an element in a sorted vector
