Question

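Suppose I have a vector<int> positions holding the positions I want to subset, and two Rcpp::NumericVector vectors A and B that I want to subset (both can also be treated as vector<double>).

What is the best way to compute what I would write in R as sum(A[positions]) (a double) or A[positions] / B[positions] (a vector<double>)? Basically, I want to access the elements of a vector at certain positions, ideally without copying (or a for loop).

Example in R: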

positions = c(2,4,5) # just a vector with positions
A = rnorm(100) # a vector with 100 random numbers
B = rnorm(100)

mysum <- sum(A[positions])
mysmallvector <- A[positions] / B[positions] # or (A/B)[positions]

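Right now I just loop over all the values of positions and subset the vectors one position at a time, but I can't help thinking there is a more elegant solution.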

Answer 1

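Replicating R's functionality within Rcpp is not necessarily ideal. First, you should look at the caveats of subsetting in Rcpp with Rcpp sugar expressions. Second, because of the way R's vectorized structure works, for loops are used even within R itself, so an explicit loop in C++ is not a penalty.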
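To illustrate the point about loops, here is a minimal sketch in plain C++ that sums the elements at the given positions without materializing the subset first. It uses std::vector rather than Rcpp types (the question notes the inputs can be treated as vector<double>), the name sum_at is hypothetical, and it assumes the positions are already 0-based and valid.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Sum x at the given 0-based positions without building a temporary subset.
// Assumption: every position is a valid index into x (no bounds checking).
double sum_at(const std::vector<double>& x, const std::vector<int>& positions) {
    return std::accumulate(
        positions.begin(), positions.end(), 0.0,
        [&x](double acc, int p) { return acc + x[static_cast<std::size_t>(p)]; });
}
```

The loop inside std::accumulate reads each element exactly once, so no copy of the subset is made.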

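You may wish to consider using RcppArmadillo instead of the Rcpp data types. The downside is that you will incur a copy hit when the data is ported into C++ and then back into R. With the Rcpp data types you avoid this, but you must define your own operations (see divide_subset() below).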

With that said, we can replicate the requested functionality through Rcpp:

#include <Rcpp.h>
using namespace Rcpp;

// Uses sugar index subsets
// [[Rcpp::export]]
NumericVector subset(NumericVector x, IntegerVector idx) {
  return x[idx];
}

// Uses sugar summation function (i.e. a tidy for loop)
// [[Rcpp::export]]
double sum_subset(NumericVector x, IntegerVector idx) {
  return sum(subset(x,idx));
}

// Element-wise division over the subset, written as an explicit loop
// [[Rcpp::export]]
NumericVector divide_subset(NumericVector x, NumericVector y, IntegerVector idx) {
  // idx must hold 0-based positions that are valid for both x and y
  R_xlen_t n = idx.size();
  NumericVector a(n);
  for (R_xlen_t i = 0; i < n; i++) {
    a[i] = x[idx[i]] / y[idx[i]];
  }

  return a;
}


/*** R
set.seed(1334)
positions = c(2,4,5) 

# Subtract one from indexes for C++
pos_cpp = positions - 1

A = rnorm(100) # a vector with 100 random numbers
B = rnorm(100)

mysum = sum(A[positions])

cppsum = sum_subset(A, pos_cpp)
all.equal(cppsum, mysum)

mysmallvector = A[positions] / B[positions] # or (A/B)[positions]

cppdivide = divide_subset(A,B, pos_cpp)
all.equal(cppdivide, mysmallvector)
*/
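The R block above subtracts one from the positions before calling into C++. As an alternative sketch, the conversion and a bounds check can live inside the C++ function itself. This version is plain C++ on std::vector rather than Rcpp types, and the name divide_at_r_positions is hypothetical, not part of Rcpp.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Element-wise x[p] / y[p] for 1-based (R-style) positions.
// Throws std::out_of_range for positions outside x or y.
std::vector<double> divide_at_r_positions(const std::vector<double>& x,
                                          const std::vector<double>& y,
                                          const std::vector<int>& positions) {
    std::vector<double> out;
    out.reserve(positions.size());
    for (int p : positions) {
        if (p < 1) {
            throw std::out_of_range("positions must be >= 1");
        }
        // Convert the R-style position to a 0-based C++ index.
        const std::size_t i = static_cast<std::size_t>(p) - 1;
        // .at() throws std::out_of_range if i is past the end.
        out.push_back(x.at(i) / y.at(i));
    }
    return out;
}
```

Only the result vector is allocated. The inputs are read in place, so the cost is one pass over positions.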

What is the best way to apply a function to a subset of a vector?
