Question

Imagine you have a 150 by 5 matrix. Each element contains a random integer from 0 to 20.

Now treat each column of the matrix as independent. I need to loop through every possible combination across all 5 columns (one element from each column), which yields 150^5 = 75,937,500,000 combinations.

It is critical that I run every single combination exactly once. The order in which I run the combinations does not matter.

I tried doing this with nested while loops; see the code below.

Based on my calculation, this loop would take 54 hours to run on my laptop.
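As a rough check on that estimate (a sketch that only restates the figures above, with variable names chosen for illustration), 54 hours spread over 150^5 iterations works out to about 2.56 microseconds per combination:

```r
n_rows <- 150                      # rows per column, from the question
n_cols <- 5                        # number of independent columns
total_combos <- n_rows^n_cols      # 75,937,500,000 combinations

est_hours <- 54                    # the runtime estimate quoted above
secs_per_combo <- est_hours * 3600 / total_combos

# Print the count with thousands separators and the per-combination cost.
print(format(total_combos, big.mark = ",", scientific = FALSE))
print(signif(secs_per_combo * 1e6, 3))   # microseconds per combination
```

So the loop body would have to get well under a few microseconds per combination for the total to drop substantially.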

Questions

Is there any way to make my code run faster on my laptop? (Bootstrapping?)
If not, are there any web R servers I can access that would run my code significantly faster?
If not, would it make a significant difference to run this in another, faster language (e.g. Python)?

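# giant_matrix is the 150 x 5 matrix described above.
# All counters must start at 1 before the loops begin.
counter1 <- 1
counter2 <- 1
counter3 <- 1
counter4 <- 1
counter5 <- 1

while (counter1 <= 150) {
  while (counter2 <= 150) {
    while (counter3 <= 150) {
      while (counter4 <= 150) {
        while (counter5 <= 150) {
          # Other operations that take additional time
          result <- c(
            giant_matrix[counter1, 1],
            giant_matrix[counter2, 2],
            giant_matrix[counter3, 3],
            giant_matrix[counter4, 4],
            giant_matrix[counter5, 5]
          )

          counter5 <- counter5 + 1
        }
        # Reset the finished inner counter, then advance the next one out.
        counter5 <- 1
        counter4 <- counter4 + 1
      }
      counter4 <- 1
      counter3 <- counter3 + 1
    }
    counter3 <- 1
    counter2 <- counter2 + 1
  }
  counter2 <- 1
  counter1 <- counter1 + 1
}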
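
Since every combination has to run exactly once, it may be worth checking the traversal on a tiny case first. The following is a minimal sketch, not the real workload: `n` and `small_matrix` are hypothetical stand-ins for 150 and the full matrix, and `for` loops replace the manual counters.

```r
set.seed(42)
n <- 3                                   # hypothetical stand-in for 150
small_matrix <- matrix(sample(0:20, n * 5, replace = TRUE), nrow = n)

visited <- character(n^5)                # one key per index combination
k <- 0
for (i1 in seq_len(n)) {
  for (i2 in seq_len(n)) {
    for (i3 in seq_len(n)) {
      for (i4 in seq_len(n)) {
        for (i5 in seq_len(n)) {
          # Same selection as in the while-loop version.
          result <- c(small_matrix[i1, 1], small_matrix[i2, 2],
                      small_matrix[i3, 3], small_matrix[i4, 4],
                      small_matrix[i5, 5])
          k <- k + 1
          visited[k] <- paste(i1, i2, i3, i4, i5, sep = "-")
        }
      }
    }
  }
}

# n^5 iterations and no repeated index combination.
print(k == n^5)
print(anyDuplicated(visited) == 0)
```

`seq_len(n)` removes the need to reset counters by hand, which is where off-by-one mistakes tend to creep in.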

Answer 1

Here is the expand.grid() approach applied to a 20 x 5 matrix (100 elements). It produces a data frame with 20^5 = 3,200,000 rows and 5 columns:

m <- matrix(sample(1:20, 100, replace = TRUE), nrow = 20)
df <- expand.grid(m[, 1], m[, 2], m[, 3], m[, 4], m[, 5])

Example output of the above df (head):

head(df)
  Var1 Var2 Var3 Var4 Var5
1   10   19   13    4    7
2   19   19   13    4    7
3    3   19   13    4    7
4    5   19   13    4    7
5   11   19   13    4    7
6    8   19   13    4    7

nrow(df)
[1] 3200000

dim(df)
[1] 3200000       5
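
For the full 150 x 5 case, the same call would produce 150^5 rows. Assuming 4 bytes per integer cell, that is roughly 1.5 TB across 5 columns, so building the whole grid at once is unlikely to fit on a laptop. A minimal sketch of one possible workaround, assuming the per-combination work can be applied to a block of rows at a time: loop over the first two columns and expand only the remaining three. `process_block` is a hypothetical placeholder for the real work, and `n` is scaled down here so the sketch runs quickly.

```r
set.seed(1)
n <- 8                                    # scaled-down stand-in for 150
m <- matrix(sample(0:20, n * 5, replace = TRUE), nrow = n)

# Hypothetical placeholder for the real per-combination work:
# here it simply adds up every value in the block.
process_block <- function(block) sum(rowSums(block))

# Expand columns 3 to 5 once; the same block is reused for every outer pair.
inner <- as.matrix(expand.grid(m[, 3], m[, 4], m[, 5]))

rows_seen <- 0
running_total <- 0
for (i1 in seq_len(n)) {
  for (i2 in seq_len(n)) {
    # Prepend the two fixed values; cbind() recycles them down the block.
    block <- cbind(m[i1, 1], m[i2, 2], inner)
    running_total <- running_total + process_block(block)
    rows_seen <- rows_seen + nrow(block)
  }
}

print(rows_seen == n^5)                   # every combination covered once
```

With n = 150, each block would hold 150^3 = 3,375,000 rows, which is far easier to keep in memory than the full grid, and the work inside `process_block` can stay vectorized.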
