Select all possible subarrays of length n

Time: 2016-02-03 03:40:27

Tags: python arrays numpy

How can I get a 2D array containing all possible consecutive sub-arrays of a certain length?

For example, if my array were ['a', 'b', 'c', 'd', 'e'] and n were 3, the result should be:

[['a', 'b', 'c']
 ['b', 'c', 'd']
 ['c', 'd', 'e']]

I found a similar question about Python lists, but I'd like to do this with numpy, because I need to perform it on many different arrays, each of which is fairly large. Basically, speed is an issue here.

5 answers:

Answer 0 (score: 2)

A third and final loop-free answer:

import numpy

def substrings(n, x):
  x = numpy.asarray(x)  # fromfunction passes index arrays, so x must be an ndarray
  return numpy.fromfunction(lambda i, j: x[i + j], (len(x) - n + 1, n),
                            dtype=int)

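You'll have to profile all of these solutions yourself to find the one that performs best. If you like one of them best, please mark it as the accepted answer.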
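One way to do that profiling is with the standard-library timeit module. The sketch below is only an illustration: the names substrings_fromfunction and substrings_slices, the input size, and the repeat count are arbitrary choices made for this example, and the second function is just the plain slicing loop from another answer, included for comparison. Timings will vary with machine, array size, and n.

```python
import timeit

import numpy as np


def substrings_fromfunction(n, x):
    # Index-array version: entry (i, j) of the result is x[i + j].
    x = np.asarray(x)
    return np.fromfunction(lambda i, j: x[i + j], (len(x) - n + 1, n), dtype=int)


def substrings_slices(n, x):
    # Plain Python loop over slices, converted to an array once at the end.
    return np.array([x[j:j + n] for j in range(len(x) - n + 1)])


data = np.arange(10_000)   # hypothetical test input
repeats = 20               # arbitrary repeat count

for func in (substrings_fromfunction, substrings_slices):
    seconds = timeit.timeit(lambda: func(3, data), number=repeats)
    print(f"{func.__name__}: {seconds / repeats:.6f} s per call")
```

Running the same comparison on arrays of your real size is more informative than any single number reported here.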

Answer 1 (score: 1)

No loops? Okay, we'll use recursion:

def substrings(n, x):
  if len(x) < n:
    return []

  return [x[:n]] + substrings(n, x[1:])

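You can easily get an array out of this. Do the conversion once on the final list, outside the recursion (inside it, adding a list to an array would be an elementwise addition rather than a concatenation):

result = numpy.array(substrings(n, x))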

Be warned: the recursion goes one level deeper for every window, so if the arrays are very large you'll exceed Python's maximum recursion depth and the call will fail.
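The limit is easy to hit in practice. The following sketch reproduces the recursive function, feeds it an input slightly longer than the interpreter's recursion limit, and compares it with a non-recursive equivalent. The name substrings_iterative and the input length are assumptions made for this illustration; RecursionError is the exception raised by Python 3.5 and later.

```python
import sys


def substrings(n, x):
    # Recursive version from this answer: one stack frame per window.
    if len(x) < n:
        return []
    return [x[:n]] + substrings(n, x[1:])


def substrings_iterative(n, x):
    # Same windows built without recursion, so the input length is not
    # bounded by the interpreter's recursion limit.
    return [x[j:j + n] for j in range(len(x) - n + 1)]


# An input slightly longer than the recursion limit is enough to fail.
long_input = list(range(sys.getrecursionlimit() + 10))

try:
    substrings(3, long_input)
    overflowed = False
except RecursionError:
    overflowed = True

print("recursion limit:", sys.getrecursionlimit())
print("recursive version overflowed:", overflowed)
print("iterative windows:", len(substrings_iterative(3, long_input)))
```

Raising the limit with sys.setrecursionlimit only postpones the problem, and each recursive call also copies the remaining list, so the recursive form is best kept for short inputs.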

Answer 2 (score: 0)

Here's another way to do it without writing loops into your code. Initialize a 3-dimensional array with True values in the diagonal plane i == j + k, and take the matrix-vector product with the array.

from numpy import *

def substrings(n, x):
  A = fromfunction(lambda k, j, i: i == j + k,
                   (len(x) - n + 1, n, len(x)))
  return A.dot(x)

This also suffers from some performance issues, but you might be able to make it better by using one of scipy's sparse matrix classes instead of the dense one provided by numpy.
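To make the idea concrete, here is a small usage sketch of the dense version on numeric data. Two caveats are worth stating as assumptions of this example: because the result comes from a product-and-sum, it only makes sense for numeric arrays (not for the strings used in the question), and the np.asarray call is added here so that plain lists are accepted too.

```python
import numpy as np


def substrings(n, x):
    # A[k, j, i] is True exactly where i == j + k, so summing A[k, j, :] * x
    # over i leaves the single element x[j + k].
    x = np.asarray(x)
    A = np.fromfunction(lambda k, j, i: i == j + k,
                        (len(x) - n + 1, n, len(x)))
    return A.dot(x)


x = np.array([10, 20, 30, 40, 50])
windows = substrings(3, x)
print(windows)

# The mask holds (len(x) - n + 1) * n * len(x) entries, which is where the
# performance cost comes from; here that is 3 * 3 * 5 = 45 booleans.
print(3 * 3 * 5)
```

The mask grows roughly with the square of the input length for fixed n, so this approach is mainly of interest for small arrays.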

Answer 3 (score: 0)

Since loops are back on the table:

>>> n = 3
>>> 
>>> array = ['a', 'b', 'c', 'd', 'e']
>>> 
>>> permutations = (array[j:j + n] for j in range(0, len(array) - (n - 1)))
>>> 
>>> list(permutations)
[['a', 'b', 'c'], ['b', 'c', 'd'], ['c', 'd', 'e']]

A more compact variation on this theme:

permutations = (array[j:][0:n] for j in range(len(array) - n + 1))
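Since the question asks for a 2D numpy array, the slices can be collected and converted in one step. This is a minimal sketch under the assumption that every window has the same length n; the name windows is chosen for this example only.

```python
import numpy as np

n = 3
array = ['a', 'b', 'c', 'd', 'e']

# Build a list of slices first: np.array does not iterate over a generator,
# so a list comprehension is used instead of a generator expression.
windows = np.array([array[j:j + n] for j in range(len(array) - n + 1)])

print(windows.shape)
print(windows)
```

The conversion copies every window, so the result uses about n times the memory of the input.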

Answer 4 (score: 0)

My (explicit) loop solution:

>>> n = 3
>>> 
>>> array = ['a', 'b', 'c', 'd', 'e']
>>> 
>>> permutations = zip(*map(lambda x: array[x:], range(n)))
>>> 
>>> list(permutations)
[('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', 'e')]
>>>
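The zip call pairs up n shifted copies of the list and stops at the shortest one. A possible numpy analogue of the same idea, offered as a sketch rather than as this answer's method, stacks n shifted slices as columns; the names arr, rows, and windows are assumptions made for this example.

```python
import numpy as np

n = 3
arr = np.array(['a', 'b', 'c', 'd', 'e'])
rows = len(arr) - n + 1  # number of complete windows

# Column k is the input shifted by k positions, trimmed to `rows` entries
# so that all columns have equal length (zip does this trimming implicitly).
windows = np.column_stack([arr[k:k + rows] for k in range(n)])

print(windows)
```

Only n slices are created regardless of the input length, so the Python-level loop stays short when n is small.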