Question

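I am trying to calculate the origins and offsets for arrays of variable size and store them in a dictionary. Below is the probably non-pythonic way I am doing this. I am not sure whether I should be using map, a lambda function, or list comprehensions to make the code more pythonic.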

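Essentially, I need to cut the array into chunks based on its total size and store xstart, ystart, x_number_of_rows_to_read, and y_number_of_columns_to_read in a dictionary. The total size is variable. I cannot load the entire array into memory and use numpy indexing, or I definitely would. The origins and offsets are used to read the array into numpy piece by piece.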

intervalx = xsize / xsegment #Get the size of the chunks
intervaly = ysize / ysegment #Get the size of the chunks

#Setup to segment the image storing the start values and key into a dictionary.
xstart = 0
ystart = 0
key = 0

d = defaultdict(list)

for y in xrange(0, ysize, intervaly):
    if y + (intervaly * 2) < ysize:
        numberofrows = intervaly
    else:
        numberofrows = ysize - y

    for x in xrange(0, xsize, intervalx):
        if x + (intervalx * 2) < xsize:
            numberofcolumns = intervalx

        else:
            numberofcolumns = xsize - x
        l = [x,y,numberofcolumns, numberofrows]
        d[key].append(l)
        key += 1
return d

I realize that xrange is not ideal for a port to Python 3.

Answer 1

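This code looks fine except for the use of defaultdict. A list seems like a better data structure because:

Your keys are sequential.
You are storing a list whose only element is another list in your dict.

One thing you could do:

Use the ternary operator (I'm not sure this is an improvement, but it takes fewer lines of code).

Here is a modified version of your code with a few of these suggestions applied.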

intervalx = xsize / xsegment #Get the size of the chunks
intervaly = ysize / ysegment #Get the size of the chunks

#Setup to segment the image storing the start values and key into a dictionary.
xstart = 0
ystart = 0

output = []

for y in xrange(0, ysize, intervaly):
    numberofrows = intervaly if y + (intervaly * 2) < ysize else ysize -y
    for x in xrange(0, xsize, intervalx):
        numberofcolumns = intervalx if x + (intervalx * 2) < xsize else xsize -x
        lst = [x, y, numberofcolumns, numberofrows]
        output.append(lst)

        #If it doesn't make any difference to your program, the above 2 lines could read:
        #tple = (x, y, numberofcolumns, numberofrows)
        #output.append(tple)

        #This will be slightly more efficient 
        #(tuple creation is faster than list creation)
        #and less memory hungry.  In other words, if it doesn't need to be a list due
        #to other constraints (e.g. you append to it later), you should make it a tuple.

Now, to get at your data, you can write offset_list = output[5] instead of offset_list = d[5][0].

Answer 2

Although it won't change your algorithm, a more pythonic way to write your if/else statements is:

numberofrows = intervaly if y + intervaly * 2 < ysize else ysize - y

rather than:

if y + (intervaly * 2) < ysize:
    numberofrows = intervaly
else:
    numberofrows = ysize - y

(and likewise for the other if/else statement).

Answer 3

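Have you considered using np.memmap to load the pieces dynamically? You would then only need to work out the offsets you need on the fly, rather than chunking the array and storing the offsets.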

http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html
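
A minimal sketch of that idea, assuming the data sits on disk as a raw binary file with a known dtype and shape (np.memmap cannot decode image formats with headers or compression). The file name, the sizes, and the window values are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical raster: 7 rows x 10 columns of int32, written as raw bytes.
ysize, xsize = 7, 10
data = np.arange(ysize * xsize, dtype=np.int32).reshape(ysize, xsize)
path = os.path.join(tempfile.mkdtemp(), "raster.dat")
data.tofile(path)

# Map the file instead of reading it; dtype and shape must match how it was written.
mm = np.memmap(path, dtype=np.int32, mode="r", shape=(ysize, xsize))

# Pull one window on demand: xstart, ystart, number of columns, number of rows.
x, y, ncols, nrows = 4, 3, 4, 3
chunk = np.array(mm[y:y + nrows, x:x + ncols])  # copies only this window into memory

print(chunk.shape)
```

Slicing the memmap only touches the part of the file behind the requested window, so the offsets can be computed when needed instead of being stored up front.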

Answer 4

Here is a long one-liner:

d = [(x,y,min(x+xinterval,xsize)-x,min(y+yinterval,ysize)-y) for x in 
xrange(0,xsize,xinterval) for y in xrange(0,ysize,yinterval)]
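
For Python 3, the same idea could be sketched as below. The function name chunk_windows and the example sizes are hypothetical. The loops are ordered row by row to match the question, and min(xinterval, xsize - x) is equivalent to min(x + xinterval, xsize) - x. Note that, unlike the question's code, a short remainder becomes its own smaller chunk instead of being merged into the previous one.

```python
def chunk_windows(xsize, ysize, xinterval, yinterval):
    """Return (xstart, ystart, ncols, nrows) for each chunk, row by row."""
    return [
        (x, y, min(xinterval, xsize - x), min(yinterval, ysize - y))
        for y in range(0, ysize, yinterval)   # range replaces Python 2's xrange
        for x in range(0, xsize, xinterval)
    ]


# Example: a 10-column by 7-row array cut into windows of at most 4 x 3.
windows = chunk_windows(xsize=10, ysize=7, xinterval=4, yinterval=3)
for w in windows:
    print(w)
```

With these sizes the last column of chunks is 2 wide and the last row of chunks is 1 high, and the chunk areas add up to the full 70 cells with no overlap.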

Pythonic way to calculate the offsets of an array
