Question

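My goal is to split an array into chunks and iterate over those chunks in a for loop. While looping, I also want to print the percentage of the data I have processed so far (in practice I will make a request on each iteration, so the loop will take a long time...).

Here is the code: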

# Function to chunk the data
def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

# Init data and chunks
d = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunked = chunker(d, 2)

i = 1
for chunk in chunked:
    print(i)
    print(str((i / len(list(chunked)) * 100)) + '%')
    i += 1
    print('end')

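If you run this code, the loop runs only once.

However, if you comment out or remove the print(str((i / len(list(chunked)) * 100)) + '%') statement inside the loop, it runs 5 times, which is the expected behavior.

Why does my print statement cause the loop to exit?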

Answer 1

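chunked is a generator, not a list, so you can only iterate over it once. When you call list(chunked), it consumes the rest of the generator, so the for loop has nothing left to iterate over.

Also, len(list(chunked)) will be 1 smaller than you expect, because it does not include the chunk the loop is currently on.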
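To make the exhaustion visible, here is a minimal sketch that performs the loop's first step by hand with next() and then calls list() on the same generator. The names first and remaining are only for illustration.

```python
def chunker(seq, size):
    # Generator expression: chunks are produced lazily and only once
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

chunked = chunker([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 2)

# This is what the for loop does on its first iteration
first = next(chunked)

# list() drains every chunk that is left in the generator
remaining = list(chunked)

print(first)           # [1, 2]
print(len(remaining))  # 4, not 5: the current chunk is already gone
print(list(chunked))   # [] because the generator is now exhausted
```

This is why the original loop prints 25.0% once and then stops: the count is 4 on the first pass, and there is nothing left for a second pass.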

Change chunker to use a list comprehension instead of returning a generator.

def chunker(seq, size):
    return [seq[pos:pos + size] for pos in range(0, len(seq), size)]
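With the list-based chunker, the loop from the question could be rewritten roughly as follows. This is a sketch rather than the only way to do it: enumerate replaces the manual counter, and the total and percents names are assumptions added for illustration.

```python
def chunker(seq, size):
    # List comprehension: all chunks exist up front and can be counted
    return [seq[pos:pos + size] for pos in range(0, len(seq), size)]

d = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunked = chunker(d, 2)

# Compute the length once, outside the loop
total = len(chunked)

percents = []
for i, chunk in enumerate(chunked, start=1):
    percent = 100 * i / total
    percents.append(percent)
    print(f"chunk {i} of {total}: {percent:.0f}%")
```

Calling len() on a list does not consume it, so the loop runs all 5 times.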

Answer 2

You call list(chunked) inside the print() call, which exhausts the chunked generator. for chunk in chunked then has no next item to fetch, so it exits.
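If you would rather keep the chunks lazy, one option is to work out the number of chunks from the lengths instead of consuming the generator. The sketch below assumes the input supports len(); total and seen are names introduced here for illustration.

```python
import math

def chunker(seq, size):
    # Unchanged generator version from the question
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

d = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
size = 2

# Number of chunks, computed without touching the generator
total = math.ceil(len(d) / size)

seen = []
for i, chunk in enumerate(chunker(d, size), start=1):
    percent = 100 * i / total
    seen.append((chunk, percent))
    print(f"{percent:.0f}%")
```

The generator is only ever consumed by the for loop itself, so every chunk is visited.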

Answer 3

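It seems you are also interested in displaying a progress bar. If you convert the generator to a list before iterating, you can use a library called tqdm. Make sure not to print anything inside the for loop, or the extra output will break up the progress bar in the terminal.

Run this code after $ pip install tqdm: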

from tqdm import tqdm
import time

def chunker(seq, size):
    return list(seq[pos:pos + size] for pos in range(0, len(seq), size))

# Init data and chunks
d = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunked = chunker(d, 2)
print(chunked)

for chunk in tqdm(chunked, desc='Iterating on chunked data'):
    time.sleep(.5)

Output:

[[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
Iterating on chunked data: 100%|██████████████████████████| 5/5 [00:02<00:00,  1.96it/s]

Print statement is exiting the for loop
