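A faster alternative to a Python for loop

Date: 2021-06-08 01:18:47

Tags: python python-3.x loops for-loop

The following code does exactly what I want; however, the for loop is far too slow. On my machine, the wall time for the for loop is 1 min 5 s. I'm looking for a faster alternative to the for loop.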

# Imports
from sympy.solvers.solveset import solveset_real
from sympy import Symbol, Eq

# Define variables
initial_value = 1
rate = Symbol('r')
decay_obs_window = 1480346
target_decay = .15

# Solver to calculate decay rate
decay_rate = solveset_real(Eq((initial_value - rate * decay_obs_window), target_decay), rate).args[0]

# Generate weights
weights = []
for i in range(5723673):
    # How to handle data BEYOND decay_obs_window
    if i > decay_obs_window and target_decay == 0:
        # Record a weight of zero
        weights.append(0)
    elif i > decay_obs_window and target_decay > 0:
        # Record the final target weight
        weights.append(decayed_weight)
    # How to handle data WITHIN decay_obs_window
    else:
        # Calculate the new slightly decayed weight
        decayed_weight = 1 - (decay_rate * i)
        weights.append(decayed_weight)

weights[0:10]

I wrote this list comprehension hoping to improve the execution time. It works fine, but it gives no noticeable runtime improvement over the for loop:

weights = [0 if i > decay_obs_window and target_decay == 0 else decayed_weight if i > decay_obs_window and target_decay > 0 else (decayed_weight := 1 - (decay_rate * i)) for i in range(len(weights_df))]

I'm interested in any approach that helps speed this up. Thanks!


Final solution:

This is the solution I settled on. On my machine, the whole thing runs in just 425 ms. It is a slightly modified version of the solution Aaron proposed.

import numpy as np
from sympy.solvers.solveset import solveset_real
from sympy import Symbol, Eq

# Define variables
initial_value = 1
rate = Symbol('r')
decay_obs_window = 1480346
target_decay = .15

# Instantiate weights array
weights = np.zeros(5723673)

# Solver to calculate decay rate
decay_rate = solveset_real(Eq((initial_value - rate * decay_obs_window), target_decay), rate).args[0]

# Fix a bug where numpy doesn't like sympy floats :(
decay_rate = float(decay_rate)

# How to weight observations WITHIN decay_obs_window
weights[:decay_obs_window + 1] = 1 - np.arange(decay_obs_window + 1) * decay_rate

# How to weight observations BEYOND decay_obs_window
weights[decay_obs_window + 1 : 5723673] = target_decay

weights
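
A quick way to gain confidence in a rewrite like this is to compare it against the original loop on a small input. The sketch below is not part of the original post: the function names and the small sizes are made up for illustration, and the decay rate is computed by hand as (1 - target) / window instead of with sympy, which works only because the equation is linear. It covers the target_decay >= 0 cases only, as the final solution does.

```python
import numpy as np


def weights_loop(total, window, target):
    """Reference: the loop from the question, using a plain-float decay rate."""
    rate = (1 - target) / window  # hand-solved form of 1 - rate * window == target
    weights = []
    decayed = 1.0
    for i in range(total):
        if i > window and target == 0:
            weights.append(0.0)
        elif i > window and target > 0:
            weights.append(decayed)  # repeat the last in-window weight
        else:
            decayed = 1 - rate * i
            weights.append(decayed)
    return weights


def weights_vectorized(total, window, target):
    """Vectorized version following the final solution (target >= 0 only)."""
    rate = (1 - target) / window
    weights = np.zeros(total)
    weights[:window + 1] = 1 - np.arange(window + 1) * rate
    weights[window + 1:] = target
    return weights


# Hypothetical small sizes so the check runs instantly
for target in (0.15, 0.0):
    assert np.allclose(weights_loop(50, 10, target), weights_vectorized(50, 10, target))
```

If the assertion passes on small inputs, the same comparison can be repeated once at full size before dropping the loop.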

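3 answers:

Answer 0: (score: 1)

TL;DR: none of the variables you test in the if statements change during the loop, so you can easily move the conditional logic out of the loop and decide ahead of time. I'm also a big proponent of numpy and vectorization.

Looking at the logic, there aren't many possible outcomes for what weights ends up looking like. As RufusVS mentioned, you can split off the first part, where no extra logic is evaluated. It is also a simple linear function, so why not compute it with numpy, which is great for linear algebra: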

import numpy as np
weights = np.zeros(5723673)
#Fix a bug where numpy doesn't like sympy floats :(
decay_rate = float(decay_rate)
weights[:decay_obs_window + 1] = 1 - np.arange(decay_obs_window + 1) * decay_rate

You can then decide how to handle the remaining values outside of any loop, based on the value of target_decay, since it never changes:

if target_decay == 0:
    pass #weights array started out filled with 0's so we don't need to do anything
elif target_decay > 0:
    #fill the rest of the array with the last value of the window
    weights[decay_obs_window + 1 : 5723673] = weights[decay_obs_window]
else: #target_decay < 0:
    #continue calculating the linear function
    weights[decay_obs_window + 1 : 5723673] = 1 - np.arange(decay_obs_window + 1, 5723673) * decay_rate
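
The branches above can be folded into a small helper. This is a sketch rather than the answerer's code: the function name linear_decay_weights and its parameters are hypothetical, and the decay rate is computed directly as (initial - target) / window instead of through sympy.

```python
import numpy as np


def linear_decay_weights(total, window, target, initial=1.0):
    """Linear decay from `initial` to `target` over `window` steps (hypothetical helper)."""
    rate = (initial - target) / window
    # Evaluate the linear function everywhere; this is already right for target < 0
    weights = initial - rate * np.arange(total)
    if target == 0:
        weights[window + 1:] = 0.0
    elif target > 0:
        # Hold the last in-window value for the rest of the array
        weights[window + 1:] = weights[window]
    return weights


w = linear_decay_weights(total=20, window=5, target=0.15)
print(w[:7])
```

Computing the line over the whole array and then overwriting the tail does a little redundant work, but it keeps all three cases in a few lines with no Python-level loop.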

Answer 1: (score: 0)

By splitting it into two loops, you eliminate a lot of comparisons against the breakpoint:

# Imports
from sympy.solvers.solveset import solveset_real
from sympy import Symbol, Eq

# Define variables
initial_value = 1
rate = Symbol('r')
decay_obs_window = 1480346
target_decay = .15

# Solver to calculate decay rate
decay_rate = solveset_real(Eq((initial_value - rate * decay_obs_window), target_decay), rate).args[0]

# Generate weights
weights = []

for i in range(decay_obs_window+1):
    # Calculate the new slightly decayed weight
    decayed_weight = 1 - (decay_rate * i)
    weights.append(decayed_weight)

for i in range(decay_obs_window+1, 5723673):
    # How to handle data BEYOND decay_obs_window
    if target_decay == 0:
        weights.append(0)
    elif target_decay > 0:
        # Record the final target weight
        weights.append(decayed_weight)
    else:
        # Calculate the new slightly decayed weight
        decayed_weight = 1 - (decay_rate * i)
        weights.append(decayed_weight)

weights[0:10]

Edited to incorporate @MarkSouls's comment, plus some further observations of my own:

# Imports
from sympy.solvers.solveset import solveset_real
from sympy import Symbol, Eq

# Define variables
initial_value = 1
rate = Symbol('r')
decay_obs_window = 1480346
target_decay = .15

# Solver to calculate decay rate
decay_rate = solveset_real(Eq((initial_value - rate * decay_obs_window), target_decay), rate).args[0]

TOTAL_ENTRIES = 5723673 
# Generate weights
weights = [0]* TOTAL_ENTRIES

for i in range(decay_obs_window+1):
    # Calculate the new slightly decayed weight
    decayed_weight = 1 - (decay_rate * i)
    weights[i]=decayed_weight

if target_decay == 0:
    pass
elif target_decay > 0:
    for i in range(decay_obs_window+1, TOTAL_ENTRIES):
        # Record the final target weight
        weights[i]=decayed_weight
else:
    for i in range(decay_obs_window+1, TOTAL_ENTRIES):
        decayed_weight = 1 - (decay_rate * i)
        weights[i]=decayed_weight

weights[0:10]

Answer 2: (score: 0)

I think there is an absolutely optimal approach here:

# Imports
from sympy.solvers.solveset import solveset_real
from sympy import Symbol, Eq

# Define variables
initial_value = 1
rate = Symbol('r')
decay_obs_window = 1480346
target_decay = .15

# Solver to calculate decay rate
decay_rate = solveset_real(Eq((initial_value - rate * decay_obs_window), target_decay), rate).args[0]

# Generate weights
wLength = 5723673
weights = [1 - (decay_rate * i) for i in range(decay_obs_window + 1)]
extend_length = wLength - decay_obs_window - 1
if target_decay == 0:
    weights.extend(0 for _ in range(extend_length))
elif target_decay > 0:
    decayed_weight = weights[-1]
    weights.extend(decayed_weight for _ in range(extend_length))

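This takes all the branching logic out of the loop, so it is evaluated only once rather than once per iteration (about 5.7 million times).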

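That said, this still barely improves speed compared with what you already have. The fact is that most of your time is spent computing 1 - (decay_rate * i), and you can't speed that up in Python.

If you really need more performance, you may need to figure out how to call into a C (or Rust) library.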
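
Before reaching for a compiled library, one thing may be worth checking: solveset_real returns a sympy number, so each 1 - (decay_rate * i) in the loop is likely going through sympy arithmetic rather than native float arithmetic. The sketch below assumes that is where the time goes (it is not measured here) and sidesteps sympy by solving the linear equation by hand; the names build_weights and total are illustrative only.

```python
def build_weights(total, window, target, initial=1.0):
    """Pure-Python weights using a native float decay rate (illustrative sketch)."""
    # Solving initial - rate * window == target for rate by hand
    rate = (initial - target) / window
    weights = [initial - rate * i for i in range(window + 1)]
    tail_length = total - window - 1
    if target == 0:
        weights.extend([0.0] * tail_length)
    elif target > 0:
        # List repetition avoids a Python-level loop for the constant tail
        weights.extend([weights[-1]] * tail_length)
    else:
        # Negative target: keep following the same line past the window
        weights.extend(initial - rate * i for i in range(window + 1, total))
    return weights


weights = build_weights(total=5723673, window=1480346, target=0.15)
print(len(weights), weights[:3])
```

If the rate has to come from sympy, converting it once with float(decay_rate) before the loop, as the asker's final solution does, should have the same effect.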

Numpy is a great fit for this. We can use its fromfunction function to create an array. First import the function:

from numpy import fromfunction

Then replace

weights = [1 - (decay_rate * i) for i in range(decay_obs_window + 1)]

with

weights = fromfunction(lambda i: 1 - (decay_rate * i), (decay_obs_window + 1, )).tolist()

This is probably about as fast as you can get in Python.
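
For what it's worth, np.fromfunction does not call the lambda once per index; it calls it a single time with an array of indices, so the expression is evaluated as one vectorized operation. A small sketch showing that it matches the np.arange form, assuming decay_rate has already been converted to a plain float (the values below are made up):

```python
import numpy as np

decay_rate = 0.085  # hypothetical plain-float rate, not the value from the question
n = 11              # hypothetical small length

# fromfunction passes an index array to the lambda in a single call
via_fromfunction = np.fromfunction(lambda i: 1 - (decay_rate * i), (n,))

# The same values built directly from an index range
via_arange = 1 - decay_rate * np.arange(n)

print(np.allclose(via_fromfunction, via_arange))
```

Since both forms do the same array arithmetic, the choice between them is mostly a matter of readability.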
