Does a Python script slow down as it progresses?

Time: 2014-02-23 18:28:55

Tags: python performance iteration

I have a simulation running that has this basic structure:

from time import time

def CSV(*args):
    # write *args to a .CSV file
    return

def timeleft(a, L, period):
    # print details on how long the last period took, and the ETA
    return

for L in range(0, 6, 2):
    for a in range(1, 100):
        timeA = time()

        for t in range(1, 1000):

            ## Manufacturer in Supply Chain ##

            inventory_accounting_lists.append(...)  # simple calculations

            # Simulation to determine the optimal B-value (Basestock level)
            for B in range(1, 100):
                for tau in range(1, 1000):
                    pass  # simple inventory accounting operations

            ## Distributor in Supply Chain ##

            inventory_accounting_lists.append(...)  # simple calculations

            # Simulation to determine the optimal B-value (Basestock level)
            for B in range(1, 100):
                for tau in range(1, 1000):
                    pass  # simple inventory accounting operations

            ## Wholesaler in Supply Chain ##

            inventory_accounting_lists.append(...)  # simple calculations

            # Simulation to determine the optimal B-value (Basestock level)
            for B in range(1, 100):
                for tau in range(1, 1000):
                    pass  # simple inventory accounting operations

            ## Retailer in Supply Chain ##

            inventory_accounting_lists.append(...)  # simple calculations

            # Simulation to determine the optimal B-value (Basestock level)
            for B in range(1, 100):
                for tau in range(1, 1000):
                    pass  # simple inventory accounting operations

        CSV(Simulation_Results)

        timeB = time()

        timeleft(a, L, timeB - timeA)

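As the script continues, it seems to get slower and slower. Here is how long one period takes for these values (and the time increases linearly as a increases):

  • L = 0, a = 1: 1.15 minutes
  • L = 0, a = 99: 1.7 minutes
  • L = 2, a = 1: 2.7 minutes
  • L = 2, a = 99: 5.15 minutes
  • L = 4, a = 1: 4.5 minutes
  • L = 4, a = 15: 4.95 minutes (this is the latest value it has reached)

Why would each iteration take longer? Each iteration of the loop essentially resets everything except a master global list, which is appended to every time. However, the loops inside each "period" do not access this master list; they access the same local lists every time.

Edit 1: I will post the simulation code here in case anyone wants to wade through it, but I warn you, it is rather long and the variable names are probably unnecessarily confusing.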

#########
a = 0.01
L = 0
total = 1000
sim = 500
inv_cost = 1
bl_cost = 4
#########

# Functions

import random
from time import time
time0 = time()

# function to report ETA etc.

def timeleft(a,L,period_time):
    if L==0:
        periods_left = ((1-a)*100)-1+2*99
    if L==2:
        periods_left = ((1-a)*100)-1+99
    if L==4:
        periods_left = ((1-a)*100)-1+0*99

    minute_time = period_time/60

    minutes_left = (periods_left*period_time)/60
    hours_left = (periods_left*period_time)/3600
    percentage_complete = 100*((297-periods_left)/297)

    print("Time for last period = ","%.2f" % minute_time," minutes")

    print("%.2f" % percentage_complete,"% complete")
    if hours_left<1:
        print("%.2f" % minutes_left," minutes left")
    else:
        print("%.2f" % hours_left," hours left")
    print("")
    return

def dcopy(inList):
    if isinstance(inList, list):
        return list( map(dcopy, inList) )
    return inList

# Save values to .CSV file

def CSV(a,L,I_STD_1,I_STD_2,I_STD_3,I_STD_4,O_STD_0,
        O_STD_1,O_STD_2,O_STD_3,O_STD_4):

    pass

# Initialization

# These are the global, master lists of data

I_STD_1 = [[0],[0],[0]]
I_STD_2 = [[0],[0],[0]]
I_STD_3 = [[0],[0],[0]]
I_STD_4 = [[0],[0],[0]]

O_STD_0 = [[0],[0],[0]]
O_STD_1 = [[0],[0],[0]]
O_STD_2 = [[0],[0],[0]]
O_STD_3 = [[0],[0],[0]]
O_STD_4 = [[0],[0],[0]]

for L in range(0,6,2):

    # These are local lists that are appended to at the end of every period

    I_STD_1_L = []
    I_STD_2_L = []
    I_STD_3_L = []
    I_STD_4_L = []

    O_STD_0_L = []
    O_STD_1_L = []
    O_STD_2_L = []
    O_STD_3_L = []
    O_STD_4_L = []

    test = []

    for n in range(1,100):          # THIS is the start of the 99 value loop

        a = n/100

        print ("L=",L,", alpha=",a)

        # Initialization for each Period

        F_1 = [0,10]            # Forecast
        F_2 = [0,10]
        F_3 = [0,10]
        F_4 = [0,10]

        R_0 = [10]              # Items Received
        R_1 = [10]
        R_2 = [10]
        R_3 = [10]
        R_4 = [10]

        for i in range(L):
            R_1.append(10)
            R_2.append(10)
            R_3.append(10)
            R_4.append(10)

        I_1 = [10]              # Final Inventory
        I_2 = [10]
        I_3 = [10]
        I_4 = [10]

        IP_1 = [10+10*L]        # Inventory Position
        IP_2 = [10+10*L]
        IP_3 = [10+10*L]
        IP_4 = [10+10*L]

        O_1 = [10]              # Items Ordered
        O_2 = [10]
        O_3 = [10]
        O_4 = [10]

        BL_1 = [0]              # Backlog
        BL_2 = [0]
        BL_3 = [0]
        BL_4 = [0]

        OH_1 = [20]             # Items on Hand
        OH_2 = [20]
        OH_3 = [20]
        OH_4 = [20]

        OR_1 = [10]             # Order received from customer
        OR_2 = [10]
        OR_3 = [10]
        OR_4 = [10]

        Db_1 = [10]             # Running Average Demand
        Db_2 = [10]
        Db_3 = [10]
        Db_4 = [10]

        var_1 = [0]             # Running Variance in Demand
        var_2 = [0]
        var_3 = [0]
        var_4 = [0]

        B_1 = [IP_1[0]+10]      # Optimal Basestock
        B_2 = [IP_2[0]+10]
        B_3 = [IP_3[0]+10]
        B_4 = [IP_4[0]+10]

        D = [0,10]              # End customer demand

        for i in range(total+1):
            D.append(9)
            D.append(12)
            D.append(8)
            D.append(11)

        period = [0]

        timeA = time()

        # 1000 time periods t

        for t in range(1,total+1):

            period.append(t)


            #### MANUFACTURER ####

            # Manufacturing order from previous time period put into production
            R_4.append(O_4[t-1])

            # Receive shipment from supplier, calculate items on hand
            if I_4[t-1]<0:
                OH_4.append(R_4[t])
            else:
                OH_4.append(I_4[t-1]+R_4[t])

            # Receive and dispatch order, update Inventory and Backlog for time t

            if (O_3[t-1] + BL_4[t-1]) <= OH_4[t]:               # No Backlog
                I_4.append(OH_4[t] - (O_3[t-1] + BL_4[t-1]))
                BL_4.append(0)
                R_3.append(O_3[t-1]+BL_4[t-1])
            else:
                I_4.append(OH_4[t] - (O_3[t-1] + BL_4[t-1]))    # Backlogged
                BL_4.append(-I_4[t])
                R_3.append(OH_4[t])

            # Update Inventory Position
            IP_4.append(IP_4[t-1] + O_4[t-1] - O_3[t-1])

            # Use exponential smoothing to forecast future demand
            future_demand = (1-a)*F_4[t] + a*O_3[t-1]
            F_4.append(future_demand)

            # Calculate D_bar(t) and Var(t)
            Db_4.append((1/t)*sum(O_3[0:t]))
            s = 0
            for i in range(0,t):
                s+=(O_3[i]-Db_4[t])**2

            if t==1:
                var_4.append(0)                                 # var(1) = 0
            else:
                var_4.append((1/(t-1))*s)

            # Simulation to determine B(t)
            S_BC_4 = [10000000000]*10
            Run_4 = [0]*10
            for B in range(10,500):

                S_OH_4 = OH_4[:]
                S_I_4 = I_4[:]
                S_R_4 = R_4[:]
                S_BL_4 = BL_4[:]
                S_IP_4 = IP_4[:]
                S_O_4 = O_4[:]

                # Update O(t)(the period just before the simulation begins)
                # using the B value for the simulation
                if B - S_IP_4[t] > 0:              
                    S_O_4.append(B - S_IP_4[t])
                else:
                    S_O_4.append(0)

                c = 0

                for i in range(t+1,t+sim+1):

                    S_R_4.append(S_O_4[i-1])

                    #simulate demand
                    demand = -1
                    while demand <0:
                        demand = random.normalvariate(F_4[t+1],(var_4[t])**(.5))

                    # Receive simulated shipment, calculate simulated items on hand

                    if S_I_4[i-1]<0:
                        S_OH_4.append(S_R_4[i])
                    else:
                        S_OH_4.append(S_I_4[i-1]+S_R_4[i])

                    # Receive and send order, update Inventory and Backlog (simulated)

                    owed = (demand + S_BL_4[i-1])
                    S_I_4.append(S_OH_4[i] - owed)
                    if owed <= S_OH_4[i]:                               # No Backlog
                        S_BL_4.append(0)
                        c += inv_cost*S_I_4[i]
                    else:
                        S_BL_4.append(-S_I_4[i])                        # Backlogged
                        c += bl_cost*S_BL_4[i]

                    # Update Inventory Position
                    S_IP_4.append(S_IP_4[i-1] + S_O_4[i-1] - demand)

                    # Update Order, Upstream member dispatches goods
                    if (B-S_IP_4[i]) > 0:
                        S_O_4.append(B - S_IP_4[i])
                    else:
                        S_O_4.append(0)

                # Log Simulation costs for that B-value
                S_BC_4.append(c)

                # If the simulated costs are increasing, stop
                if B>11:
                    dummy = []

                    for i in range(0,10):
                        dummy.append(S_BC_4[B-i]-S_BC_4[B-i-1])
                    Run_4.append(sum(dummy)/float(len(dummy)))

                    if Run_4[B-3] > 0 and B>20:
                        break
                else:
                    Run_4.append(0)

            # Use minimum cost as new B(t)
            var = min((val, idx) for (idx, val) in enumerate(S_BC_4))
            optimal_B = var[1]
            B_4.append(optimal_B)

            # Calculate O(t)
            if B_4[t] - IP_4[t] > 0:
                O_4.append(B_4[t] - IP_4[t])
            else:
                O_4.append(0)




            #### DISTRIBUTOR ####

            # Receive shipment from supplier, calculate items on hand
            if I_3[t-1]<0:
                OH_3.append(R_3[t])
            else:
                OH_3.append(I_3[t-1]+R_3[t])

            # Receive and dispatch order, update Inventory and Backlog for time t

            if (O_2[t-1] + BL_3[t-1]) <= OH_3[t]:               # No Backlog
                I_3.append(OH_3[t] - (O_2[t-1] + BL_3[t-1]))
                BL_3.append(0)
                R_2.append(O_2[t-1]+BL_3[t-1])
            else:
                I_3.append(OH_3[t] - (O_2[t-1] + BL_3[t-1]))    # Backlogged
                BL_3.append(-I_3[t])
                R_2.append(OH_3[t])

            # Update Inventory Position
            IP_3.append(IP_3[t-1] + O_3[t-1] - O_2[t-1])

            # Use exponential smoothing to forecast future demand
            future_demand = (1-a)*F_3[t] + a*O_2[t-1]
            F_3.append(future_demand)

            # Calculate D_bar(t) and Var(t)
            Db_3.append((1/t)*sum(O_2[0:t]))
            s = 0
            for i in range(0,t):
                s+=(O_2[i]-Db_3[t])**2

            if t==1:
                var_3.append(0)                                 # var(1) = 0
            else:
                var_3.append((1/(t-1))*s)

            # Simulation to determine B(t)
            S_BC_3 = [10000000000]*10
            Run_3 = [0]*10

            for B in range(10,500):
                S_OH_3 = OH_3[:]
                S_I_3 = I_3[:]
                S_R_3 = R_3[:]
                S_BL_3 = BL_3[:]
                S_IP_3 = IP_3[:]
                S_O_3 = O_3[:]

                # Update O(t)(the period just before the simulation begins)
                # using the B value for the simulation
                if B - S_IP_3[t] > 0:              
                    S_O_3.append(B - S_IP_3[t])
                else:
                    S_O_3.append(0)
                c = 0
                for i in range(t+1,t+sim+1):

                    #simulate demand
                    demand = -1
                    while demand <0:
                        demand = random.normalvariate(F_3[t+1],(var_3[t])**(.5))

                    S_R_3.append(S_O_3[i-1])

                    # Receive simulated shipment, calculate simulated items on hand
                    if S_I_3[i-1]<0:
                        S_OH_3.append(S_R_3[i])
                    else:
                        S_OH_3.append(S_I_3[i-1]+S_R_3[i])

                    # Receive and send order, update Inventory and Backlog (simulated)
                    owed = (demand + S_BL_3[i-1])
                    S_I_3.append(S_OH_3[i] - owed)
                    if owed <= S_OH_3[i]:                               # No Backlog
                        S_BL_3.append(0)
                        c += inv_cost*S_I_3[i]
                    else:
                        S_BL_3.append(-S_I_3[i])                        # Backlogged
                        c += bl_cost*S_BL_3[i]

                    # Update Inventory Position
                    S_IP_3.append(S_IP_3[i-1] + S_O_3[i-1] - demand)

                    # Update Order, Upstream member dispatches goods
                    if (B-S_IP_3[i]) > 0:
                        S_O_3.append(B - S_IP_3[i])
                    else:
                        S_O_3.append(0)

                # Log Simulation costs for that B-value
                S_BC_3.append(c)

                # If the simulated costs are increasing, stop
                if B>11:
                    dummy = []

                    for i in range(0,10):
                        dummy.append(S_BC_3[B-i]-S_BC_3[B-i-1])
                    Run_3.append(sum(dummy)/float(len(dummy)))

                    if Run_3[B-3] > 0 and B>20:
                        break
                else:
                    Run_3.append(0)

            # Use minimum cost as new B(t)
            var = min((val, idx) for (idx, val) in enumerate(S_BC_3))
            optimal_B = var[1]
            B_3.append(optimal_B)

            # Calculate O(t)
            if B_3[t] - IP_3[t] > 0:
                O_3.append(B_3[t] - IP_3[t])
            else:
                O_3.append(0)



            #### WHOLESALER ####

            # Receive shipment from supplier, calculate items on hand
            if I_2[t-1]<0:
                OH_2.append(R_2[t])
            else:
                OH_2.append(I_2[t-1]+R_2[t])

            # Receive and dispatch order, update Inventory and Backlog for time t

            if (O_1[t-1] + BL_2[t-1]) <= OH_2[t]:               # No Backlog
                I_2.append(OH_2[t] - (O_1[t-1] + BL_2[t-1]))
                BL_2.append(0)
                R_1.append(O_1[t-1]+BL_2[t-1])

            else:
                I_2.append(OH_2[t] - (O_1[t-1] + BL_2[t-1]))    # Backlogged
                BL_2.append(-I_2[t])
                R_1.append(OH_2[t])

            # Update Inventory Position
            IP_2.append(IP_2[t-1] + O_2[t-1] - O_1[t-1])

            # Use exponential smoothing to forecast future demand
            future_demand = (1-a)*F_2[t] + a*O_1[t-1]
            F_2.append(future_demand)

            # Calculate D_bar(t) and Var(t)
            Db_2.append((1/t)*sum(O_1[0:t]))
            s = 0
            for i in range(0,t):
                s+=(O_1[i]-Db_2[t])**2

            if t==1:
                var_2.append(0)                                 # var(1) = 0
            else:
                var_2.append((1/(t-1))*s)

            # Simulation to determine B(t)
            S_BC_2 = [10000000000]*10
            Run_2 = [0]*10

            for B in range(10,500):
                S_OH_2 = OH_2[:]
                S_I_2 = I_2[:]
                S_R_2 = R_2[:]
                S_BL_2 = BL_2[:]
                S_IP_2 = IP_2[:]
                S_O_2 = O_2[:]

                # Update O(t)(the period just before the simulation begins)
                # using the B value for the simulation
                if B - S_IP_2[t] > 0:              
                    S_O_2.append(B - S_IP_2[t])
                else:
                    S_O_2.append(0)
                c = 0

                for i in range(t+1,t+sim+1):

                    #simulate demand
                    demand = -1
                    while demand <0:
                        demand = random.normalvariate(F_2[t+1],(var_2[t])**(.5))

                    # Receive simulated shipment, calculate simulated items on hand
                    S_R_2.append(S_O_2[i-1])

                    if S_I_2[i-1]<0:
                        S_OH_2.append(S_R_2[i])
                    else:
                        S_OH_2.append(S_I_2[i-1]+S_R_2[i])

                    # Receive and send order, update Inventory and Backlog (simulated)

                    owed = (demand + S_BL_2[i-1])
                    S_I_2.append(S_OH_2[i] - owed)
                    if owed <= S_OH_2[i]:                               # No Backlog
                        S_BL_2.append(0)
                        c += inv_cost*S_I_2[i]
                    else:
                        S_BL_2.append(-S_I_2[i])                        # Backlogged
                        c += bl_cost*S_BL_2[i]

                    # Update Inventory Position
                    S_IP_2.append(S_IP_2[i-1] + S_O_2[i-1] - demand)

                    # Update Order, Upstream member dispatches goods
                    if (B-S_IP_2[i]) > 0:
                        S_O_2.append(B - S_IP_2[i])
                    else:
                        S_O_2.append(0)

                # Log Simulation costs for that B-value
                S_BC_2.append(c)

                # If the simulated costs are increasing, stop
                if B>11:
                    dummy = []
                    for i in range(0,10):
                        dummy.append(S_BC_2[B-i]-S_BC_2[B-i-1])
                    Run_2.append(sum(dummy)/float(len(dummy)))

                    if Run_2[B-3] > 0 and B>20:
                        break
                else:
                    Run_2.append(0)

            # Use minimum cost as new B(t)
            var = min((val, idx) for (idx, val) in enumerate(S_BC_2))
            optimal_B = var[1]
            B_2.append(optimal_B)

            # Calculate O(t)
            if B_2[t] - IP_2[t] > 0:
                O_2.append(B_2[t] - IP_2[t])
            else:
                O_2.append(0)





            #### RETAILER ####

            # Receive shipment from supplier, calculate items on hand
            if I_1[t-1]<0:
                OH_1.append(R_1[t])
            else:
                OH_1.append(I_1[t-1]+R_1[t])

            # Receive and dispatch order, update Inventory and Backlog for time t

            if (D[t] +BL_1[t-1]) <= OH_1[t]:              # No Backlog
                I_1.append(OH_1[t] - (D[t] + BL_1[t-1]))
                BL_1.append(0)
                R_0.append(D[t]+BL_1[t-1])
            else:
                I_1.append(OH_1[t] - (D[t] + BL_1[t-1]))  # Backlogged
                BL_1.append(-I_1[t])
                R_0.append(OH_1[t])

            # Update Inventory Position
            IP_1.append(IP_1[t-1] + O_1[t-1] - D[t])

            # Use exponential smoothing to forecast future demand
            future_demand = (1-a)*F_1[t] + a*D[t]
            F_1.append(future_demand)

            # Calculate D_bar(t) and Var(t)
            Db_1.append((1/t)*sum(D[1:t+1]))
            s = 0
            for i in range(1,t+1):
                s+=(D[i]-Db_1[t])**2

            if t==1:                                            # Var(1) = 0
                var_1.append(0)
            else:
                var_1.append((1/(t-1))*s)

            # Simulation to determine B(t)
            S_BC_1 = [10000000000]*10
            Run_1 = [0]*10
            for B in range(10,500):
                S_OH_1 = OH_1[:]
                S_I_1 = I_1[:]
                S_R_1 = R_1[:]
                S_BL_1 = BL_1[:]
                S_IP_1 = IP_1[:]
                S_O_1 = O_1[:]

                # Update O(t)(the period just before the simulation begins)
                # using the B value for the simulation
                if B - S_IP_1[t] > 0:              
                    S_O_1.append(B - S_IP_1[t])
                else:
                    S_O_1.append(0)

                c=0
                for i in range(t+1,t+sim+1):

                    #simulate demand
                    demand = -1
                    while demand <0:
                        demand = random.normalvariate(F_1[t+1],(var_1[t])**(.5))

                    S_R_1.append(S_O_1[i-1])

                    # Receive simulated shipment, calculate simulated items on hand
                    if S_I_1[i-1]<0:
                        S_OH_1.append(S_R_1[i])
                    else:
                        S_OH_1.append(S_I_1[i-1]+S_R_1[i])

                    # Receive and send order, update Inventory and Backlog (simulated)
                    owed = (demand + S_BL_1[i-1])
                    S_I_1.append(S_OH_1[i] - owed)
                    if owed <= S_OH_1[i]:                               # No Backlog
                        S_BL_1.append(0)
                        c += inv_cost*S_I_1[i]
                    else:
                        S_BL_1.append(-S_I_1[i])                        # Backlogged
                        c += bl_cost*S_BL_1[i]

                    # Update Inventory Position
                    S_IP_1.append(S_IP_1[i-1] + S_O_1[i-1] - demand)

                    # Update Order, Upstream member dispatches goods
                    if (B-S_IP_1[i]) > 0:
                        S_O_1.append(B - S_IP_1[i])
                    else:
                        S_O_1.append(0)

                # Log Simulation costs for that B-value
                S_BC_1.append(c)

                # If the simulated costs are increasing, stop
                if B>11:
                    dummy = []
                    for i in range(0,10):
                        dummy.append(S_BC_1[B-i]-S_BC_1[B-i-1])
                    Run_1.append(sum(dummy)/float(len(dummy)))

                    if Run_1[B-3] > 0 and B>20:
                        break
                else:
                    Run_1.append(0)

            # Use minimum as your new B(t)
            var = min((val, idx) for (idx, val) in enumerate(S_BC_1))
            optimal_B = var[1]
            B_1.append(optimal_B)

            # Calculate O(t)
            if B_1[t] - IP_1[t] > 0:
                O_1.append(B_1[t] - IP_1[t])
            else:
                O_1.append(0)


        ### Calculate the Standard Deviation of the last half of time periods ###

        def STD(numbers):
            k = len(numbers)
            mean = sum(numbers) / k
            SD = (sum([dev*dev for dev in [x-mean for x in numbers]])/(k-1))**.5
            return SD

        start = (total//2)+1

        # Only use the last half of the time periods to calculate the standard deviation

        I_STD_1_L.append(STD(I_1[start:]))
        I_STD_2_L.append(STD(I_2[start:]))
        I_STD_3_L.append(STD(I_3[start:]))
        I_STD_4_L.append(STD(I_4[start:]))

        O_STD_0_L.append(STD(D[start:]))
        O_STD_1_L.append(STD(O_1[start:]))
        O_STD_2_L.append(STD(O_2[start:]))
        O_STD_3_L.append(STD(O_3[start:]))
        O_STD_4_L.append(STD(O_4[start:]))

        timeB = time()

        timeleft(a,L,timeB-timeA)

        I_STD_1[L//2] = I_STD_1_L[:]
        I_STD_2[L//2] = I_STD_2_L[:]
        I_STD_3[L//2] = I_STD_3_L[:]
        I_STD_4[L//2] = I_STD_4_L[:]

        O_STD_0[L//2] = O_STD_0_L[:]
        O_STD_1[L//2] = O_STD_1_L[:]
        O_STD_2[L//2] = O_STD_2_L[:]
        O_STD_3[L//2] = O_STD_3_L[:]
        O_STD_4[L//2] = O_STD_4_L[:]

        CSV(a,L,I_STD_1,I_STD_2,I_STD_3,I_STD_4,O_STD_0,
            O_STD_1,O_STD_2,O_STD_3,O_STD_4)


timeE = time()

print("Run Time: ",(timeE-time0)/3600," hours")

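2 answers:

Answer 0 (score: 4)

This is a good time to look at a profiler. You can profile the code to determine where the time is being spent. The problem is most likely in your simulation code, but without being able to see that code, the best help you are likely to get will be vague.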
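
As a minimal sketch of what such a profiling run could look like, using the standard library's cProfile and pstats modules. The run_period function is a hypothetical stand-in for one period of the simulation, not the asker's code, and the helper name profile_period is likewise made up for illustration.

```python
import cProfile
import io
import pstats


def run_period(steps=200):
    """Hypothetical stand-in for one simulation period (not the real model)."""
    history = [0]
    total = 0
    for t in range(1, steps + 1):
        history.append(t)
        snapshot = history[:]      # a copy whose size grows with t
        total += sum(snapshot)
    return total


def profile_period(steps=200):
    """Profile run_period; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = run_period(steps)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show only the ten costliest entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_period()
    print(report)
```

A whole script can also be profiled without modifying it by running python -m cProfile -s cumulative script.py from the command line.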

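Edit based on the added code:

You are doing an awful lot of list copying, which, while not terribly expensive per copy, can consume a lot of time overall.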
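
To get a feel for how the copying adds up, here is a rough back-of-the-envelope count. It assumes that each B iteration copies six history lists of about t + 1 elements (as the S_... = ...[:] lines in the question do) and that a fixed, hypothetical number of B values is tried at every time step. The real number varies because the B loop breaks early, so treat this as an estimate only.

```python
def copied_elements(total_periods, b_iterations, lists_per_copy=6):
    """Estimate list elements copied by one supply-chain tier in one period.

    Assumes each history list holds about t + 1 elements at time step t and
    that `b_iterations` (a hypothetical, fixed count) B values are tried.
    """
    copied = 0
    for t in range(1, total_periods + 1):
        copied += lists_per_copy * b_iterations * (t + 1)
    return copied


if __name__ == "__main__":
    # total = 1000 as in the question; 20 B values per step is an assumed figure.
    print(copied_elements(1000, 20))
```

The count grows quadratically with the number of time steps, which is why cheap individual copies can still dominate the run time.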

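I agree that your code is probably unnecessarily confusing, and I would suggest cleaning it up. Changing the confusing names to meaningful ones may help you find where the problem lies.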

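Finally, it may simply be that your simulation is computationally expensive. You may want to look at SciPy, Pandas, or some other Python math package for better performance, and perhaps for better tools to express the model you are simulating.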
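
As a small illustration of delegating the math to a library routine: the hand-written STD helper in the question computes a sample standard deviation, which statistics.stdev also provides. The standard library is used here instead of SciPy or Pandas only so that the sketch runs without third-party packages; the function name std_by_hand and the sample data are made up.

```python
import statistics


def std_by_hand(numbers):
    """Sample standard deviation, written out as in the question's STD helper."""
    k = len(numbers)
    mean = sum(numbers) / k
    return (sum((x - mean) ** 2 for x in numbers) / (k - 1)) ** 0.5


if __name__ == "__main__":
    demand = [9, 12, 8, 11, 9, 12, 8, 11]   # made-up demand values
    # Both calls compute the same quantity; the library version is one line.
    print(std_by_hand(demand), statistics.stdev(demand))
```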

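Answer 1 (score: 1)

I ran into a similar problem in a Python 3.x script I wrote. The script randomly generates 1,000,000 (one million) JSON objects and writes them to a file.

My problem was that the program got progressively slower as time went on. Here is a timestamp trace for every 10,000 objects: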

So far: Mar23-17:56:46:      0
So far: Mar23-17:56:48:  10000   ( 2 seconds)
So far: Mar23-17:56:50:  20000   ( 2 seconds)
So far: Mar23-17:56:55:  30000   ( 5 seconds)
So far: Mar23-17:57:01:  40000   ( 6 seconds)
So far: Mar23-17:57:09:  50000   ( 8 seconds)
So far: Mar23-17:57:18:  60000   ( 8 seconds)
So far: Mar23-17:57:29:  70000   (11 seconds)
So far: Mar23-17:57:42:  80000   (13 seconds)
So far: Mar23-17:57:56:  90000   (14 seconds)
So far: Mar23-17:58:13: 100000   (17 seconds)
So far: Mar23-17:58:30: 110000   (17 seconds)
So far: Mar23-17:58:51: 120000   (21 seconds)
So far: Mar23-17:59:12: 130000   (21 seconds)
So far: Mar23-17:59:35: 140000   (23 seconds)

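As you can see, the script takes progressively longer to generate each group of 10,000 records.

In my case, the cause turned out to be the way I generated unique ID numbers, each in the range 10250000000000-10350000000000. To avoid generating the same ID twice, I stored each newly generated ID in a list and checked that a new candidate was not already in that list: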

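import random

trekIdList = []

def GenerateRandomTrek():
    global trekIdList

    # Draw random IDs until one turns up that has not been used yet.
    while True:
        r = random.randint(10250000000000, 10350000000000)
        if r not in trekIdList:      # linear scan of the whole list
            trekIdList.append(r)
            return r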

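The problem is that searching an unsorted list takes O(n). As newly generated IDs are appended to the list, the time needed to traverse and search it keeps growing.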
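
The linear cost can be made visible by counting comparisons instead of measuring time. The sketch below is only an illustration: CountedId and comparisons_for_missing_id are hypothetical names, and the wrapper class exists solely to count how often == is evaluated during a list membership test.

```python
class CountedId:
    """Wraps an ID and counts equality comparisons (for illustration only)."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        CountedId.comparisons += 1
        return isinstance(other, CountedId) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


def comparisons_for_missing_id(id_list, probe_value):
    """Return how many == comparisons `probe in id_list` performs."""
    CountedId.comparisons = 0
    _ = CountedId(probe_value) in id_list
    return CountedId.comparisons


if __name__ == "__main__":
    for n in (1_000, 10_000):
        ids = [CountedId(i) for i in range(n)]
        # The probe value n is absent, so every stored ID is compared once.
        print(n, comparisons_for_missing_id(ids, n))
```

For an ID that is not yet in the list, which is the common case with such a wide ID range, the number of comparisons equals the length of the list.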

The solution was to switch to a dictionary (or map):

trekIdList = {}
. . .
def GenerateRandomTrek():
    global trekIdList

    while True:
        r = random.randint(10250000000000, 10350000000000)
        if r not in trekIdList:      # hash lookup instead of a linear scan
            trekIdList[r] = 1
            return r

The improvement was immediate:

So far: Mar23-18:11:30:      0
So far: Mar23-18:11:30:  10000
So far: Mar23-18:11:31:  20000
So far: Mar23-18:11:31:  30000
So far: Mar23-18:11:31:  40000
So far: Mar23-18:11:32:  50000
So far: Mar23-18:11:32:  60000
So far: Mar23-18:11:32:  70000
So far: Mar23-18:11:33:  80000
So far: Mar23-18:11:33:  90000
So far: Mar23-18:11:33: 100000
So far: Mar23-18:11:34: 110000
So far: Mar23-18:11:34: 120000
So far: Mar23-18:11:34: 130000
So far: Mar23-18:11:35: 140000

The reason is that looking up a value in a dictionary/map/hash is O(1) on average.
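
Since only membership matters here, a set would serve the same purpose as a dictionary with dummy values; it relies on the same hashing. The sketch below is a variant under that assumption, not the code from this answer: generate_unique_id, seen_ids, ID_MIN and ID_MAX are hypothetical names, and the random generator is passed in so that runs can be made reproducible.

```python
import random

ID_MIN = 10250000000000
ID_MAX = 10350000000000


def generate_unique_id(seen_ids, rng=random):
    """Return a random ID in [ID_MIN, ID_MAX] not yet in seen_ids, and record it."""
    while True:
        candidate = rng.randint(ID_MIN, ID_MAX)
        if candidate not in seen_ids:      # hash lookup, O(1) on average
            seen_ids.add(candidate)
            return candidate


if __name__ == "__main__":
    seen = set()
    rng = random.Random(42)                # seeded only for reproducibility
    ids = [generate_unique_id(seen, rng) for _ in range(5)]
    print(ids)
```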

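Moral: when dealing with a large number of items, use a dictionary/map, or binary search on a sorted list, rather than an unordered list.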
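
For the sorted-list alternative, the standard library's bisect module provides the binary search. A minimal sketch with hypothetical helper names (contains_sorted, add_if_new): the lookup is O(log n), but inserting into a Python list still shifts elements, so an insertion can cost O(n) in the worst case.

```python
import bisect


def contains_sorted(sorted_ids, value):
    """Binary-search membership test on an ascending list."""
    index = bisect.bisect_left(sorted_ids, value)
    return index < len(sorted_ids) and sorted_ids[index] == value


def add_if_new(sorted_ids, value):
    """Insert value keeping the list sorted; return False if already present."""
    index = bisect.bisect_left(sorted_ids, value)
    if index < len(sorted_ids) and sorted_ids[index] == value:
        return False
    sorted_ids.insert(index, value)
    return True


if __name__ == "__main__":
    ids = []
    for value in (30, 10, 20, 10):
        print(value, add_if_new(ids, value))
    print(ids)
```

For a pure "have I seen this ID before" check, a dictionary or set is usually the simpler choice; the sorted list is mainly useful when the items are also needed in order.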