Question

I have two dataframes (df_1, df_2), some variables (A, B, C), a function (fun), and a global genetic optimizer that finds the maximum of fun over given ranges of A, B, C.

import pandas as pd
from scipy.optimize import differential_evolution

df_1 = pd.DataFrame({'O' : [1,2,3], 'M' : [2,8,3]})

df_2 = pd.DataFrame({'O' : [1,1,1, 2,2,2, 3,3,3],
                     'M' : [9,2,4, 6,7,8, 5,3,4],
                     'X' : [2,4,6, 4,8,7, 3,1,9],
                     'Y' : [3,6,1, 4,6,5, 1,0,7],
                     'Z' : [2,4,8, 3,5,4, 7,5,1]})

# Index
df_1 = df_1.set_index('O')
df_1_M = df_1.M
df_1_M = df_1_M.sort_index()

# Fun
def fun(z, *params):
    A,B,C = z
        
    # Score
    df_2['S'] = df_2['X']*A + df_2['Y']*B + df_2['Z']*C
    
    # Top score
    df_Sort = df_2.sort_values(['S', 'X', 'M'], ascending=[False, True, True])
    df_O    = df_Sort.set_index('O')
    M_Top   = df_O[~df_O.index.duplicated(keep='first')].M
    M_Top   = M_Top.sort_index()
        
    # Compare the top scoring row for each O to df_1
    df_1_R = df_1_M.reindex(M_Top.index) # Nan
    T_N_T  = M_Top == df_1_R

    # Record the results for the given values of A,B,C
    df_Res = pd.DataFrame({'it_is':T_N_T}) # is this row of df_1 the same as this row of M_Top?
        
    # p_hat =         TP / (TP + FP)
    p_hat = df_Res.sum() / len(df_Res.index)
    
    print(z)
        
    return -p_hat.iloc[0]  # positional access; p_hat is indexed by 'it_is'

# Bounds
min_ = 0
max_ = 1
ran_ge = (min_, max_)
bounds = [ran_ge,ran_ge,ran_ge]

# Params
params = (df_1, df_2)

# DE
DE = differential_evolution(fun, bounds, args=params)

It prints [A B C] on each iteration; for example, the last three lines are:

[0.04003901 0.50504249 0.56332845]
[0.040039   0.5050425  0.56332845]
[0.040039   0.50504249 0.56332846]

To see how it converges, how can I plot A, B, C against iteration?

I tried storing A, B, C in:

df_P = pd.DataFrame({0})

and adding this inside fun:

df_P.append(z)

But I got:

RuntimeError: The map-like callable must be of the form f(func, iterable), returning a sequence of numbers the same length as 'iterable'

Answer 1

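I'm not sure I've found the best way, but I found one. It relies on the fact that lists are passed by reference. This means that if you pass a list to a function and modify it there, it is also modified in the rest of the program, even if the function does not return it.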
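A minimal illustration of that behaviour, independent of the optimizer (the names `record` and `log` are made up for this sketch). Strictly speaking, Python hands the function a reference to the same list object, so in-place changes such as `append` are visible to the caller:

```python
def record(value, log):
    """Append value to log in place; nothing is returned."""
    log.append(value)


log = []
for v in (0.1, 0.2, 0.3):
    record(v, log)

print(log)  # [0.1, 0.2, 0.3]
```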

# Params
results = []  # this list will hold our results
params = (df_1, df_2, results)  # add it to the params passed to the function

# Now, inside the function, append the output to the list. Instead of the mean,
# I used the distance to the origin here (as if the 3 values were a 3D vector).

p_hat = df_Res.sum() / len(df_Res.index)

distance_to_zeros = sum(e**2 for e in z) ** 0.5  # ** 1/2 would parse as (x ** 1) / 2
results.append(distance_to_zeros)
# You can also append z directly.

# Then, after the DE call, plot the recorded values
import matplotlib.pyplot as plt

DE = differential_evolution(fun, bounds, args=params)

x = range(len(results))

plt.scatter(x, results, alpha=0.5)
plt.show()
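If only the best vector of each generation is of interest, rather than every trial evaluation, `differential_evolution` also accepts a `callback` argument that is called once per generation with the current best solution. The following is a sketch under assumptions, not part of the answer above: `sphere` is a stand-in objective used so the snippet runs on its own, and `history` and `record_best` are hypothetical names.

```python
import numpy as np
from scipy.optimize import differential_evolution


def sphere(z):
    """Stand-in objective (an assumption for this sketch): sum of squares."""
    return float(np.sum(np.asarray(z) ** 2))


history = []  # best [A, B, C] found so far, one entry per generation


def record_best(xk, convergence):
    """Called by the solver after each generation with the current best vector."""
    history.append(np.copy(xk))


bounds = [(0, 1)] * 3
result = differential_evolution(sphere, bounds, callback=record_best, maxiter=50)

best_per_generation = np.vstack(history)  # shape: (generations, 3)
print(best_per_generation.shape)
```

Each column of `best_per_generation` can then be plotted against its row index, for example with `plt.plot(best_per_generation)`.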

Answer 2

Using RomainL's code to plot A, B, C:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import differential_evolution

df_1 = pd.DataFrame({'O' : [1,2,3], 'M' : [2,8,3]})

df_2 = pd.DataFrame({'O' : [1,1,1, 2,2,2, 3,3,3],
                     'M' : [9,2,4, 6,7,8, 5,3,4],
                     'X' : [2,4,6, 4,8,7, 3,1,9],
                     'Y' : [3,6,1, 4,6,5, 1,0,7],
                     'Z' : [2,4,8, 3,5,4, 7,5,1]})

# Index
df_1 = df_1.set_index('O')
df_1_M = df_1.M
df_1_M = df_1_M.sort_index()

# Fun
def fun(z, *params):
    A,B,C = z
        
    # Score
    df_2['S'] = df_2['X']*A + df_2['Y']*B + df_2['Z']*C
    
    # Top score
    df_Sort = df_2.sort_values(['S', 'X', 'M'], ascending=[False, True, True])
    df_O    = df_Sort.set_index('O')
    M_Top   = df_O[~df_O.index.duplicated(keep='first')].M
    M_Top   = M_Top.sort_index()
        
    # Compare the top scoring row for each O to df_1
    df_1_R = df_1_M.reindex(M_Top.index) # Nan
    T_N_T  = M_Top == df_1_R

    # Record the results for the given values of A,B,C
    df_Res = pd.DataFrame({'it_is':T_N_T}) # is this row of df_1 the same as this row of M_Top?
        
    # p_hat =         TP / (TP + FP)
    p_hat = df_Res.sum() / len(df_Res.index)
    
    results.append(z)
        
    return -p_hat.iloc[0]  # positional access; p_hat is indexed by 'it_is'

# Bounds
min_ = 0
max_ = 1
ran_ge = (min_, max_)
bounds = [ran_ge,ran_ge,ran_ge]

# Params
results = []
params = (df_1, df_2, results)

# DE
DE = differential_evolution(fun, bounds, args=params)

# Plot
df_R = pd.DataFrame(list(map(np.ravel, results)), columns=['A', 'B', 'C'])
ax = df_R.plot()  # creates its own figure; the x-axis is the evaluation number
ax.set_ylim(min_, max_)
plt.show()
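Because `fun` is called for every trial vector, the plot above shows all candidates the solver tried, not only the improving ones. One way to get a smoother convergence curve is to record the score alongside each vector and keep a running best. The sketch below assumes a stand-in objective (`sphere`) and a hypothetical wrapper class `Recorder`; it also avoids relying on a global list.

```python
import numpy as np
from scipy.optimize import differential_evolution


class Recorder:
    """Wrap an objective and remember every point it was evaluated at."""

    def __init__(self, func):
        self.func = func
        self.points = []
        self.scores = []

    def __call__(self, z, *args):
        score = self.func(z, *args)
        self.points.append(np.copy(z))  # copy in case the caller reuses the array
        self.scores.append(score)
        return score


def sphere(z):
    """Stand-in objective (an assumption for this sketch): sum of squares."""
    return float(np.sum(np.asarray(z) ** 2))


recorded = Recorder(sphere)
result = differential_evolution(recorded, [(0, 1)] * 3, maxiter=50)

points = np.vstack(recorded.points)           # shape: (evaluations, 3)
scores = np.asarray(recorded.scores)
running_best = np.minimum.accumulate(scores)  # best (lowest) score seen so far
```

Plotting `running_best` against the evaluation index shows when the score last improved, and `points[np.argmin(scores)]` gives the best recorded vector.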

Plotting the convergence results of scipy.optimize.differential_evolution
