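I have been searching for a while now, and I can't figure out how to do this other than by hard-coding it. From a CSV file, I have to get the maximum of the means of three series and then return the NAME of that series, which is the part that is giving me trouble.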
comp = max(DataTD['Cycle (seconds)'].mean(), DataTD['Run (seconds)'].mean(), DataTD['Swim (seconds)'].mean())
if comp == DataTD['Cycle (seconds)'].mean():
    print(DataTD['Cycle (seconds)'].name)
elif comp == DataTD['Run (seconds)'].mean():
    print(DataTD['Run (seconds)'].name)
elif comp == DataTD['Swim (seconds)'].mean():
    print(DataTD['Swim (seconds)'].name)
Answer 0: (score: 1)
Something like this should work (untested):
import numpy as np

datas = [DataTD['Cycle (seconds)'],
         DataTD['Run (seconds)'],
         DataTD['Swim (seconds)']]
means = [data.mean() for data in datas]
max_mean_idx = np.argmax(means)  # position of the largest mean
print(datas[max_mean_idx].name)
Answer 1: (score: 1)
You can get the name of the column with the largest mean, then look up the mean by that name.
cols = ['Cycle (seconds)', 'Run (seconds)', 'Swim (seconds)']
max_col = max(cols, key=lambda col: DataTD[col].mean())
print('Column name: ' + max_col)
print('Mean: ' + str(DataTD[max_col].mean()))
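To try this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame values are copied from the sample data printed in Answer 2 (they are not the asker's CSV), and the helper name column_with_max_mean is made up for illustration. It computes each mean once and then picks the largest.

```python
import pandas as pd

# Illustrative values taken from the sample table in Answer 2,
# not the asker's real CSV.
DataTD = pd.DataFrame({
    'Cycle (seconds)': [0, 3, 0, 6, 0, 7, 4, 6, 4, 7],
    'Run (seconds)':   [2, 8, 6, 0, 4, 3, 3, 7, 5, 5],
    'Swim (seconds)':  [7, 7, 8, 2, 9, 2, 3, 7, 3, 9],
}, index=list('ABCDEFGHIJ'))


def column_with_max_mean(df, cols):
    """Hypothetical helper: return (name, mean) of the column with the largest mean."""
    means = {col: df[col].mean() for col in cols}  # each mean is computed once
    best = max(means, key=means.get)               # key with the largest value
    return best, means[best]


cols = ['Cycle (seconds)', 'Run (seconds)', 'Swim (seconds)']
name, mean = column_with_max_mean(DataTD, cols)
print('Column name: ' + name)
print('Mean: ' + str(mean))
```

With this data the winning column is 'Swim (seconds)'.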
Answer 2: (score: 1)
Consider the sample data
import numpy as np
import pandas as pd

cols = ['Cycle (seconds)', 'Run (seconds)', 'Swim (seconds)']
np.random.seed([3, 1415])
DataTD = pd.DataFrame(
    np.random.randint(10, size=(10, 3)),
    list('ABCDEFGHIJ'), cols
)
   Cycle (seconds)  Run (seconds)  Swim (seconds)
A                0              2               7
B                3              8               7
C                0              6               8
D                6              0               2
E                0              4               9
F                7              3               2
G                4              3               3
H                6              7               7
I                4              5               3
J                7              5               9
IIUC:
Use mean with the parameter axis=1, followed by idxmax, to find the index of the maximum value.
Otherwise, try
DataTD.loc[[DataTD[cols].mean(1).idxmax()]]
   Cycle (seconds)  Run (seconds)  Swim (seconds)
J                7              5               9
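Note that mean with axis=1 averages across each row, so the result above is the row label with the largest mean, not a column name. If the goal is the column name, as the question asks, the same mean-then-idxmax chain along the default axis may be what is wanted. A minimal sketch, with the DataFrame rebuilt by hand from the table printed above rather than from the random seed:

```python
import pandas as pd

cols = ['Cycle (seconds)', 'Run (seconds)', 'Swim (seconds)']

# Values copied from the sample table above.
DataTD = pd.DataFrame(
    [[0, 2, 7], [3, 8, 7], [0, 6, 8], [6, 0, 2], [0, 4, 9],
     [7, 3, 2], [4, 3, 3], [6, 7, 7], [4, 5, 3], [7, 5, 9]],
    index=list('ABCDEFGHIJ'), columns=cols,
)

# Default axis=0: one mean per column, indexed by column name.
col_means = DataTD[cols].mean()
best_col = col_means.idxmax()   # label of the column with the largest mean

# axis=1: one mean per row, indexed by row label.
row_means = DataTD[cols].mean(axis=1)
best_row = row_means.idxmax()   # label of the row with the largest mean

print(best_col)
print(best_row)
```

For this data, best_col is 'Swim (seconds)' and best_row is 'J'.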
Answer 3: (score: 1)
Using Pir's data:
DataTD.loc[DataTD[cols].mean(1).sort_values().iloc[[-1]].index]
Out[625]:
   Cycle (seconds)  Run (seconds)  Swim (seconds)
J                7              5               9
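The same sort_values idea can also be applied to the per-column means, which gives a full ranking of the series instead of only the top one. This is a sketch under the assumption that the column name is what is needed; the data is again typed in from Pir's printed table, and the variable names ranked and top_name are illustrative.

```python
import pandas as pd

cols = ['Cycle (seconds)', 'Run (seconds)', 'Swim (seconds)']

# Values copied from the sample table in Answer 2.
DataTD = pd.DataFrame(
    [[0, 2, 7], [3, 8, 7], [0, 6, 8], [6, 0, 2], [0, 4, 9],
     [7, 3, 2], [4, 3, 3], [6, 7, 7], [4, 5, 3], [7, 5, 9]],
    index=list('ABCDEFGHIJ'), columns=cols,
)

# Mean of each column, sorted from largest to smallest.
ranked = DataTD[cols].mean().sort_values(ascending=False)

top_name = ranked.index[0]  # name of the series with the largest mean
print(ranked)
print(top_name)
```

Sorting costs more than a single idxmax, so it mainly pays off when the order of all the series matters.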