Find the second and third highest values in Python

Asked: 2018-03-07 13:46:39

Tags: python python-3.x pandas

My data frame of food items and their nutrients looks like this:

import pandas as pd
import os
os.chdir('D:\\userdata\\adbharga\\Desktop\\AVA\\RTestCode\\Python')
data=pd.read_csv("nutrient.csv")
data.head()

Out[30]:

               Name  Calories  Fat  Carb  Fiber  Protein
0      Chonga Bagel       300    5    50      3       12
1      8-Grain Roll       380    6    70      7       10
2  Almond Croissant       410   22    45      3       10
3     Apple Fritter       460   23    56      2        7
4  Banana Nut Bread       420   22    52      2        6

I need to extract the top nutrient and its amount for each item. For that I used the code below.

data['Top Nutrient'] = data[['Calories','Fat','Carb','Fiber','Protein']].idxmax(axis=1)
data['Amount']= data[['Calories','Fat','Carb','Fiber','Protein']].max(axis=1)
data.head()

Out[33]:

               Name  Calories  Fat  Carb  Fiber  Protein Top Nutrient  Amount
0      Chonga Bagel       300    5    50      3       12     Calories     300
1      8-Grain Roll       380    6    70      7       10     Calories     380
2  Almond Croissant       410   22    45      3       10     Calories     410
3     Apple Fritter       460   23    56      2        7     Calories     460
4  Banana Nut Bread       420   22    52      2        6     Calories     420

Is there a way to show the next 2 top nutrients and their values? The expected output would look like this:

Name                NextTop2        NextTop2Amount
Chonga Bagel        Carb|Protein    50|12
8-Grain Roll        Carb|Protein    70|10
Almond Croissant    Carb|Fat        45|22
Apple Fritter       Carb|Fat        56|23
Banana Nut Bread    Carb|Fat        52|22

Thanks

1 Answer:

Answer 0: (score: 1)

It is best to use numpy.argsort here, because it is very fast.
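As a quick illustration of why argsort helps, the sketch below applies it to a single row (the Chonga Bagel values from the question). It sorts ascending, so the second and third highest values sit at positions -2 and -3 of the returned index array. The variable names are illustrative only.

```python
import numpy as np

# Calories, Fat, Carb, Fiber, Protein for "Chonga Bagel" (taken from the question)
row = np.array([300, 5, 50, 3, 12])

# argsort returns the indices that would sort the row in ascending order
order = row.argsort()

# Counting from the end: -1 is the largest, -2 the second, -3 the third
second, third = order[-2], order[-3]
print(second, third)        # column positions of Carb and Protein
print(row[[second, third]])
```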

First filter the columns and use argsort to get the indices of the 2nd and 3rd highest values:

import numpy as np

cols = ['Calories','Fat','Carb','Fiber','Protein']
# argsort sorts ascending, so -2 and -3 are the 2nd and 3rd largest per row
arr = data[cols].values.argsort(axis=1)[:, [-2, -3]]
# map the column positions back to column names
a = np.array(cols)[arr]
print (a)
[['Carb' 'Protein']
 ['Carb' 'Protein']
 ['Carb' 'Fat']
 ['Carb' 'Fat']
 ['Carb' 'Fat']]

The values can also be selected by indexing:

# pair each row number with its selected column positions
b = data[cols].values[np.arange(len(arr))[:,None], arr]
print (b)
[[50 12]
 [70 10]
 [45 22]
 [56 23]
 [52 22]]

Finally, create a DataFrame and join with | into one column:

data['Top Nutrient'] = data[cols].idxmax(axis=1)
data['Amount']= data[cols].max(axis=1)
data['NextTop2'] = pd.DataFrame(a).apply('|'.join, 1)
data['NextTop2Amount'] = pd.DataFrame(b).astype(str).apply('|'.join, 1)
print (data)
               Name  Calories  Fat  Carb  Fiber  Protein Top Nutrient  Amount  \
0      Chonga Bagel       300    5    50      3       12     Calories     300
1      8-Grain Roll       380    6    70      7       10     Calories     380
2  Almond Croissant       410   22    45      3       10     Calories     410
3     Apple Fritter       460   23    56      2        7     Calories     460
4  Banana Nut Bread       420   22    52      2        6     Calories     420

       NextTop2 NextTop2Amount
0  Carb|Protein          50|12
1  Carb|Protein          70|10
2      Carb|Fat          45|22
3      Carb|Fat          56|23
4      Carb|Fat          52|22
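For anyone who wants to run this without nutrient.csv, here is a minimal self-contained sketch of the same idea. The helper name next_top and its ranks and sep parameters are hypothetical, not part of pandas or of the answer above, and np.take_along_axis is swapped in for the fancy-indexing step. The sample rows are copied from the question.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the real data would come from nutrient.csv
data = pd.DataFrame({
    "Name": ["Chonga Bagel", "8-Grain Roll", "Almond Croissant",
             "Apple Fritter", "Banana Nut Bread"],
    "Calories": [300, 380, 410, 460, 420],
    "Fat": [5, 6, 22, 23, 22],
    "Carb": [50, 70, 45, 56, 52],
    "Fiber": [3, 7, 3, 2, 2],
    "Protein": [12, 10, 10, 7, 6],
})
cols = ["Calories", "Fat", "Carb", "Fiber", "Protein"]


def next_top(df, cols, ranks=(2, 3), sep="|"):
    """Hypothetical helper: names and values at the given ranks (1 = largest) per row."""
    values = df[cols].to_numpy()
    # argsort is ascending, so rank r lives at position -r of each sorted row
    order = values.argsort(axis=1)[:, [-r for r in ranks]]
    names = np.array(cols)[order]
    # pick, for every row, the values at the selected column positions
    amounts = np.take_along_axis(values, order, axis=1)
    return pd.DataFrame({
        "NextTop2": [sep.join(r) for r in names],
        "NextTop2Amount": [sep.join(map(str, r)) for r in amounts],
    }, index=df.index)


result = data[["Name"]].join(next_top(data, cols))
print(result)
```

One caveat: when a row contains equal values, the default argsort does not guarantee which of the tied columns comes first; passing kind="stable" to argsort makes the order deterministic.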
