Python Pandas: selecting rows based on membership in another collection (set)

Asked: 2016-09-21 14:45:03

Tags: python pandas indexing set conditional-statements

Suppose my DataFrame is constructed as follows:

import pandas
import numpy

column_names = ["name", "age", "score"]
names = numpy.random.choice(["Jorge", "Xavier", "Joaquin", "Juan", "Jose"], 50)
ages = numpy.random.randint(0, 100, 50)
scores = numpy.random.rand(50)
df = pandas.DataFrame.from_dict(dict(zip(column_names, [names, ages, scores])))

The first 10 rows of the above DataFrame look like this:

   age     name     score
0   15    Jorge  0.031380
1   44     Juan  0.373199
2   84   Xavier  0.999065
3   55     Juan  0.159873
4   55  Joaquin  0.211931
5   33     Juan  0.484350
6   22   Xavier  0.510276
7   86  Joaquin  0.490013
8    2     Jose  0.185086
9   51     Juan  0.979015

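I would like to select the rows where the element of the name column is a member of {"Xavier", "Joaquin"}. My instinct was something like df.iloc[df["name"] in {"Xavier", "Joaquin"}, :], but that does not work. How can I accomplish this?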

Note

I know that I can handle this particular example with

df.loc[numpy.logical_or(df["name"] == "Xavier", df["name"] == "Joaquin"), :]

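but that is not the point. This is just a simplified example of my real problem. My DataFrame has 2,340,923 rows, and my set of names, called names, has 3,624 elements; I want to select the rows whose name is a member of names.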

1 Answer:

Answer 0 (score: 5)

I think you need isin:

print(df.loc[df["name"].isin(["Xavier", "Joaquin"]), :])
    age     name     score
1    66  Joaquin  0.767056
2    17  Joaquin  0.721369
7    53  Joaquin  0.209415
10    9   Xavier  0.394815
13   20  Joaquin  0.276596
14   17   Xavier  0.810725
15   76   Xavier  0.918273
17   91  Joaquin  0.974723
18   39   Xavier  0.869607
21    3   Xavier  0.200578
22   34  Joaquin  0.938018
23   90   Xavier  0.664387
26   51   Xavier  0.946753
28   49   Xavier  0.859911
30   22  Joaquin  0.602381
34    7   Xavier  0.759837
35   96  Joaquin  0.790691
39   13  Joaquin  0.599557
40   10   Xavier  0.563933
41   69   Xavier  0.983787
43   58   Xavier  0.542903
44    8  Joaquin  0.307106
45   77  Joaquin  0.330278
46   55  Joaquin  0.980077
47   12   Xavier  0.177509
49   15  Joaquin  0.590958
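
As a quick sanity check, the isin mask can be compared with the logical_or approach from the question. The following is only a minimal sketch: the fixed seed and the variable names (rng, wanted, mask_isin, mask_or) are assumptions added for reproducibility and are not part of the original post.

```python
import numpy as np
import pandas as pd

# Fixed seed: an assumption added here so the example is reproducible.
rng = np.random.RandomState(0)
df = pd.DataFrame({
    "name": rng.choice(["Jorge", "Xavier", "Joaquin", "Juan", "Jose"], 50),
    "age": rng.randint(0, 100, 50),
    "score": rng.rand(50),
})

wanted = ["Xavier", "Joaquin"]

# Element-wise membership test: one boolean per row.
mask_isin = df["name"].isin(wanted)

# The explicit two-way comparison from the question, for reference.
mask_or = (df["name"] == "Xavier") | (df["name"] == "Joaquin")

selected = df.loc[mask_isin, :]
same = bool((mask_isin == mask_or).all())
print(same)
```

Both masks are boolean Series aligned with df, so either one can be passed to df.loc; isin simply avoids writing one comparison per name.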

It also works with a set:

names = {"Xavier", "Joaquin"}
print(df.loc[df["name"].isin(names), :])

    age     name     score
1    66  Joaquin  0.767056
2    17  Joaquin  0.721369
7    53  Joaquin  0.209415
10    9   Xavier  0.394815
13   20  Joaquin  0.276596
14   17   Xavier  0.810725
15   76   Xavier  0.918273
17   91  Joaquin  0.974723
18   39   Xavier  0.869607
21    3   Xavier  0.200578
22   34  Joaquin  0.938018
23   90   Xavier  0.664387
26   51   Xavier  0.946753
28   49   Xavier  0.859911
30   22  Joaquin  0.602381
34    7   Xavier  0.759837
35   96  Joaquin  0.790691
39   13  Joaquin  0.599557
40   10   Xavier  0.563933
41   69   Xavier  0.983787
43   58   Xavier  0.542903
44    8  Joaquin  0.307106
45   77  Joaquin  0.330278
46   55  Joaquin  0.980077
47   12   Xavier  0.177509
49   15  Joaquin  0.590958
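
For the larger case described in the question, the same call should apply unchanged. The sketch below uses made-up stand-in data: the name pool, the row count of 100,000, and the variable names big, kept, and dropped are all hypothetical, and only the set size of 3,624 is taken from the question. It also shows the inverted mask, in case the complementary rows are needed.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)

# Hypothetical stand-in data: 100,000 rows drawn from 10,000 distinct names.
pool = ["name_%d" % i for i in range(10000)]
big = pd.DataFrame({"name": rng.choice(pool, 100000)})

# Hypothetical set of names to keep; 3,624 is the size given in the question.
names = set(pool[:3624])

# One vectorized membership test over the whole column.
mask = big["name"].isin(names)

kept = big.loc[mask, :]       # rows whose name is in the set
dropped = big.loc[~mask, :]   # rows whose name is not in the set

print(len(kept), len(dropped))
```

The mask is computed once and reused, so selecting the matching rows and the non-matching rows does not repeat the membership test.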