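I'm using pandas 0.13.1 with Python 2.7.
My risk column contains some values that are none of Small, Medium or High. I want to drop every row whose risk value is not Small, Medium or High. I tried the following: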
df = df[(df.risk == "Small") | (df.risk == "Medium") | (df.risk == "High")]
But this returns an empty DataFrame. How do I filter the rows correctly?
Answer 0: (score: 5)
I think you want isin:
df = df[(df.risk.isin(["Small","Medium","High"]))]
Example:
In [5]:
import pandas as pd
df = pd.DataFrame({'risk':['Small','High','Medium','Negligible', 'Very High']})
df
Out[5]:
         risk
0       Small
1        High
2      Medium
3  Negligible
4   Very High
[5 rows x 1 columns]
In [6]:
df[df.risk.isin(['Small','Medium','High'])]
Out[6]:
     risk
0   Small
1    High
2  Medium
[3 rows x 1 columns]
Answer 1: (score: 0)
Another nice, readable approach is to name each boolean mask:
small_risk = df["risk"] == "Small"
medium_risk = df["risk"] == "Medium"
high_risk = df["risk"] == "High"
You can then combine the masks with | to keep rows that match any of them:
df[small_risk | medium_risk | high_risk]
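or with & to keep only rows that match all of them (note that this particular combination is always empty, since a single value cannot be both Small and Medium):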
df[small_risk & medium_risk]
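As a rough check, the named masks can be compared against the isin approach on the sample frame from the first answer. This is only a sketch; the variable names either, via_isin and rest are introduced here for illustration:

```python
import pandas as pd

# Sample frame reused from the first answer
df = pd.DataFrame({'risk': ['Small', 'High', 'Medium', 'Negligible', 'Very High']})

small_risk = df['risk'] == 'Small'
medium_risk = df['risk'] == 'Medium'
high_risk = df['risk'] == 'High'

# OR-ing the masks keeps rows matching any of the three labels
either = df[small_risk | medium_risk | high_risk]

# isin expresses the same filter in a single call
via_isin = df[df['risk'].isin(['Small', 'Medium', 'High'])]

# ~ negates a mask, selecting the rows that would be dropped
rest = df[~(small_risk | medium_risk | high_risk)]

print(either.equals(via_isin))
print(rest)
```

Named masks tend to pay off when the conditions span several columns; for a membership test on a single column, isin is usually shorter.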