Question

I have this pandas DataFrame, which is actually an Excel spreadsheet:

    Unnamed: 0  Date    Num     Company     Link    ID
0   NaN     1990-11-15  131231  apple...    http://www.example.com/201611141492/xellia...   290834
1   NaN     1990-10-22  1231    microsoft http://www.example.com/news/arnsno...     NaN
2   NaN     2011-10-20  123     apple   http://www.example.com/ator...  209384
3   NaN     2013-10-27  123     apple...    http://example.com/sections/th-shots/2016/...   098
4   NaN     1990-10-26  123     google  http://www.example.net/business/Drugmak...  098098
5   NaN     1990-10-18  1231    google...   http://example.com/news/va-rece...  NaN
6   NaN     2011-04-26  546     amazon...   http://www.example.com/news/home/20160425...    9809

I want to remove all rows where the ID column is NaN and reset the "index dummy column":

    Unnamed: 0  Date    Num     Company     Link    ID
0   NaN     1990-11-15  131231  apple...    http://www.example.com/201611141492/xellia...   290834
1   NaN     2011-10-20  123     apple   http://www.example.com/ator...  209384
2   NaN     2013-10-27  123     apple...    http://example.com/sections/th-shots/2016/...   098
3   NaN     1990-10-26  123     google  http://www.example.net/business/Drugmak...  098098
4   NaN     2011-04-26  546     amazon...   http://www.example.com/news/home/20160425...    9809

I know this can be done as follows:

df = df['ID'].dropna()

or

df[df.ID != np.nan]

or

df = df[np.isfinite(df['ID'])]

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

or

df[df.ID()]

or:

df[df.ID != '']

and then:

df.reset_index(drop=True, inplace=True)

However, none of these removes the NaN values in ID. I still get the previous DataFrame.

Update

In:

df['ID'].values

Out:

array([ '....A lot of text....',
       nan,
       "A lot of text...",
       "More text",
       'text from the site',
       nan,
       "text from the site"], dtype=object)

Answer 1

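Try df.dropna(axis = 1). Note that this drops the columns that contain NaN, not the rows.

Alternatively, try df.dropna(axis = 0, subset = ["ID"]), which drops only the rows where ID is NaN, and see if it helps. dropna returns a new DataFrame, so assign the result back to df.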
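A minimal sketch of the row-wise variant, assuming a small made-up frame (the values below are illustrative, not the asker's data) whose ID column has object dtype and holds real NaN entries:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet; only the columns needed here.
df = pd.DataFrame({
    "Company": ["apple", "microsoft", "apple", "google", "google"],
    "ID": ["290834", np.nan, "209384", "098098", np.nan],
})

# Drop the rows whose ID is missing, then renumber the index from 0.
# dropna returns a new frame, so the result has to be assigned.
cleaned = df.dropna(subset=["ID"]).reset_index(drop=True)

print(cleaned)
```

Chaining reset_index(drop=True) gives the consecutive 0..n-1 index the question asks for without keeping the old index as a column.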

Answer 2

Try this:

df = df[df.ID != 'nan']
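One caveat: the update in the question shows ID as an object array holding real nan values rather than the string 'nan', in which case this comparison would keep every row. Comparing against np.nan has the same problem, because NaN is never equal to anything, including itself. A hedged sketch with made-up data, using a notna() mask instead:

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the object-dtype ID column from the update.
df = pd.DataFrame({"ID": ["some text", np.nan, "more text", np.nan]})

# Both comparisons are True for every row, so neither filters anything.
kept_by_string_compare = df[df.ID != "nan"]
kept_by_nan_compare = df[df.ID != np.nan]

# notna() flags real missing values, including NaN inside object columns.
cleaned = df[df["ID"].notna()].reset_index(drop=True)

print(len(kept_by_string_compare), len(kept_by_nan_compare), len(cleaned))
```

If the missing values really were stored as the text 'nan', the string comparison above would be the right tool; checking df['ID'].values, as in the update, tells the two cases apart.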

How do I delete rows in a pandas DataFrame?