Replace null datetime values with other datetime values from the same dataframe based on a condition

Time: 2021-06-28 09:27:11

Tags: python pandas dataframe datetime

I have a dataframe:

                  login   Status               start
0   2021-05-28 09:29:35 Resolved                 NaT
1   2021-05-28 11:46:11   Closed                 NaT
2   2021-05-29 15:59:16      WIP                 NaT
3   2021-05-30 10:43:57   Closed 2021-05-31 12:53:57
4   2021-06-27 17:53:29 Resolved                 NaT

If start is null and Status is Resolved or Closed, I want to fill the null start value with the login value.
Expected dataframe:

                  login   Status               start
0   2021-05-28 09:29:35 Resolved 2021-05-28 09:29:35
1   2021-05-28 11:46:11   Closed 2021-05-28 11:46:11
2   2021-05-29 15:59:16      WIP                 NaT
3   2021-05-30 10:43:57   Closed 2021-05-31 12:53:57
4   2021-06-27 17:53:29 Resolved 2021-06-27 17:53:29

I can't work out how to write the condition for the null start values.

I tried writing a function:

def fun(row):       
    if row.start.isna() and (row['Status'] == 'Resolved') or (row['Status' == 'Closed']):
        return row['start']
    else:
        return row['login']

Then I ran the function with apply:

df['start'] = df.apply(fun, axis=1)

But I get this error:

AttributeError: 'Timestamp' object has no attribute 'isna'

How can I get the result shown above?

TIA

2 answers:

Answer 0: (score: 1)

We can fill the NaT values in start with the values from login, after masking the login values where the corresponding Status is not one of Closed or Resolved.

m = df['Status'].isin(['Resolved', 'Closed'])
df['start'] = df['start'].fillna(df['login'].mask(~m))

                login    Status               start
0 2021-05-28 09:29:35  Resolved 2021-05-28 09:29:35
1 2021-05-28 11:46:11    Closed 2021-05-28 11:46:11
2 2021-05-29 15:59:16       WIP                 NaT
3 2021-05-30 10:43:57    Closed 2021-05-31 12:53:57
4 2021-06-27 17:53:29  Resolved 2021-06-27 17:53:29
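
For reference, here is a self-contained sketch of the approach above. The way the sample frame is built is an assumption, since the question does not show how df was created. It also includes a corrected row-wise version of the asker's function (the name fill_start is hypothetical): the original fails because a scalar Timestamp has no .isna() method, so the sketch uses pd.isna, which accepts scalars.

```python
import pandas as pd

# Sample data rebuilt from the question (this construction is an assumption).
df = pd.DataFrame({
    "login": pd.to_datetime([
        "2021-05-28 09:29:35", "2021-05-28 11:46:11", "2021-05-29 15:59:16",
        "2021-05-30 10:43:57", "2021-06-27 17:53:29",
    ]),
    "Status": ["Resolved", "Closed", "WIP", "Closed", "Resolved"],
    "start": pd.to_datetime([None, None, None, "2021-05-31 12:53:57", None]),
})

# True where Status is one of the statuses that should be filled.
m = df["Status"].isin(["Resolved", "Closed"])

# Blank out login where the status does not qualify, then use what is left
# to fill the missing start values (fillna aligns on the index).
vectorized = df["start"].fillna(df["login"].mask(~m))

def fill_start(row):
    # pd.isna works on scalars such as Timestamp and NaT.
    if pd.isna(row["start"]) and row["Status"] in ("Resolved", "Closed"):
        return row["login"]
    return row["start"]

# Row-wise equivalent; usually slower than the vectorized form on large frames.
row_wise = df.apply(fill_start, axis=1)
```

Both vectorized and row_wise leave row 2 (WIP) as NaT and keep the existing start value in row 3.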

Answer 1: (score: 0)

A simple loc[] combined with fillna()

import io
import pandas as pd

# Rebuild the sample frame; columns are separated by two or more spaces.
df = pd.read_csv(io.StringIO("""                  login   Status               start
0   2021-05-28 09:29:35  Resolved                 NaT
1   2021-05-28 11:46:11   Closed                 NaT
2   2021-05-29 15:59:16      WIP                 NaT
3   2021-05-30 10:43:57   Closed  2021-05-31 12:53:57
4   2021-06-27 17:53:29  Resolved                 NaT"""), sep=r"\s\s+", engine="python")
df["login"] = pd.to_datetime(df["login"])
df["start"] = pd.to_datetime(df["start"])

# Fill missing start values from login for every row whose Status is not WIP.
not_wip = df["Status"].ne("WIP")
df.loc[not_wip, "start"] = df.loc[not_wip, "start"].fillna(df["login"])

df
                login    Status               start
0 2021-05-28 09:29:35  Resolved 2021-05-28 09:29:35
1 2021-05-28 11:46:11    Closed 2021-05-28 11:46:11
2 2021-05-29 15:59:16       WIP                 NaT
3 2021-05-30 10:43:57    Closed 2021-05-31 12:53:57
4 2021-06-27 17:53:29  Resolved 2021-06-27 17:53:29
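
Note that excluding WIP matches the question only because the sample contains no statuses other than WIP, Resolved and Closed. A sketch of the same loc[] idea with the question's condition written out explicitly (start is null and Status is Resolved or Closed) might look like this; the hand-built sample frame and the name cond are assumptions for illustration:

```python
import pandas as pd

# Sample data rebuilt from the question (this construction is an assumption).
df = pd.DataFrame({
    "login": pd.to_datetime([
        "2021-05-28 09:29:35", "2021-05-28 11:46:11", "2021-05-29 15:59:16",
        "2021-05-30 10:43:57", "2021-06-27 17:53:29",
    ]),
    "Status": ["Resolved", "Closed", "WIP", "Closed", "Resolved"],
    "start": pd.to_datetime([None, None, None, "2021-05-31 12:53:57", None]),
})

# Rows to fill: start is missing AND Status is one of the two target values.
cond = df["start"].isna() & df["Status"].isin(["Resolved", "Closed"])

# Copy login into start for just those rows; all other rows are untouched.
df.loc[cond, "start"] = df.loc[cond, "login"]
```

Spelling out both parts of the condition keeps the behavior correct if other statuses appear in the data later.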