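Select rows matching specific conditions, then replace a value in those rows with the value from another column

Time: 2017-05-08 19:48:52

Tags: python pandas parsing

If 'state' is '4. Closed' and 'closeDate' is '2017/3/27', then change/replace/update 'closeDate' using the date value from the 'correctCloseDate' column.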

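Here I tried to find the rows matching these conditions, but I don't know how to replace the values in those rows with the values from 'correctCloseDate'.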

import pandas as pd

df = pd.DataFrame(
    {"ID": ['A', 'B', 'C', 'D', 'E'],
     "state": ['3. Cancelled', '4. Closed', '4. Closed', '3. Cancelled', '4. Closed'],
     "closeDate": ['2017/4/12', '2017/3/27', '2017/4/1', '2017/4/29', '2017/3/27'],
     "correctCloseDate": ['', '2017/1/5', '', '', '2017/2/27']
     })

I get an error saying:

TypeError: only list-like objects are allowed to be passed to isin(), you passed a [str]
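This error typically appears when a bare string is passed to `Series.isin`, which expects a list-like argument. The code that raised it is not shown above, so the following is only a sketch of the likely cause and two ways around it, using a cut-down version of the sample data:

```python
import pandas as pd

# Cut-down sample data (illustrative subset of the question's DataFrame).
df = pd.DataFrame({
    "state": ['3. Cancelled', '4. Closed', '4. Closed'],
    "closeDate": ['2017/4/12', '2017/3/27', '2017/4/1'],
})

# Passing a bare string to isin() raises a TypeError like the one quoted above.
try:
    df['closeDate'].isin('2017/3/27')
    isin_error = None
except TypeError as exc:
    isin_error = exc

# Option 1: wrap the single value in a list.
mask_isin = df['state'].isin(['4. Closed']) & df['closeDate'].isin(['2017/3/27'])

# Option 2: for a single value, a plain equality comparison is enough.
mask_eq = (df['state'] == '4. Closed') & (df['closeDate'] == '2017/3/27')

print(mask_isin.tolist())
```

Either mask can then be used to select or update the matching rows; `isin` is mainly worth it when several values have to be matched at once.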

The result I expect would look like this:

[image: expected result table]

Any help would be appreciated!

2 answers:

Answer 0: (score: 0)

I think you need to_datetime. I also added the parameter errors='coerce' so that values which are not valid dates are converted to NaT (the NaN equivalent for dates in pandas):

#if necessary convert to datetime
df['closeDate'] = pd.to_datetime(df['closeDate'])
df['correctCloseDate'] = pd.to_datetime(df['correctCloseDate'], errors='coerce')

Then create a boolean mask, use Series.mask to replace the values selected by the mask, and finally remove the column that is no longer needed with drop:

mask = (df['state'] == '4. Closed') & (df['closeDate'] == '2017-03-27')
df['closeDate'] = df['closeDate'].mask(mask, df['correctCloseDate'])
df = df.drop('correctCloseDate', axis=1)
print (df)
  ID  closeDate         state
0  A 2017-04-12  3. Cancelled
1  B 2017-01-05     4. Closed
2  C 2017-04-01     4. Closed
3  D 2017-04-29  3. Cancelled
4  E 2017-02-27     4. Closed

An alternative is to use loc for the replacement:

mask = (df['state'] == '4. Closed') & (df['closeDate'] == '2017-03-27')
df.loc[mask, 'closeDate'] = df['correctCloseDate']
df = df.drop('correctCloseDate', axis=1)
print (df)
  ID  closeDate         state
0  A 2017-04-12  3. Cancelled
1  B 2017-01-05     4. Closed
2  C 2017-04-01     4. Closed
3  D 2017-04-29  3. Cancelled
4  E 2017-02-27     4. Closed

A solution with strings only (without converting to datetime) is also possible, but then change 2017-03-27 to 2017/3/27 in the mask.
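For the strings-only variant, a minimal self-contained sketch might look like the following. It reuses the sample data from the question and skips the datetime conversion, so the comparison value has to match the stored string exactly:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ['A', 'B', 'C', 'D', 'E'],
    "state": ['3. Cancelled', '4. Closed', '4. Closed', '3. Cancelled', '4. Closed'],
    "closeDate": ['2017/4/12', '2017/3/27', '2017/4/1', '2017/4/29', '2017/3/27'],
    "correctCloseDate": ['', '2017/1/5', '', '', '2017/2/27'],
})

# Columns stay as strings, so compare against '2017/3/27' rather than '2017-03-27'.
mask = (df['state'] == '4. Closed') & (df['closeDate'] == '2017/3/27')

# Where the mask is True, take the value from correctCloseDate; elsewhere keep closeDate.
df['closeDate'] = df['closeDate'].mask(mask, df['correctCloseDate'])
df = df.drop('correctCloseDate', axis=1)

print(df['closeDate'].tolist())
# ['2017/4/12', '2017/1/5', '2017/4/1', '2017/4/29', '2017/2/27']
```

The trade-off is that string comparison is purely textual: '2017/03/27' or '2017-03-27' would not match, which is one reason to prefer the datetime conversion when the input format is not uniform.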

Answer 1: (score: 0)

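# You can use numpy.where to pick, for each row, the corrected date where the condition holds and the original date otherwise, and put the result in a new column (or back to correctCloseDate if you want).
# Note: the comparison with '2017-03-27' assumes closeDate has been converted to datetime; with the original strings, compare against '2017/3/27' instead.

import numpy as np

df['final_correctCloseDate'] = np.where((df['state'] == '4. Closed') & (df['closeDate'] == '2017-03-27'), df['correctCloseDate'], df['closeDate'])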
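To try the numpy.where approach end to end, here is a small self-contained sketch. It assumes the sample data from the question and a prior conversion of both date columns with pd.to_datetime, which is what makes the comparison against '2017-03-27' match:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": ['A', 'B', 'C', 'D', 'E'],
    "state": ['3. Cancelled', '4. Closed', '4. Closed', '3. Cancelled', '4. Closed'],
    "closeDate": ['2017/4/12', '2017/3/27', '2017/4/1', '2017/4/29', '2017/3/27'],
    "correctCloseDate": ['', '2017/1/5', '', '', '2017/2/27'],
})

# Convert both columns; empty strings in correctCloseDate become NaT.
df['closeDate'] = pd.to_datetime(df['closeDate'])
df['correctCloseDate'] = pd.to_datetime(df['correctCloseDate'], errors='coerce')

cond = (df['state'] == '4. Closed') & (df['closeDate'] == '2017-03-27')

# Row by row: corrected date where cond is True, original date otherwise.
df['final_correctCloseDate'] = np.where(cond, df['correctCloseDate'], df['closeDate'])

result = df['final_correctCloseDate'].dt.strftime('%Y-%m-%d').tolist()
print(result)
```

Unlike the Series.mask and loc variants, this leaves closeDate untouched and writes the result to a separate column, which can be convenient for comparing old and new values.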