Conditionally replace NA

Time: 2020-09-10 02:08:40

Tags: r dplyr

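This is an enhanced version of my own question, since I could not explain it clearly in the comments.

There are only 2 farms, so each fruit appears twice in the df below. I only want to replace NA with 0 in certain cases: for example, pear has y2019 values c(NA, 7), and I would like the output to be c(0, 7) instead.

Sample data:

df <- data.frame(fruit = c("apple", "apple", "peach", "peach", "pear", "pear", "lime", "lime"),
                 farm = as.factor(c(1,2,1,2,1,2,1,2)),
                 'y2019' = c(NA,NA,3,12,NA,7,4,6),
                 'y2018' = c(5,3,NA,NA,8,2,NA,NA),
                 'y2017' = c(4,5,7,15,NA,NA,1,NA))

> df
  fruit farm y2019 y2018 y2017
1 apple    1    NA     5     4
2 apple    2    NA     3     5
3 peach    1     3    NA     7
4 peach    2    12    NA    15
5  pear    1    NA     8    NA
6  pear    2     7     2    NA
7  lime    1     4    NA     1
8  lime    2     6    NA    NA

This comes close:

df %>%
  group_by(fruit) %>%
  mutate(across(where(is.numeric), ~ if (any(is.na(.))) 0 else .)) %>%
  ungroup()

But:

  1. The 7 in pear is knocked out, yielding c(0,0).

  2. I want to leave NA in place when both farms are NA.

Desired result: pear's y2019 becomes c(0, 7), while a fruit whose two farms are both NA in a column keeps NA there.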

2 Answers:

Answer 0: (score: 1)

You can try:

library(dplyr)

df %>%
  group_by(fruit) %>%
  mutate(across(where(is.numeric), ~ if(any(!is.na(.))) 
                replace(., is.na(.), 0)  else .)) %>%
  ungroup()

# A tibble: 8 x 5
#  fruit farm  y2019 y2018 y2017
#  <chr> <fct> <dbl> <dbl> <dbl>
#1 apple 1        NA     5     4
#2 apple 2        NA     3     5
#3 peach 1         3    NA     7
#4 peach 2        12    NA    15
#5 pear  1         0     8    NA
#6 pear  2         7     2    NA
#7 lime  1         4    NA     1
#8 lime  2         6    NA     0

So we replace NA with 0 only if the group contains at least one non-NA value.
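To see that condition in isolation, here is a minimal base R sketch that applies the same logic to single vectors, each standing in for one fruit group in one column. The names `attempt` and `fill_na_if_any_value` are made up for illustration and are not part of dplyr.

```r
# Hypothetical helpers for illustration; each call stands in for one
# (fruit, column) group inside mutate(across(...)).

# The question's attempt: returns a single 0 whenever any NA is present,
# which mutate() then recycles over the whole group.
attempt <- function(x) if (any(is.na(x))) 0 else x

# This answer's condition: fill NAs only when the group has a non-NA value.
fill_na_if_any_value <- function(x) {
  if (any(!is.na(x))) replace(x, is.na(x), 0) else x
}

pear_y2019  <- c(NA, 7)   # one farm reported a value
apple_y2019 <- c(NA, NA)  # neither farm reported a value

attempt(pear_y2019)                # 0 (length 1, so the 7 is lost)
fill_na_if_any_value(pear_y2019)   # 0 7
fill_na_if_any_value(apple_y2019)  # NA NA
```

The difference is that `replace()` keeps the vector's length and touches only the NA positions, whereas the attempt returns a bare `0` for the whole group.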

Answer 1: (score: 1)

We can use replace_na from tidyr to replace NA with 0 if there are any non-NA elements in the group, or else return the values unchanged

library(dplyr)
library(tidyr)
df %>%
  group_by(fruit) %>%
  mutate(across(where(is.numeric), ~ if(any(!is.na(.))) replace_na(., 0) else .)) %>%
  ungroup()
# A tibble: 8 x 5
#  fruit farm  y2019 y2018 y2017
#  <chr> <fct> <dbl> <dbl> <dbl>
#1 apple 1        NA     5     4
#2 apple 2        NA     3     5
#3 peach 1         3    NA     7
#4 peach 2        12    NA    15
#5 pear  1         0     8    NA
#6 pear  2         7     2    NA
#7 lime  1         4    NA     1
#8 lime  2         6    NA     0

Or another option without if/else: after grouping by 'fruit', combine two logical expressions in the index passed to replace

df %>%
  group_by(fruit) %>%
  mutate(across(where(is.numeric),
                ~ replace(., sum(!is.na(.)) > 0 & is.na(.), 0)))
# A tibble: 8 x 5
# Groups:   fruit [4]
#  fruit farm  y2019 y2018 y2017
#  <chr> <fct> <dbl> <dbl> <dbl>
#1 apple 1        NA     5     4
#2 apple 2        NA     3     5
#3 peach 1         3    NA     7
#4 peach 2        12    NA    15
#5 pear  1         0     8    NA
#6 pear  2         7     2    NA
#7 lime  1         4    NA     1
#8 lime  2         6    NA     0
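
The index in that last variant combines a length-one logical with a per-element logical. Below is a minimal base R sketch of the same expression on plain vectors; the variable names are illustrative only. Since `>` binds more tightly than `&`, `sum(!is.na(x)) > 0` is evaluated first and the single TRUE or FALSE is then recycled across `is.na(x)`.

```r
x <- c(NA, 7)                # a group with one non-NA value
y <- c(NA_real_, NA_real_)   # a group with no values at all

has_value <- sum(!is.na(x)) > 0        # TRUE
has_value & is.na(x)                   # TRUE FALSE
x_filled <- replace(x, has_value & is.na(x), 0)
x_filled                               # 0 7

# With no non-NA values the whole index is FALSE, so nothing is replaced.
y_filled <- replace(y, sum(!is.na(y)) > 0 & is.na(y), 0)
y_filled                               # NA NA
```

Because this variant has no `ungroup()` at the end, the result is still grouped by fruit, which is why its output carries the extra `# Groups:` line.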