Question

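In R, how do I check, for each row, whether any value from a list (e.g. 2, 3 or 4) appears in any of three columns, and then change a fourth column for that row?

Say I have a df:

I want to write (without a for loop): if, in row n, column A or B or C == 2 or 3 or 4, then D[n, ] = 1, else = 0

Basically, check row by row whether any of my numbers appears in any of the three specific columns; if it does, set the fourth column to 1, otherwise to 0.

Thanks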

Answer 1

Here is how you could do it with dplyr:

library(dplyr)
library(magrittr)                                 # Provides the %<>% assignment pipe used below
test <- data.frame(A = c(1, 2, 3), 
                   B = c(1, 1, 1), 
                   C = c(1, 1, 1))

testColumns <- c(2, 3, 4)                         # Values you want to flag

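Now that we have the data frame and a vector of the values to flag in the new column, we use rowwise() to tell R to look at the data frame one row at a time, combined with mutate() to create a new column D based on the various cases.
We specify the test cases and use case_when() to give the value each one should produce.

Let's use the assignment pipe %<>% (from magrittr) rather than assigning the result of the pipeline to a new object.

Here is how we do it: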

test %<>%                                         # Assignment pipe (from magrittr)
  rowwise() %>%                                   # Look at test on a 'by row' basis
  mutate(D = case_when(A %in% testColumns ~ 1,    # use mutate to create a new column D
                       B %in% testColumns ~ 1,
                       C %in% testColumns ~ 1, 
                       TRUE               ~ 0))

This gives us the following table:

print(test)
## A tibble: 3 x 4
#      A     B     C     D
#  <dbl> <dbl> <dbl> <dbl>
#1     1     1     1     0
#2     2     1     1     1
#3     3     1     1     1

Here are some useful links for the functions we used:
mutate()
rowwise()
case_when()
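
Because %in% is already vectorised over a whole column, the same flag can also be computed without rowwise(). The following is only a sketch, assuming dplyr 1.0.4 or later (where if_any() is available); the names flagged and flagValues are made up for illustration and are not from the answer above.

```r
library(dplyr)

flagged <- data.frame(A = c(1, 2, 3),
                      B = c(1, 1, 1),
                      C = c(1, 1, 1))
flagValues <- c(2, 3, 4)   # hypothetical name for the values to flag

# if_any() is TRUE for a row when the predicate holds in at least one
# of the selected columns; as.integer() turns TRUE/FALSE into 1/0.
flagged <- flagged %>%
  mutate(D = as.integer(if_any(c(A, B, C), ~ .x %in% flagValues)))
```

Here D is an integer column (0, 1, 1 for the sample data) rather than the double column that case_when() produces above.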

Answer 2

You can use apply:

vec <- 2:4
df1$D <- apply(df1,1, function(x) any(vec %in% x)) +0
#   A B C D
# 1 1 1 1 0
# 2 2 1 1 1
# 3 3 1 1 1

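Or a tidyverse version, which may be more efficient because apply involves a matrix conversion:

library(tidyverse)
df1 %>% mutate(D = pmap_int(., ~ any(vec %in% c(...))))   # c(...) collects every column of the row; a bare . would be the first column only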
#   A B C D
# 1 1 1 1 0
# 2 2 1 1 1
# 3 3 1 1 1

Data

df1 <- data.frame(A = c(1, 2,3), 
                   B = c(1, 1, 1), 
                   C = c(1, 1, 1))
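
apply() calls the function once per row. As a rough sketch of a base R alternative that avoids the per-row call, the membership test can be run once over a matrix of the three columns and then summarised with rowSums(). The names df2, vals, m and hits are introduced here for illustration only.

```r
df2 <- data.frame(A = c(1, 2, 3),
                  B = c(1, 1, 1),
                  C = c(1, 1, 1))
vals <- 2:4

# Convert only the columns of interest to a matrix, once.
m <- as.matrix(df2[c("A", "B", "C")])

# %in% returns a plain vector, so restore the matrix shape
# (column-major, same as the original matrix).
hits <- matrix(m %in% vals, nrow = nrow(m))

# A row is flagged when at least one of its cells matched.
df2$D <- as.integer(rowSums(hits) > 0)
```

This assumes the selected columns share a common type, since as.matrix() coerces them to one.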

Answer 3

With only these three conditions, you could do

Data


Answer 4

Here is one way using data.table:

library(data.table)
test <- data.table(A = c(1, 2,3), 
                   B = c(1, 1, 1), 
                   C = c(1, 1, 1))
checkValues <- c(2, 3, 4)

test[, c("D"):= Reduce(`|`, lapply(.SD, function(x){x %in% checkValues}))]

test
   A B C     D
1: 1 1 1 FALSE
2: 2 1 1  TRUE
3: 3 1 1  TRUE

It is easy to get FALSE = 0 | TRUE = 1 instead (replace Reduce(`|`, lapply(.SD, function(x){x %in% c(2, 3, 4)})) with as.numeric(Reduce(`|`, lapply(.SD, function(x){x %in% c(2, 3, 4)})))), but you are using D to hold a logical value, so keeping it as a logical vector makes sense to me.

This also updates test by reference to have the column D, which is more efficient.

Two other answers that may be worth a look: "Finding rows containing a value (or values) in any column" and "Add multiple columns to R data.table in one function call?"
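
For completeness, a sketch of the 0/1 variant described above. It uses as.integer() instead of as.numeric() and restricts the check to named columns through .SDcols; the names dt and checkCols are assumptions added for this example.

```r
library(data.table)

dt <- data.table(A = c(1, 2, 3),
                 B = c(1, 1, 1),
                 C = c(1, 1, 1))
checkValues <- c(2, 3, 4)
checkCols <- c("A", "B", "C")   # hypothetical: the columns to inspect

# .SDcols limits .SD to the listed columns; Reduce(`|`, ...) ORs the
# per-column logical vectors together, and := adds D by reference.
dt[, D := as.integer(Reduce(`|`, lapply(.SD, function(x) x %in% checkValues))),
   .SDcols = checkCols]
```

Restricting .SDcols keeps the check from picking up other columns that may be added to the table later.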

Answer 5

One way to do it in the tidyverse:

library(tidyverse)
df %>%
 rowid_to_column() %>% #Creating an unique row ID
 gather(var, val, -rowid) %>% #Transforming the data from wide to long
 group_by(rowid) %>% #Grouping
 mutate(D = ifelse(any(val %in% c(2, 3, 4)), 1, 0)) %>% #Testing whether any value from a given row is in the specified list 
 spread(var, val) %>% #Returning the data to wide format
 ungroup() %>%
 select(-rowid) #Deleting the redundant variable

      D     A     B     C
  <dbl> <int> <int> <int>
1    0.     1     1     1
2    1.     2     1     1
3    1.     3     1     1
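
gather() and spread() are superseded in current tidyr releases. A sketch of the same reshape-and-test idea with pivot_longer() and pivot_wider(), assuming tidyr 1.0.0 or later; the sample tibble wide and the name result are introduced here for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

wide <- tibble(A = c(1, 2, 3),
               B = c(1, 1, 1),
               C = c(1, 1, 1))

result <- wide %>%
  rowid_to_column() %>%                               # unique row ID
  pivot_longer(-rowid, names_to = "var",
               values_to = "val") %>%                 # wide to long
  group_by(rowid) %>%
  mutate(D = as.integer(any(val %in% c(2, 3, 4)))) %>% # test each original row
  ungroup() %>%
  pivot_wider(names_from = var, values_from = val) %>% # back to wide
  select(-rowid)                                      # drop the helper ID
```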

Answer 6

Parameterized over the column names and the numbers of interest.

library(tidyverse)

data <-
  data.frame(
    A = c(1, 2, 3), 
    B = c(1, 1, 1), 
    C = c(1, 1, 1)
  )

nums <- c(2, 3, 4)
cols <- c('A', 'B', 'C')

data$D <-
  data[, cols] %>%
  map(~.x %in% nums) %>%
  reduce(`|`)
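
The same idea can be wrapped in a small helper using base R only. This is a sketch, not part of the answer above: flag_rows() and the example data frame are hypothetical names, and the result is returned as 0/1 rather than logical to match the question.

```r
# Returns 1 for rows where any of `cols` holds a value from `nums`, else 0.
flag_rows <- function(data, cols, nums) {
  hits <- lapply(data[cols], function(col) col %in% nums)  # one logical vector per column
  as.integer(Reduce(`|`, hits))                            # OR them together row-wise
}

example <- data.frame(A = c(1, 2, 3),
                      B = c(1, 1, 1),
                      C = c(4, 1, 1))

example$D_ab  <- flag_rows(example, c("A", "B"), c(2, 3, 4))
example$D_abc <- flag_rows(example, c("A", "B", "C"), c(2, 3, 4))
```

With this data, the first row is flagged only when column C is included in cols.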
