How to get the total of a variable in each row

Time: 2018-06-01 05:03:46

Tags: r

I have a data frame named sp that looks like this:
Join  p1  sp1   p2  sp2  p3  sp3
1     0   0     0   0    0   0
2     1   pine  0   0    1   Aspen
3     2   pine  0   0    0   0

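The data frame continues for 100 rows, where p1 is the count of the species named in column sp1, and so on. Now I want to create a new variable, pine, that holds the total number of pine trees in each row (Join).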

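2 Answers:

Answer 0: (score: 0)

A simple row-wise apply will do. I use grep to subset the data.frame to the columns whose names start with "sp". Note that this counts how many sp columns equal "pine" in each row; it does not add up the matching p values.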

pine <- apply(sp[grep("^sp", names(sp))], 1, function(x) sum(x == "pine"))
pine
#[1] 0 1 1

Data.

sp <- 
structure(list(Join = 1:3, p1 = 0:2, sp1 = structure(c(1L, 2L, 
2L), .Label = c("0", "pine"), class = "factor"), p2 = c(0L, 0L, 
0L), sp2 = c(0L, 0L, 0L), p3 = c(0L, 1L, 0L), sp3 = structure(c(1L, 
2L, 1L), .Label = c("0", "Aspen"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))
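
If the goal is instead the total pine count per row (the sum of the p values whose paired sp column says "pine"), a base R variant could look like the sketch below. It assumes every sp<i> column has a p<i> partner with the same numeric suffix; the sample data frame and the helper names sp_cols, p_cols and is_pine are illustrative, not part of the original answer.

```r
# Small sample mirroring the question's layout (p<i> = count, sp<i> = species).
sp <- data.frame(
  Join = 1:3,
  p1 = c(0L, 1L, 2L), sp1 = c("0", "pine", "pine"),
  p2 = c(0L, 0L, 0L), sp2 = c("0", "0", "0"),
  p3 = c(0L, 1L, 0L), sp3 = c("0", "Aspen", "0"),
  stringsAsFactors = FALSE
)

# Find the species columns, then derive the paired count columns by name
# (assumption: "sp1" pairs with "p1", "sp2" with "p2", and so on).
sp_cols <- grep("^sp[0-9]+$", names(sp), value = TRUE)
p_cols  <- sub("^s", "", sp_cols)

# Logical matrix: TRUE where a species cell is "pine".
is_pine <- as.matrix(sp[sp_cols]) == "pine"

# Zero out the counts of non-pine cells, then add up each row.
sp$pine <- rowSums(as.matrix(sp[p_cols]) * is_pine)
sp$pine
# [1] 0 1 2
```

Row 3 now gives 2 rather than 1, because the p1 value is summed instead of counting the matching column once.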

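Answer 1: (score: 0)

You can convert the data to long format to perform the calculation. Once the data is in long format, fuzzyjoin::regex_inner_join lets you join the paired values (e.g. p1 vs sp1).

An option using tidyverse could be: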

library(tidyverse)
library(fuzzyjoin)

# Total count per species within each row

df %>% gather(Species, value, -Join) %>% 
  mutate(Join = as.character(Join))  %>% {
    regex_inner_join(filter(., grepl("^s",Species)),
              filter(.,grepl("^p",Species)),
              by = c("Join", "Species"))
} %>%
  filter(value.x != "0") %>%
  group_by(Join.x, value.x) %>%
  summarise(count = sum(as.numeric(value.y))) %>% as.data.frame()

#   Join.x value.x count
# 1      2   Aspen     1
# 2      2    pine     1
# 3      3    pine     2

# Total count over all species in each row
df %>% gather(Species, value, -Join) %>% 
  mutate(Join = as.character(Join))  %>% {
    regex_inner_join(filter(., grepl("^s",Species)),
              filter(.,grepl("^p",Species)),
              by = c("Join", "Species"))
} %>%
  group_by(Join.x) %>%
  summarise(count = sum(as.numeric(value.y))) %>% as.data.frame()

#   Join.x count
# 1      1     0
# 2      2     2
# 3      3     2

Data:

df <- read.table(text = 
"Join p1 sp1  p2 sp2 p3 sp3
1     0  0    0  0   0  0
2     1  pine 0  0   1  Aspen
3     2  pine 0  0   0  0",
header = TRUE, stringsAsFactors = FALSE)
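
The long-format idea in this answer can also be sketched without a regex join, by letting tidyr::pivot_longer pair each p<i> with its sp<i> through the column names. This is a swapped-in technique, not the answer's own method; it assumes dplyr (>= 1.0) and tidyr (>= 1.0) are installed, and the names long, idx and pine_by_row are illustrative. Because Join is matched exactly rather than as a regular expression, it may also avoid Join 1 being paired with Join 10 in a longer table.

```r
library(dplyr)
library(tidyr)

# Same sample as the Data section above, built inline so the block runs alone.
df <- data.frame(
  Join = 1:3,
  p1 = c(0L, 1L, 2L), sp1 = c("0", "pine", "pine"),
  p2 = c(0L, 0L, 0L), sp2 = c(0L, 0L, 0L),
  p3 = c(0L, 1L, 0L), sp3 = c("0", "Aspen", "0"),
  stringsAsFactors = FALSE
)

long <- df %>%
  # sp columns can be read as integer when they hold only 0, so unify the type.
  mutate(across(starts_with("sp"), as.character)) %>%
  # ".value" turns the prefix (p / sp) into columns; idx keeps the suffix.
  pivot_longer(
    cols = -Join,
    names_to = c(".value", "idx"),
    names_pattern = "^(s?p)(\\d+)$"
  )

# One row per Join: sum the counts whose paired species is "pine".
pine_by_row <- long %>%
  group_by(Join) %>%
  summarise(pine = sum(p[sp == "pine"]), .groups = "drop")

as.data.frame(pine_by_row)
```

Grouping by both Join and sp instead of Join alone would give the per-species totals shown in the first result above.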