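Insert column names into their values using R

Time: 2018-03-08 12:32:48

Tags: r text dplyr data.table reshape2

I need to insert the column name (for example, Department) into that column's values. I have code like this: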

Department <- c("Store1","Store2","Store3","Store4","Store5")
Department2 <- c("IT1","IT2","IT3","IT4","IT5")
x <- c(100,200,300,400,500)
Result <- data.frame(Department,Department2,x)
Result

The expected result is as follows:

Department <- c("Department_Store1","Department_Store2","Department_Store3","Department_Store4","Department_Store5")
Department2 <- c("Department2_IT1","Department2_IT2","Department2_IT3","Department2_IT4","Department2_IT5")
x <- c(100,200,300,400,500)
Expected.Result <- data.frame(Department,Department2,x)
Expected.Result

Can anyone help? Thanks.

3 answers:

Answer 0: (score: 2)

Another way, with dplyr and tidyr:

library(dplyr)
library(tidyr)

# Convert to character to avoid warning message, will convert all columns to character
Result[] <- lapply(Result, as.character)

Result %>%
  mutate_if(is.factor, as.character) %>% # optional, only convert factor to character, retain all other types
  gather(key, value, -x) %>% 
  mutate(var = paste(key, value, sep = "_")) %>% 
  select(-value) %>% 
  spread(key,var)

    x        Department     Department2
1 100 Department_Store1 Department2_IT1
2 200 Department_Store2 Department2_IT2
3 300 Department_Store3 Department2_IT3
4 400 Department_Store4 Department2_IT4
5 500 Department_Store5 Department2_IT5

Data:

Result <- data.frame(
  Department = c("Store1","Store2","Store3","Store4","Store5"),
  Department2 = c("IT1","IT2","IT3","IT4","IT5"),
  x = c(100,200,300,400,500)
)
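
As a possible alternative to the gather/spread round trip above, the same prefixing can be sketched with across() and cur_column(). This is a different technique from the one in the answer and assumes dplyr 1.0.0 or later; the name Prefixed and the starts_with("Department") selection are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same sample data as in the question; stringsAsFactors = FALSE keeps the
# text columns as character regardless of R version.
Result <- data.frame(
  Department = c("Store1", "Store2", "Store3", "Store4", "Store5"),
  Department2 = c("IT1", "IT2", "IT3", "IT4", "IT5"),
  x = c(100, 200, 300, 400, 500),
  stringsAsFactors = FALSE
)

# cur_column() returns the name of the column currently being processed
# inside across(), so each value is prefixed with its own column name.
# Prefixed is a hypothetical name used only for this sketch.
Prefixed <- Result %>%
  mutate(across(starts_with("Department"),
                ~ paste(cur_column(), .x, sep = "_")))

Prefixed
```

Because nothing is reshaped, the column order and the numeric type of x are left as they were.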

Answer 1: (score: 1)

If you collect the column names in question into a vector dep_col, a for loop gives a clean base R solution.
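
The code for this answer is not preserved here. A minimal sketch of what such a loop might look like follows, assuming dep_col simply lists the two Department columns from the question; the loop variable col is an illustrative name.

```r
# Sample data from the question.
Result <- data.frame(
  Department = c("Store1", "Store2", "Store3", "Store4", "Store5"),
  Department2 = c("IT1", "IT2", "IT3", "IT4", "IT5"),
  x = c(100, 200, 300, 400, 500),
  stringsAsFactors = FALSE
)

# Assumption: the columns to modify are chosen by hand.
dep_col <- c("Department", "Department2")

for (col in dep_col) {
  # paste() coerces its arguments to character, so this also works
  # if the column is stored as a factor.
  Result[[col]] <- paste(col, Result[[col]], sep = "_")
}

Result
```

Columns not listed in dep_col, such as x, are left untouched.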

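Answer 2: (score: 1)

If I understand correctly, the OP wants to prepend the respective column name to the values of every column whose name starts with "Department".

Edit: At the OP's request, the code that selects the columns has been generalized to match other column names as well.

Here is a solution using data.table's fast set() function: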

library(data.table)
setDT(Result)
cols <- stringr::str_subset(names(Result), "^(Department|Division|Team)")
for (j in cols) {
  set(Result, NULL, j, paste(j, Result[[j]], sep = "_"))
}
Result
          Department     Department2   x
1: Department_Store1 Department2_IT1 100
2: Department_Store2 Department2_IT2 200
3: Department_Store3 Department2_IT3 300
4: Department_Store4 Department2_IT4 400
5: Department_Store5 Department2_IT5 500

Note that set() updates by reference, i.e., without copying the whole object.
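
One way to see the by-reference behavior is to compare the object's memory address before and after the loop with data.table's address() function. This is only a sketch; the names addr_before and addr_after are illustrative and not part of the original answer.

```r
library(data.table)

# Build the sample data directly as a data.table.
Result <- data.table(
  Department = c("Store1", "Store2", "Store3", "Store4", "Store5"),
  Department2 = c("IT1", "IT2", "IT3", "IT4", "IT5"),
  x = c(100, 200, 300, 400, 500)
)

# Record where the data.table lives before modifying it.
addr_before <- address(Result)

for (j in c("Department", "Department2")) {
  # i = NULL means "all rows"; column j is replaced in place.
  set(Result, NULL, j, paste(j, Result[[j]], sep = "_"))
}

# Record the address again after the updates.
addr_after <- address(Result)

# If the table was modified in place, both addresses should match.
identical(addr_before, addr_after)
```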