I have a dataset with two columns:
Quantity SKU
1,1 2494008,2493953
1,1,1 2167550,1336380,2365409
3,2,1,6,1 1428608,1137956,2401393,2679310,2579183
The desired end state is a dataset that looks like this:
Quantity SKU
1 2494008
1 2493953
1 2167550
1 1336380
1 2365409
3 1428608
2 1137956
1 2401393
6 2679310
1 2579183
cSplit and strsplit work when a single variable has to be split, but I need to split two variables in parallel (Quantity and SKU above).
Answer 0: (score: 0)
dat <- read.table(text="Quantity SKU
1,1 2494008,2493953
1,1,1 2167550,1336380,2365409
3,2,1,6,1 1428608,1137956,2401393,2679310,2579183", header=TRUE, stringsAsFactors=FALSE)
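# Option 1: split each column explicitly and rebuild the data frame
dat2 <- data.frame(Quantity = unlist(strsplit(dat$Quantity, split = ",")),
                   SKU = unlist(strsplit(dat$SKU, split = ",")),
                   row.names = NULL)
# Option 2: apply the same split to every column, then bind the results
dat3 <- as.data.frame(do.call(cbind, lapply(dat, function(x) unlist(strsplit(x, ",")))))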
# Quantity SKU
# 1 1 2494008
# 2 1 2493953
# 3 1 2167550
# 4 1 1336380
# 5 1 2365409
# 6 3 1428608
# 7 2 1137956
# 8 1 2401393
# 9 6 2679310
# 10 1 2579183
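Both approaches above return the split values as character strings and silently assume that each row holds the same number of quantities and SKUs. A minimal base R sketch that checks this assumption and converts the result to integers could look like the following; the names `parts` and `long` are only illustrative, and the integer conversion assumes every SKU fits in R's integer range, as the sample data does.

```r
dat <- data.frame(
  Quantity = c("1,1", "1,1,1", "3,2,1,6,1"),
  SKU = c("2494008,2493953", "2167550,1336380,2365409",
          "1428608,1137956,2401393,2679310,2579183"),
  stringsAsFactors = FALSE
)

# Split every column into a list of character vectors, one element per row
parts <- lapply(dat, strsplit, split = ",", fixed = TRUE)

# Guard: each row must yield as many quantities as SKUs,
# otherwise the flattened columns would be misaligned
stopifnot(identical(lengths(parts$Quantity), lengths(parts$SKU)))

# Flatten the lists and convert the pieces to integers
long <- data.frame(
  Quantity = as.integer(unlist(parts$Quantity)),
  SKU = as.integer(unlist(parts$SKU))
)
str(long)
```

If a row had mismatched counts, the `stopifnot()` call would stop with an error instead of pairing quantities with the wrong SKUs.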
Answer 1: (score: 0)
Unsurprisingly, the data.table solution is very similar to the base R solution proposed by lmo:
library(data.table)
data.table(dat)[, lapply(.SD, function(x) unlist(strsplit(x, ",")))]
#     Quantity     SKU
#  1:        1 2494008
#  2:        1 2493953
#  3:        1 2167550
#  4:        1 1336380
#  5:        1 2365409
#  6:        3 1428608
#  7:        2 1137956
#  8:        1 2401393
#  9:        6 2679310
# 10:        1 2579183
If needed, the original row number can be kept:
data.table(dat)[, rn := .I][, lapply(.SD, function(x) unlist(strsplit(x, ","))), rn]
#     rn Quantity     SKU
#  1:  1        1 2494008
#  2:  1        1 2493953
#  3:  2        1 2167550
#  4:  2        1 1336380
#  5:  2        1 2365409
#  6:  3        3 1428608
#  7:  3        2 1137956
#  8:  3        1 2401393
#  9:  3        6 2679310
# 10:  3        1 2579183
dat <- structure(list(Quantity = c("1,1", "1,1,1", "3,2,1,6,1"), SKU = c("2494008,2493953",
"2167550,1336380,2365409", "1428608,1137956,2401393,2679310,2579183"
)), .Names = c("Quantity", "SKU"), class = "data.frame", row.names = c(NA, -3L))
Answer 2: (score: 0)
This can also be done with separate_rows from tidyr.
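The code for this answer is not shown above, so the following is only a sketch of how `separate_rows` might be applied to the sample data, assuming tidyr is installed. The `convert = TRUE` argument is my addition, not something the answer states; it asks tidyr to turn the split strings into integers where possible. In tidyr 1.3.0 and later `separate_rows` is marked as superseded by `separate_longer_delim`, but it is still available.

```r
library(tidyr)

dat <- data.frame(
  Quantity = c("1,1", "1,1,1", "3,2,1,6,1"),
  SKU = c("2494008,2493953", "2167550,1336380,2365409",
          "1428608,1137956,2401393,2679310,2579183"),
  stringsAsFactors = FALSE
)

# Split Quantity and SKU in parallel on commas, producing one row per pair;
# convert = TRUE (an assumption here) converts the pieces to integers
res <- separate_rows(dat, Quantity, SKU, sep = ",", convert = TRUE)
print(res)
```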