I have a directory containing a large number of files; every file has the same structure:
Nodes: 6606
Edges: 382386
Average degree: 115.76930063578565
Average clustering: 0.11213868344294504
Modularity: 0.6021229084216876
Giant component: 6598
Using the list.files() function, I read the contents of the directory:
files <- list.files(path = "test", pattern = "netstat*", full.names = TRUE)
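One caveat about the call above: the `pattern` argument of list.files() is a regular expression, not a shell glob, so "netstat*" means "netsta" followed by zero or more "t" characters, anywhere in the file name. A minimal sketch of the difference, using made-up file names (the names in `candidates` are hypothetical, not taken from the question):

```r
# Hypothetical file names, for illustration only
candidates <- c("netstat_01.txt", "my_netsta.log", "other.txt")

# As a regex, "netstat*" also matches names that merely contain "netsta"
loose <- grepl("netstat*", candidates)      # TRUE TRUE FALSE

# Anchoring the regex restricts the match to names starting with "netstat"
strict <- grepl("^netstat", candidates)     # TRUE FALSE FALSE

# glob2rx() (from utils) translates a shell glob into the equivalent regex
rx <- glob2rx("netstat*")                   # "^netstat"
from_glob <- grepl(rx, candidates)          # TRUE FALSE FALSE
```

If only files whose names begin with "netstat" should be picked up, passing `pattern = "^netstat"` or `pattern = glob2rx("netstat*")` to list.files() is the safer choice.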
Then I use the lapply() function to read the files into a list of data frames:
data1 <- lapply(files, read.table, sep = ":", row.names = 1)
Finally, I convert the list to a data frame and rename the rows:
data2 <- t(do.call(data.frame, data1))
rownames(data2) <- 1:nrow(data2)
The final data looks like this:
> head(data2)
Nodes Edges Average degree Average clustering Modularity Giant component
1 6606 382386 115.769301 0.11213868 0.6021229 6598
2 5157 20292 7.869692 0.07020251 0.8195294 5125
3 5177 20148 7.783658 0.07640135 0.9030172 5102
4 5689 29559 10.391633 0.08480404 0.7104452 5626
5 5985 32086 10.722139 0.06803845 0.7189815 5938
6 5829 26449 9.074970 0.05963236 0.7061715 5770
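My question: is there a more elegant way to do this? In particular, the last command, where I rename the rows by hand, does not feel like elegant R programming.
Answer 0 (score: 1)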
We can use fread to read the files and rbindlist to combine the resulting list of data.tables into a single data.table:
library(data.table)
rbindlist(lapply(files, fread))
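If the goal is the wide layout shown in the question (one row per file, one column per metric), a base R alternative is to turn each file into a named vector and row-bind the vectors, which yields default row names 1, 2, ... without any manual renaming. This is only a sketch under stated assumptions: the temporary directory, the two demo files, and the helper name `read_stats` are hypothetical stand-ins, and every file is assumed to list the same metrics in the same order.

```r
# Hypothetical demo files standing in for the real "test" directory
demo_dir <- tempfile("netstat_demo")
dir.create(demo_dir)
writeLines(c("Nodes: 6606", "Edges: 382386",
             "Average degree: 115.769301", "Modularity: 0.6021229"),
           file.path(demo_dir, "netstat_1.txt"))
writeLines(c("Nodes: 5157", "Edges: 20292",
             "Average degree: 7.869692", "Modularity: 0.8195294"),
           file.path(demo_dir, "netstat_2.txt"))

files <- list.files(demo_dir, pattern = "^netstat", full.names = TRUE)

# Read one "name: value" file and return a named numeric vector
# (read_stats is an illustrative helper, not part of any package)
read_stats <- function(f) {
  x <- read.table(f, sep = ":", col.names = c("metric", "value"),
                  strip.white = TRUE, stringsAsFactors = FALSE)
  setNames(x$value, x$metric)
}

# rbind() stacks the vectors into a matrix; as.data.frame() then assigns
# the default row names 1..n, so no manual renaming is needed
stats <- as.data.frame(do.call(rbind, lapply(files, read_stats)))
print(stats)
```

Column names containing spaces, such as "Average degree", are kept as they are, so they have to be accessed with backticks or with `stats[["Average degree"]]`.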