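How do I read files in a directory in sorted order using R?

Time: 2012-05-27 21:18:26

Tags: r

The code given below works well: it reads the files in my directory and extracts values: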

X <- c(75:85) ; Y <- c(208:215) 
extract <- vector()
files <- list.files("C:\\New folder (10)", "*.img",full.names=TRUE)

}

I tried to specify the file order by using sprintf, but I got an error. Please help:

for (i in c(1:365)) {
   fileName <- sprintf("C:New folder (10)/Climate_Rad_%d.img", i)
}

4 Answers:

Answer 0: (score: 5)

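Ah, OK, I see: the problem is the sort order. The files are sorted alphabetically. Alphabetically, Climate_Rad_1 is followed by Climate_Rad_10, not by Climate_Rad_2. The order is not "random" at all; it is correctly alphabetical.

However, you want to process Climate_Rad_2 before Climate_Rad_10, not after it. There are several ways to solve this. First, note that Climate_Rad_002 sorts alphabetically before Climate_Rad_010, so if you add leading zeros when you generate the files, processing them in numeric order later becomes easy.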
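As a minimal sketch of the leading-zero idea, assuming you control how the file names are generated: the width of 3 and the sample numbers below are assumptions chosen for illustration, and `method = "radix"` is used only so that the result does not depend on the system locale.

```r
# Sample numbers (an assumption for this demo); the real ones would run 1:365.
ids <- c(1L, 2L, 10L, 100L)

# "%03d" pads each number with leading zeros to a width of 3
padded   <- sprintf("Climate_Rad_%03d.img", ids)
unpadded <- sprintf("Climate_Rad_%d.img", ids)

# method = "radix" sorts in the C locale, independent of the system locale
padded.sorted   <- sort(padded, method = "radix")    # stays in numeric order
unpadded.sorted <- sort(unpadded, method = "radix")  # 10 and 100 land before 2

padded.sorted
unpadded.sorted
```

With padded names, a plain alphabetical sort is already the numeric order, so nothing else has to change in the reading loop.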

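Alternatively, suppose you cannot add zeros when creating the files. Then there are at least two ways to access the files in the right order: add the zeros to the file names afterwards, or simply sort on the numeric part of the file names.

Let me show you the latter.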

myFiles <- paste("Climate_Rad_", c(1:15, 95:110), ".img", sep = "") # create some test names, you get the actual myFiles through a call to list.files()

myFiles.sorted <- sort(myFiles) # this gives the alphabetic sorting, not what you want

> myFiles.sorted
 [1] "Climate_Rad_1.img"   "Climate_Rad_10.img"  "Climate_Rad_100.img"
 [4] "Climate_Rad_101.img" "Climate_Rad_102.img" "Climate_Rad_103.img"
 [7] "Climate_Rad_104.img" "Climate_Rad_105.img" "Climate_Rad_106.img"
[10] "Climate_Rad_107.img" "Climate_Rad_108.img" "Climate_Rad_109.img"
[13] "Climate_Rad_11.img"  "Climate_Rad_110.img" "Climate_Rad_12.img" 
[16] "Climate_Rad_13.img"  "Climate_Rad_14.img"  "Climate_Rad_15.img" 
[19] "Climate_Rad_2.img"   "Climate_Rad_3.img"   "Climate_Rad_4.img"  
[22] "Climate_Rad_5.img"   "Climate_Rad_6.img"   "Climate_Rad_7.img"  
[25] "Climate_Rad_8.img"   "Climate_Rad_9.img"   "Climate_Rad_95.img" 
[28] "Climate_Rad_96.img"  "Climate_Rad_97.img"  "Climate_Rad_98.img" 
[31] "Climate_Rad_99.img" 

# split off the part that comes before the numerics, leaving "1.img" etc. -- adjust the prefix appropriately
parts <- strsplit(myFiles.sorted, "Climate_Rad_", fixed = TRUE)
# strip the ".img" extension so that only the numeric part is left, then convert it to numeric
# (fixed = TRUE makes the dot a literal dot instead of a regex wildcard)
nums <- as.numeric(sapply(parts, function(x) sub(".img", "", x[2], fixed = TRUE)))
# now you can sort: order() gives the original file names, ordered on the numeric part of the file name
myFiles.correct.order <- myFiles.sorted[order(nums)]

> myFiles.correct.order
 [1] "Climate_Rad_1.img"   "Climate_Rad_2.img"   "Climate_Rad_3.img"  
 [4] "Climate_Rad_4.img"   "Climate_Rad_5.img"   "Climate_Rad_6.img"  
 [7] "Climate_Rad_7.img"   "Climate_Rad_8.img"   "Climate_Rad_9.img"  
[10] "Climate_Rad_10.img"  "Climate_Rad_11.img"  "Climate_Rad_12.img" 
[13] "Climate_Rad_13.img"  "Climate_Rad_14.img"  "Climate_Rad_15.img"   
[16] "Climate_Rad_95.img"  "Climate_Rad_96.img"  "Climate_Rad_97.img" 
[19] "Climate_Rad_98.img"  "Climate_Rad_99.img"  "Climate_Rad_100.img"
[22] "Climate_Rad_101.img" "Climate_Rad_102.img" "Climate_Rad_103.img"
[25] "Climate_Rad_104.img" "Climate_Rad_105.img" "Climate_Rad_106.img"
[28] "Climate_Rad_107.img" "Climate_Rad_108.img" "Climate_Rad_109.img"
[31] "Climate_Rad_110.img"

This gives you the files in the order you are looking for. Now process the files in that order, for example with

for (fileNames in myFiles.correct.order) {READ.IN.AND.DO.YOUR.THING}

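That should do it. Make sure to adjust "Climate_Rad_" and ".img" to match your file names (you may also need to put the path in front of "Climate_Rad_", so that it looks like "C:/filefolder/Climate_Rad_", if needed).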
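The other option mentioned above, adding the zeros afterwards, could be sketched roughly as follows. The names `num`, `width` and `padded` are made up for this illustration, and the regular expression assumes every name ends in an underscore, a number and ".img".

```r
# start from the alphabetically sorted test names used above
myFiles <- sort(paste("Climate_Rad_", c(1:15, 95:110), ".img", sep = ""),
                method = "radix")

# pull out the run of digits just before ".img"
num <- as.integer(sub("^.*_([0-9]+)\\.img$", "\\1", myFiles))

# rebuild each name with the number zero-padded to a common width
width  <- max(nchar(num))
padded <- paste("Climate_Rad_", formatC(num, width = width, flag = "0"),
                ".img", sep = "")

# the padded names sort alphabetically in numeric order, so their order
# can be used to index the original names
myFiles.correct.order <- myFiles[order(padded, method = "radix")]
head(myFiles.correct.order, 12)

# to rename the files on disk instead, something like
# file.rename(myFiles, padded) could be used (not run here)
```

Renaming on disk is a one-off cost that makes every later `list.files()` call come back in the right order; indexing through `order()` leaves the files untouched.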

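Answer 1: (score: 3)

Why not retrieve all the files with list.files() (using a specific pattern) and then sort them? Then take the files from the sorted vector, which gives them to you in sorted order. This has the advantage that it also works when numbers are missing from the 1:365 sequence.

Something like: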

myFiles <- list.files(pattern = "^Climate_Rad_") #all files starting with Climate_
myFiles <- sort(myFiles)
# then read them in, for instance through
for (fileNames in myFiles) {READ.IN.AND.DO.YOUR.MAGIC.ON.THEM}
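Because a plain sort() is alphabetical, this approach can be combined with ordering on the embedded number. The following is only a sketch on a throwaway directory: `demo_dir`, the sample day numbers and the empty files are assumptions made up for the demonstration. Note that the pattern argument of list.files() is a regular expression, so the dot is escaped here.

```r
# hypothetical demo directory with some day numbers missing
demo_dir <- file.path(tempdir(), "climate_demo")
dir.create(demo_dir, showWarnings = FALSE)
days <- c(1L, 2L, 10L, 11L, 100L)
invisible(file.create(file.path(demo_dir, paste0("Climate_Rad_", days, ".img"))))

# pattern is a regular expression: escape the dot and anchor both ends
myFiles <- list.files(demo_dir, pattern = "^Climate_Rad_[0-9]+\\.img$",
                      full.names = TRUE)

# order on the number in the base name rather than on the whole string
num <- as.numeric(sub("^Climate_Rad_([0-9]+)\\.img$", "\\1", basename(myFiles)))
myFiles <- myFiles[order(num)]
basename(myFiles)

# remove the demo directory again
unlink(demo_dir, recursive = TRUE)
```

Since the names come from what is actually on disk, gaps in the sequence need no special handling.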

Answer 2: (score: 3)

The gtools package has a function "mixedsort" that can help you.

The example myFiles is taken from Peter Verbeet's answer.

myFiles <- paste("Climate_Rad_", c(1:15, 95:110), ".img", sep = "") 

install.packages("gtools")  # only needed once
library(gtools)

mixedsort(myFiles)

 [1] "Climate_Rad_1.img"   "Climate_Rad_2.img"  
 [3] "Climate_Rad_3.img"   "Climate_Rad_4.img"  
 [5] "Climate_Rad_5.img"   "Climate_Rad_6.img"  
 [7] "Climate_Rad_7.img"   "Climate_Rad_8.img"  
 [9] "Climate_Rad_9.img"   "Climate_Rad_10.img" 
[11] "Climate_Rad_11.img"  "Climate_Rad_12.img" 
[13] "Climate_Rad_13.img"  "Climate_Rad_14.img" 
[15] "Climate_Rad_15.img"  "Climate_Rad_95.img" 
[17] "Climate_Rad_96.img"  "Climate_Rad_97.img" 
[19] "Climate_Rad_98.img"  "Climate_Rad_99.img" 
[21] "Climate_Rad_100.img" "Climate_Rad_101.img"
[23] "Climate_Rad_102.img" "Climate_Rad_103.img"
[25] "Climate_Rad_104.img" "Climate_Rad_105.img"
[27] "Climate_Rad_106.img" "Climate_Rad_107.img"
[29] "Climate_Rad_108.img" "Climate_Rad_109.img"
[31] "Climate_Rad_110.img"

Answer 3: (score: 1)

As I mentioned in the comment above, you can use paste to simplify your code. Try:

# set the working directory (same folder as in the question)
setwd("C:\\New folder (10)")

# construct a vector of file names and loop through
for (f in paste("Climate_Rad_", 1:365, ".img", sep = "")) {
    conne <- file(f, "rb")
    # do rest, assuming it's correct
    close(conne)  # close the connection once the file has been read
}
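One caveat with constructing the names yourself is that the loop stops with an error if a number is missing. A minimal sketch of guarding against that with file.exists() is given below; the demo directory, the tiny files written with writeBin(), and the `size = 4` and `n = 4` arguments are all assumptions for illustration and not the layout of the real .img files.

```r
# hypothetical demo data: days 1 and 3 exist, day 2 is deliberately missing
demo_dir <- file.path(tempdir(), "climate_read_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (d in c(1, 3)) {
    writeBin(as.numeric(d * 1:4),
             file.path(demo_dir, paste("Climate_Rad_", d, ".img", sep = "")),
             size = 4)
}

means <- rep(NA_real_, 3)
for (i in 1:3) {
    f <- file.path(demo_dir, paste("Climate_Rad_", i, ".img", sep = ""))
    if (!file.exists(f)) next  # leave NA for a missing day
    conne <- file(f, "rb")
    # size and n match the demo files written above, not the real data
    values <- readBin(conne, double(), size = 4, n = 4)
    close(conne)
    means[i] <- mean(values)
}
means

# remove the demo directory again
unlink(demo_dir, recursive = TRUE)
```

Indexing the result by day number keeps each value aligned with its day even when some files are absent.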