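I have about 100 folders, each containing files that need to be read and analyzed.
I can already read the files from the subfolders, but I want to start processing at, for example, the 10th folder and continue to the end. I also need the exact folder path.
How can I do this?
To clarify my question, here is an example extracted from my code: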
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
The output is:
D:/PhD/result/Pyradiomic_input/TCGA-02-0006
D:/PhD/result/Pyradiomic_input/TCGA-02-0009
D:/PhD/result/Pyradiomic_input/TCGA-02-0011
D:/PhD/result/Pyradiomic_input/TCGA-02-0027
D:/PhD/result/Pyradiomic_input/TCGA-02-0046
D:/PhD/result/Pyradiomic_input/TCGA-02-0069
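Now my question is: how can I start from, for example, D:/PhD/result/Pyradiomic_input/TCGA-02-0046
and continue to the end, instead of starting from the beginning? I tried a few ideas, but none of them worked.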
Answer 0: (score: 1)
You can set a flag that is switched on when a specific directory is hit:
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
first_folder = 'TCGA-02-0046'
process = False  # becomes True once first_folder is reached
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
    if first_folder in path:
        process = True
    if process:
        pass  # process folder
If you want a specific folder to indicate where the script should stop processing:
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
first_folder = 'TCGA-02-0046'
last_folder = 'TCGA-02-0099'
process = False  # becomes True once first_folder is reached
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
    if first_folder in path:
        process = True
    if last_folder in path:
        break  # stops before last_folder itself is processed
    if process:
        pass  # process folder
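The start and stop logic above can also be written with `itertools.dropwhile`. This is only a minimal sketch: `select_range` is a hypothetical helper name, and the mock list stands in for the paths that `os.walk` printed in the question. Unlike the loop above, this version includes the stop folder in the result. Note also that `os.walk` does not promise any particular folder order, so any "start here" approach assumes the paths arrive in the order you expect.

```python
import itertools


def select_range(paths, first_folder, last_folder=None):
    """Yield paths starting at the first one that contains first_folder.

    If last_folder is given, stop right after the first path that
    contains it (the stop folder is included in the output).
    """
    # Skip everything until first_folder shows up in a path
    started = itertools.dropwhile(lambda p: first_folder not in p, paths)
    for path in started:
        yield path
        if last_folder is not None and last_folder in path:
            break


# Mock paths standing in for the os.walk output shown in the question
paths = [
    "D:/PhD/result/Pyradiomic_input/TCGA-02-0006",
    "D:/PhD/result/Pyradiomic_input/TCGA-02-0009",
    "D:/PhD/result/Pyradiomic_input/TCGA-02-0011",
    "D:/PhD/result/Pyradiomic_input/TCGA-02-0027",
    "D:/PhD/result/Pyradiomic_input/TCGA-02-0046",
    "D:/PhD/result/Pyradiomic_input/TCGA-02-0069",
]

from_start = list(select_range(paths, "TCGA-02-0046"))
bounded = list(select_range(paths, "TCGA-02-0009", "TCGA-02-0027"))

for path in from_start:
    print(path)
```

With real data, the first argument could be a generator such as `(path for path, subdirs, files in os.walk(rootDir))`.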
You can also set up a list of the directories to process:
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
process_dirs = ['TCGA-02-0046', ...]  # ... stands for the remaining folder names
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
    if any(d in path for d in process_dirs):
        pass  # process folder
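If the folders to process are all direct children of the root directory, another option is to list them, sort them, and slice from the starting folder onward, which also gives the exact path of each folder. The sketch below assumes exactly that layout and that sorted name order is the order you want; `folders_from` is a hypothetical helper, and the demo runs on a temporary directory rather than on the real data.

```python
import os
import tempfile


def folders_from(root_dir, first_folder):
    """Return full paths of the direct subfolders of root_dir, sorted by
    name, starting at first_folder and running to the end."""
    with os.scandir(root_dir) as entries:
        names = sorted(entry.name for entry in entries if entry.is_dir())
    start = names.index(first_folder)  # raises ValueError if it is missing
    return [os.path.join(root_dir, name) for name in names[start:]]


# Demo on a throwaway directory tree (folder names taken from the question)
with tempfile.TemporaryDirectory() as root:
    for name in ["TCGA-02-0046", "TCGA-02-0006", "TCGA-02-0069", "TCGA-02-0009"]:
        os.mkdir(os.path.join(root, name))
    selected = folders_from(root, "TCGA-02-0046")
    selected_names = [os.path.basename(p) for p in selected]
    print(selected_names)
```

To start at the 10th folder by position instead of by name, the same idea works with `names[9:]` in place of `names[start:]`.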
Answer 1: (score: 0)
You can simply skip the values you don't need. Here is a somewhat simplified example:
counter = 0
# mocking the file operations
for path in ["/dir-1", "/dir-2", "/dir-3", "/dir-4", "/dir-5"]:
    # skip the first two paths
    if counter < 2:
        counter += 1
        continue
    # do something
    print(path)
Alternatively, you can collect the paths first, like this:
paths = []
# mocking the file operations
for path in ["/dir-1", "/dir-2", "/dir-3", "/dir-4", "/dir-5"]:
    # collect paths in a list
    paths.append(path)
# skip the first two elements
paths = paths[2:]
for path in paths:
    # do something
    print(path)
The second version could be a bit shorter with a generator expression, but I prefer readability.
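For comparison, here is roughly what the shorter form might look like. This is a sketch on the same mock paths, not part of the original answer: the first variant is a generator expression over `enumerate`, and the second uses `itertools.islice`, which is a different standard-library tool that skips leading items of any iterable without building a list first.

```python
from itertools import islice

# mocking the file operations
paths = ["/dir-1", "/dir-2", "/dir-3", "/dir-4", "/dir-5"]

# Variant 1: generator expression that keeps items from index 2 onward
via_genexpr = list(p for i, p in enumerate(paths) if i >= 2)

# Variant 2: islice skips the first two items of any iterable
via_islice = list(islice(paths, 2, None))

for path in via_islice:
    # do something
    print(path)
```

`islice` also accepts a generator such as the one produced by `os.walk`, so the skipped folders never have to be collected.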