Recursively scan files and delete empty directories in Python

Time: 2014-02-24 23:15:01

Tags: python file recursion directory delete-file

I have the following structure:

Dir 1
|___Dir 2
   |___file 1
   |___file 2...
Dir 3
|___Dir 4
   |___file 3...

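I want to find every file recursively, process each file in my own way, delete it once that is done, and move on to the next one. Then, if a directory is empty, delete it as well, working upward until nothing is left.

I'm not sure how to proceed.

This is what I have so far: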

for root, dirs, files in os.walk(dir):
    path = root.split('/')
    for file in files:
        file = os.path.join(root, file)
        process_file(file)
        os.remove(file)

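This works well, but I also want to delete the subdirectories, provided they are empty.

6 answers:

Answer 0: (score: 5)

Well, I guess this will do; it has to run through os.walk, though...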

import os


def get_files(src_dir):
    # Traverse the tree; `process` is the caller's own file-handling function
    for root, dirs, files in os.walk(src_dir):
        for file in files:
            process(os.path.join(root, file))
            os.remove(os.path.join(root, file))

def del_dirs(src_dir):
    for dirpath, _, _ in os.walk(src_dir, topdown=False):  # bottom-up: children before parents
        if dirpath == src_dir:
            break
        try:
            os.rmdir(dirpath)
        except OSError as ex:
            print(ex)


def main():
    # src_dir is not defined in this answer; set it to the directory to process
    get_files(src_dir)
    del_dirs(src_dir)


if __name__ == "__main__":
    main()
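
The two passes above can also be folded into a single bottom-up walk. The following is a minimal sketch under that assumption, not the answerer's code: the function name `process_and_prune` and the `process_file` callback are hypothetical stand-ins for the question's own processing step, and the starting directory is deliberately kept.

```python
import os


def process_and_prune(top, process_file):
    """Process and delete every file under `top`, then remove emptied directories.

    `process_file` is a hypothetical callback standing in for the question's
    own processing step; `top` itself is kept.
    """
    # topdown=False yields the deepest directories first, so children are
    # handled (and possibly removed) before their parents are reached.
    for root, _dirs, files in os.walk(top, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            process_file(path)
            os.remove(path)
        if root == top:
            continue  # keep the starting directory
        try:
            os.rmdir(root)  # only succeeds if the directory is now empty
        except OSError as ex:
            print("kept {}: {}".format(root, ex))
```

Calling `process_and_prune("/path/to/root", my_handler)` would leave only the (now empty) root behind, assuming every file can be processed and removed.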

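Answer 1: (score: 4)

I realize this post is old and there may be little point in adding another example, but at a glance I thought this one would be easier for beginners to grasp than some of the others: it uses no join, it imports only one module, and it gives good examples of how to use a couple of built-in functions [open() & len()] and the newer Python 3 string formatting with str.format. It also shows how simple it is to write content to a file with the print() function using file = filename.

This script scans a root directory with os.walk(), checks the lengths of the subdirectory and file lists, and acts on what it finds. It also increments counters to track how many directories are used and how many are empty, and it writes that information to a file. I wrote this example in Python 3.4, and it works for my purposes. If anyone has ideas for improving the logic, please post them in the comments so we can all learn a new perspective on solving the problem.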

import os
#declare the root directory
root_dir = 'C:\\tempdir\\directory\\directory\\'
#initialize the counters
empty_count = 0
used_count = 0
#Set the file to write to. 'x' will indicate to create a new file and open it for writing
outfile = open('C:\\tempdir\\directories.txt', 'x')
for curdir, subdirs, files in os.walk(root_dir):
    if len(subdirs) == 0 and len(files) == 0: #check for empty directories. len(files) == 0 may be overkill
        empty_count += 1 #increment empty_count
        print('Empty directory: {}'.format(curdir), file = outfile) #add empty results to file
        os.rmdir(curdir) #delete the directory
    elif len(subdirs) > 0 or len(files) > 0: #check for used directories (anything that is not empty)
        used_count += 1 #increment used_count
        print('Used directory: {}'.format(curdir), file = outfile) #add used results to file

#add the counters to the file
print('empty_count: {}\nused_count: {}'.format(empty_count, used_count), file = outfile) 
outfile.close() #close the file
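
Because the loop above walks top-down, a directory whose only contents are empty subdirectories is not removed in the same run: it still has children when it is visited. A minimal sketch of a bottom-up variant follows. The function name `remove_empty_dirs` and the decision to keep `root_dir` itself are assumptions made for illustration, and the report file is left out to keep it short.

```python
import os


def remove_empty_dirs(root_dir):
    """Delete empty directories below `root_dir`; return (empty_count, used_count)."""
    empty_count = 0
    used_count = 0
    # Bottom-up order: a parent is visited only after all of its children.
    for curdir, _subdirs, _files in os.walk(root_dir, topdown=False):
        # The subdirs list was built before the children were removed, so
        # re-read the directory instead of trusting it.
        if os.listdir(curdir):
            used_count += 1
        elif curdir != root_dir:
            os.rmdir(curdir)
            empty_count += 1
    return empty_count, used_count
```

With this ordering a chain such as a/b/c of empty directories collapses in a single pass, since each parent is checked again after its child has been removed.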

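Answer 2: (score: 1)

Here is another solution that I think is efficient. Of course, efficiency could be improved further by using os.scandir.

First, I define a generic rec_rmdir function (recursive rmdir) that walks the directory tree recursively.

  • The function first processes every file and every subdirectory.
  • Then it tries to remove the current directory.
  • The preserve flag is used to keep the root directory.

The algorithm is a classic depth-first search.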

import os
import stat


def rec_rmdir(root, callback, preserve=True):
    for path in (os.path.join(root, p) for p in os.listdir(root)):
        st = os.stat(path)
        if stat.S_ISREG(st.st_mode):
            callback(path)
        elif stat.S_ISDIR(st.st_mode):
            rec_rmdir(path, callback, preserve=False)
    if not preserve:
        try:
            os.rmdir(root)
        except OSError:  # os.rmdir raises OSError, e.g. when the directory is not empty
            pass

Then it is easy to define a function that processes a file and removes it.

def process_file_and_remove(path):
    # process the file
    # ...
    os.remove(path)

Typical usage:

rec_rmdir("/path/to/root", process_file_and_remove)
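
The answer mentions that os.scandir could make this more efficient. A rough sketch of what that might look like is shown below, assuming Python 3.6 or later (where os.scandir can be used as a context manager). The name `rec_rmdir_scandir` is hypothetical, and unlike the os.stat version above it does not follow symbolic links.

```python
import os


def rec_rmdir_scandir(root, callback, preserve=True):
    """Depth-first walk with os.scandir, mirroring the rec_rmdir idea."""
    # Materialise the listing first so the callback can delete files
    # without disturbing an iterator that is still open.
    with os.scandir(root) as it:
        entries = list(it)
    for entry in entries:
        # DirEntry caches the file type, which usually avoids an extra stat call.
        if entry.is_dir(follow_symlinks=False):
            rec_rmdir_scandir(entry.path, callback, preserve=False)
        elif entry.is_file(follow_symlinks=False):
            callback(entry.path)
    if not preserve:
        try:
            os.rmdir(root)
        except OSError:
            pass  # directory not empty (or not removable); leave it
```

It is called the same way as before, for example `rec_rmdir_scandir("/path/to/root", process_file_and_remove)`.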

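Answer 3: (score: 0)

This only removes empty directories and pulls the lone file out of directories that contain a single file. It seems to answer only part of the question, sorry.

I added a loop at the end that keeps retrying until no more directories are found. I made the function return the number of directories removed.

My access-denied errors were fixed by: shutil.rmtree fails on Windows with 'Access is denied'
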
import os
import shutil


def onerror(func, path, exc_info):
    """
    Error handler for ``shutil.rmtree``.

    If the error is due to an access error (read only file)
    it attempts to add write permission and then retries.

    If the error is for another reason it re-raises the error.

    Usage : ``shutil.rmtree(path, ignore_errors=False, onerror=onerror)``
    """
    import stat

    if not os.access(path, os.W_OK):
        # Is the error an access error ?
        os.chmod(path, stat.S_IWUSR)
        func(path)
    else:
        raise


def get_empty_dirs(path):
    # count of removed directories
    count = 0
    # traverse root directory, and list directories as dirs and files as files
    for root, dirs, files in os.walk(path):
        try:
            # if a directory is empty there will be no sub-directories or files
            if len(dirs) == 0 and len(files) == 0:
                print u"deleting " + root
                # os.rmdir(root)
                shutil.rmtree(root, ignore_errors=False, onerror=onerror)
                count += 1
            # if a directory has one file lets pull it out.
            elif len(dirs) == 0 and len(files) == 1:
                print u"moving " + os.path.join(root, files[0]) + u" to " + os.path.dirname(root)
                shutil.move(os.path.join(root, files[0]), os.path.dirname(root))
                print u"deleting " + root
                # os.rmdir(root)
                shutil.rmtree(root, ignore_errors=False, onerror=onerror)
                count += 1
        except WindowsError, e:
            # I'm getting access denied errors when removing directory.
            print e
        except shutil.Error, e:
            # The path you're moving to already exists
            print e
    return count


def get_all_empty_dirs(path):
    # loop till break
    total_count = 0
    while True:
        # count of removed directories
        count = get_empty_dirs(path)
        total_count += count
        # if no removed directories you are done.
        if count >= 1:
            print u"retrying till count is 0, currently count is: %d" % count
        else:
            break

    print u"Total directories removed: %d" % total_count
    return total_count


count = get_all_empty_dirs(os.getcwdu())  # current directory
count += get_all_empty_dirs(u"o:\\downloads\\")  # other directory
print u"Total of all directories removed: %d" % count
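
The code above is written for Python 2 (print statements, `except X, e`, os.getcwdu). For readers on Python 3, the read-only retry idea can be sketched roughly as follows. The names `remove_readonly` and `force_rmtree` are hypothetical; the sketch assumes the failure is caused by a read-only attribute, and it picks the `onexc` keyword on Python 3.12 and later because `onerror` is deprecated there.

```python
import os
import shutil
import stat
import sys


def remove_readonly(func, path, _exc):
    """Clear the read-only bit on `path`, then retry the failed operation."""
    os.chmod(path, stat.S_IWRITE)
    func(path)


def force_rmtree(path):
    """Remove a tree, retrying entries that fail because they are read-only."""
    # The handler ignores its third argument, so the same function works
    # with either keyword.
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=remove_readonly)
    else:
        shutil.rmtree(path, onerror=remove_readonly)
```

If the retry fails as well, the exception propagates to the caller, so errors with other causes are not silently swallowed.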

Answer 4: (score: 0)

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "server.settings")
from whitenoise.django import DjangoWhiteNoise

Nice and simple. The key is to use os.removedirs inside a try statement. It is already recursive.
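
The code shown with this answer does not call os.removedirs, so here is a minimal sketch of what the description appears to suggest. The function name `prune_upwards` is hypothetical, and this is an illustration of the idea rather than the answerer's original code.

```python
import os


def prune_upwards(leaf_dir):
    """Remove `leaf_dir`, then each parent in the path, stopping at the first non-empty one.

    Returns True if the leaf directory was removed, False otherwise.
    """
    try:
        os.removedirs(leaf_dir)
        return True
    except OSError:
        # Raised when leaf_dir itself is not empty or cannot be removed.
        return False
```

Note that os.removedirs keeps climbing through the parents named in the path until it reaches one that is not empty, so with an absolute path it can remove directories above the tree you meant to clean.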

Answer 5: (score: 0)

Looks like I'm late to the party. Nevertheless, here is another solution that may help beginners.

Imports

import os

from contextlib import suppress

Include the appropriate functionality for processing files

# Loop for processing files
for root, _, files in os.walk(dir):
    path = root.split('/')
    for file in files:
        file = os.path.join(root, file)

        # Assuming process_file() returns True on success
        if process_file(file):
            os.remove(file)

Include the appropriate functionality for deleting empty directories

# Loop for deleting empty directories
for root, _, _ in os.walk(dir):
    # Ignore directory-not-empty errors; nothing can be done about them if we want
    # to retain files that failed to be processed. The entire deletion is
    # hence silent.
    with suppress(OSError):
        os.removedirs(root)
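
One thing to be aware of: os.removedirs also removes the parent directories named in the path once they are empty, so the loop above may remove the starting directory itself and, for an absolute path, its empty ancestors. A minimal sketch that stays inside the tree is given below. The function name `process_tree` and the `process_file` callback (assumed to return True on success, as in the answer) are hypothetical.

```python
import os
from contextlib import suppress


def process_tree(top, process_file):
    """Delete successfully processed files, then prune empty directories below `top`.

    Returns the paths of files that `process_file` did not handle.
    """
    failed = []
    # Bottom-up so that a directory is checked after its children.
    for root, _dirs, files in os.walk(top, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            if process_file(path):
                os.remove(path)
            else:
                failed.append(path)
        if root != top:
            # rmdir refuses non-empty directories; suppress that and move on.
            with suppress(OSError):
                os.rmdir(root)
    return failed
```

Directories that still hold unprocessed files are left in place, and the returned list makes it easy to retry or report them.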