Writing a file atomically with Python

Asked: 2010-02-25 12:21:42

Tags: python file-io atomic

I am using Python to write chunks of text to files in a single operation:

open(file, 'w').write(text)

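If the script is interrupted so that a file write does not complete, I want to have no file rather than a partially complete file. Can this be done?

7 answers:

Answer 0 (score: 84)

Write the data to a temporary file and, once the data has been written successfully, rename the file to the correct destination file, e.g.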

import os

# tmpFile should live on the same filesystem as myFile
f = open(tmpFile, 'w')
f.write(text)
# make sure that all data is on disk
# see http://stackoverflow.com/questions/7433057/is-rename-without-fsync-safe
f.flush()
os.fsync(f.fileno())
f.close()

# move the finished file into place
os.rename(tmpFile, myFile)

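According to the documentation, http://docs.python.org/library/os.html#os.rename

  If successful, the renaming will be an atomic operation (this is a POSIX requirement). On Windows, if dst already exists, OSError will be raised even if it is a file; there may be no way to implement an atomic rename when dst names an existing file.

  The operation may fail on some Unix flavors if src and dst are on different filesystems.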
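The filesystem caveat can be checked roughly before renaming. The sketch below compares the device IDs reported by os.stat; the helper name on_same_filesystem is hypothetical, and the check is only a heuristic (some setups, such as bind mounts, may report the same device ID and still refuse a rename).

```python
import os


def on_same_filesystem(path_a, path_b):
    """Return True if both existing paths report the same device ID.

    Hypothetical helper: a heuristic, not a guarantee that os.rename
    between the two locations will succeed.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

Creating the temporary file in the destination's own directory sidesteps the question, which is why the later answers do exactly that.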

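Note:

  • It may not be an atomic operation if the src and dst locations are not on the same filesystem.

  • The os.fsync step may be skipped if performance/responsiveness is more important than data integrity in cases like power failure, system crash, etc.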
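On Python 3.3 and later, os.replace can stand in for os.rename: it overwrites an existing destination on both POSIX and Windows, which addresses the Windows caveat quoted above. The following is a minimal sketch under that assumption; the function name write_text_atomically and its fsync parameter are made up for illustration and are not part of this answer.

```python
import os
import tempfile


def write_text_atomically(path, text, fsync=True):
    """Write text to path through a temporary file in the same directory.

    Hypothetical helper, assuming Python 3.3+ for os.replace.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file exclusively, so the name cannot collide.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            if fsync:
                # Push the data to disk before the rename makes it visible.
                f.flush()
                os.fsync(f.fileno())
        # os.replace overwrites an existing destination.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave no partial file behind, then re-raise the original error.
        try:
            os.remove(tmp_path)
        except OSError:
            pass
        raise
```

Note that mkstemp creates the file readable and writable only by the owner, so the destination ends up with those permissions unless they are adjusted before the replace.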

Answer 1 (score: 16)

A simple snippet that implements atomic writing using Python tempfile.

with open_atomic('test.txt', 'w') as f:
    f.write("huzza")

Or even reading from and writing to the same file:

with open('test.txt', 'r') as src:
    with open_atomic('test.txt', 'w') as dst:
        for line in src:
            dst.write(line)

using two simple context managers:

import os
import tempfile as tmp
from contextlib import contextmanager

@contextmanager
def tempfile(suffix='', dir=None):
    """ Context for temporary file.

    Will find a free temporary filename upon entering
    and will try to delete the file on leaving, even in case of an exception.

    Parameters
    ----------
    suffix : string
        optional file suffix
    dir : string
        optional directory to save temporary file in
    """

    tf = tmp.NamedTemporaryFile(delete=False, suffix=suffix, dir=dir)
    tf.file.close()
    try:
        yield tf.name
    finally:
        try:
            os.remove(tf.name)
        except OSError as e:
            if e.errno == 2:  # errno.ENOENT: the file is already gone
                pass
            else:
                raise

@contextmanager
def open_atomic(filepath, *args, **kwargs):
    """ Open temporary file object that atomically moves to destination upon
    exiting.

    Allows reading and writing to and from the same filename.

    The file will not be moved to destination in case of an exception.

    Parameters
    ----------
    filepath : string
        the file path to be opened
    fsync : bool
        whether to force write the file to disk
    *args : mixed
        Any valid arguments for :code:`open`
    **kwargs : mixed
        Any valid keyword arguments for :code:`open`
    """
    fsync = kwargs.pop('fsync', False)  # pop so 'fsync' is not forwarded to open()

    with tempfile(dir=os.path.dirname(os.path.abspath(filepath))) as tmppath:
        with open(tmppath, *args, **kwargs) as file:
            try:
                yield file
            finally:
                if fsync:
                    file.flush()
                    os.fsync(file.fileno())
        os.rename(tmppath, filepath)
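The property that matters here is that an exception inside the with block leaves the destination untouched. The sketch below shows that behaviour with a compact stand-in context manager; atomic_open and demo are hypothetical names, not the code from this answer, and os.replace (Python 3.3+) is used in place of os.rename so that overwriting also works on Windows.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_open(path, mode="w"):
    """Hypothetical stand-in: yield a temp file, move it into place on success."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp = tempfile.NamedTemporaryFile(mode, dir=directory, delete=False)
    try:
        with tmp:
            yield tmp
        # Reached only if the caller's block finished without an exception.
        os.replace(tmp.name, path)
    except BaseException:
        # Discard the partial temp file and re-raise.
        try:
            os.remove(tmp.name)
        except OSError:
            pass
        raise


def demo(directory):
    """Write a file, then fail halfway through a rewrite of it."""
    target = os.path.join(directory, "test.txt")
    with atomic_open(target) as f:
        f.write("original")
    try:
        with atomic_open(target) as f:
            f.write("partial")
            raise RuntimeError("simulated interruption")
    except RuntimeError:
        pass
    with open(target) as f:
        return f.read(), sorted(os.listdir(directory))
```

Calling demo on an empty directory should return the first write's content and a listing that contains only test.txt, since the failed rewrite is discarded along with its temporary file.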

Answer 2 (score: 6)

There is a simple AtomicFile helper: https://github.com/sashka/atomicfile

Answer 3 (score: 5)

I'm using this code to atomically replace/write a file:

import os
from contextlib import contextmanager

@contextmanager
def atomic_write(filepath, binary=False, fsync=False):
    """ Writeable file object that atomically updates a file (using a temporary file).

    :param filepath: the file path to be opened
    :param binary: whether to open the file in a binary mode instead of textual
    :param fsync: whether to force write the file to disk
    """

    tmppath = filepath + '~'
    while os.path.isfile(tmppath):
        tmppath += '~'
    try:
        with open(tmppath, 'wb' if binary else 'w') as file:
            yield file
            if fsync:
                file.flush()
                os.fsync(file.fileno())
        os.rename(tmppath, filepath)
    finally:
        try:
            os.remove(tmppath)
        except (IOError, OSError):
            pass

Usage:

with atomic_write('path/to/file') as f:
    f.write("allons-y!\n")

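It is based on this recipe.

Answer 4 (score: 5)

Since it is very easy to mess up the details, I recommend using a tiny library for this. The advantage of a library is that it takes care of all these nitty-gritty details and is reviewed and improved by a community.

One such library is python-atomicwrites by untitaker, which even has proper Windows support:

From the README: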

from atomicwrites import atomic_write

with atomic_write('foo.txt', overwrite=True) as f:
    f.write('Hello world.')
    # "foo.txt" doesn't exist yet.

# Now it does.

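Answer 5 (score: 0)

The answers on this page are quite old; there are now libraries that do this for you.

In particular, safer is a library designed to help prevent programmer error from corrupting files, socket connections, or generalized streams. It is quite flexible: among other things, it can use either memory or temporary files, and it can even keep the temporary files in case of failure.

Their example is just what you want: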

# dangerous
with open(filename, 'w') as fp:
    json.dump(data, fp)
    # If an exception is raised, the file is empty or partly written
# safer
with safer.open(filename, 'w') as fp:
    json.dump(data, fp)
    # If an exception is raised, the file is unchanged.

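It's on PyPI; just install it using pip install --user safer, or get the latest version at https://github.com/rec/safer

Answer 6 (score: -1)

An atomic solution for Windows that loops over a folder and renames files. It has been tested. To reduce the risk of two files ending up with the same name, you can widen the random range. The letter part of the name comes from the random library's random.choice method, and the digit part from str(random.randrange(50, 999999999, 2)). You can vary the digit range as you want.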

import os
import random

path = "C:\\Users\\ANTRAS\\Desktop\\NUOTRAUKA\\"

def renamefiles():
    files = os.listdir(path)
    i = 1
    for file in files:
        os.rename(os.path.join(path, file), os.path.join(path, 
                  random.choice('ABCDEFGHIJKL') + str(i) + str(random.randrange(31,9999999,2)) + '.jpg'))
        i = i+1

for x in range(30):
    renamefiles()