Filesystem-backed data structures?

Asked: 2018-03-12 16:21:05

Tags: c# .net file

Imagine a data structure like this:

using System;
using System.Collections.Generic;

public class Cat
{
    public string Name;
    public string FavoriteFood;
    public List<Memory> Memories;
}

public class Memory
{
    public string Name;
    public DateTime Date;
    public List<string> Thoughts;
}

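Sometimes a Cat will have a great many Memories, each with a great many thoughts. That can take up a lot of space, so keeping all of it in memory may not be the best choice. What is the best way to back this data with files and folders?

This would be convenient not only for memory efficiency but also for human usability, should someone want to inspect the data. An ideal file system layout might look like this.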

\---Cats
    +---Charles
    |   |   cat.json
    |   |
    |   \---Memories
    |       |   eating_food.json
    |       |   sleeping.json
    |       |   biting_some_dude.json
    |
    \---Brumpbo
        |   cat.json
        |
        \---Memories
            |   sleeping.json
            |   sleeping_again.json

The cat.json file might look like this:

{
    "name": "Charles",
    "favorite_food": "pant",
    "memories": [
        "eating_food",
        "sleeping",
        "biting_some_dude"
    ]
}

A memory file might look like this (note that thoughts could be very long):

{
    "name": "eating_food",
    "date": "2009-01-20T12:00:00.000Z",
    "thoughts": [
        "God, I love pant.",
        "This is some great pant.",
        // ...
        "I am never going to eat ever again.",
        "This was a mistake."
    ]
}

My first attempt at implementing this used IDisposable for serialization.

public class Cat : IDisposable
{
    public string Name;
    public string FavoriteFood;
    public List<string> Memories;

    // Load a cat if it already exists, or create a new one.
    public Cat(string name)
    {
        if (Storage.DirectoryExists(name))
        {
            var info = Storage.ReadFile<CatInfo>($"{name}/cat.json");
            this.Name = info.Name;
            this.FavoriteFood = info.FavoriteFood;
            this.Memories = info.Memories;
        }
        else
        {
            // New cat: remember the name so Dispose() writes to the right folder.
            this.Name = name;
            this.Memories = new List<string>();
        }
    }

    public Memory GetMemory(string name)
    {
        if (this.Memories.Contains(name))
        {
            return new Memory(this, name);
        }
        return null;
    }

    // Serialize and store the cat.
    public void Dispose()
    {
        var info = new CatInfo
        {
            Name = this.Name,
            FavoriteFood = this.FavoriteFood,
            Memories = this.Memories
        };
        Storage.WriteFile($"{this.Name}/cat.json", info);
    }
}

public class Memory : IDisposable
{
    private readonly Cat cat;

    public string Name;
    public DateTime Date;
    public List<string> Thoughts;

    public Memory(Cat cat, string name)
    {
        // Keep the owning cat; Dispose() needs it to build the file path.
        this.cat = cat;
        if (Storage.FileExists($"{cat.Name}/Memories/{name}.json"))
        {
            var info = Storage.ReadFile<MemoryInfo>($"{cat.Name}/Memories/{name}.json");
            this.Name = info.Name;
            this.Date = info.Date;
            this.Thoughts = info.Thoughts;
        }
        else
        {
            // New memory: remember the name so Dispose() writes to the right file.
            this.Name = name;
            this.Thoughts = new List<string>();
        }
    }

    public void Dispose()
    {
        var info = new MemoryInfo
        {
            Name = this.Name,
            Date = this.Date,
            Thoughts = this.Thoughts
        };
        Storage.WriteFile($"{this.cat.Name}/Memories/{this.Name}.json", info);
    }
}

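Possibly terrible, but it worked well until one problem came up: thread safety. Imagine this: Charles the Cat discovers that he likes eating "bread" more than he likes eating "pant". This requires two changes: one to the Cat.FavoriteFood field and one to Cat.Memories. However, these two changes may be handled by two separate processes in the application, and that can lead to data loss.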

Thread 1: Charles is loaded to update FavoriteFood.
Thread 2: Charles is loaded to update Memories.
Thread 1: Charles's FavoriteFood is updated to "bread."
Thread 2: Charles's Memories is updated to include "eating_bread."
Thread 1: Charles's data is serialized and written. 
Thread 2: Charles's data is serialized and written.

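Because Thread 2 loaded Charles before Thread 1 serialized his new favorite food, and wrote its own copy afterwards, the update to FavoriteFood is completely lost.

One solution might be to move the read/modify/write operation into the property for each field, but that seems very inefficient, especially when considering hypothetical data types with many properties.

To be clear, the goal here is a thread-safe way to store data on disk in a human-accessible manner; that does not necessarily mean using JSON or even text files. What is the best solution?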
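For illustration only, one alternative to per-property read/modify/write is to wrap the whole load, change, and save cycle for a cat in a single lock, so that two threads cannot interleave as in the timeline above. This is a minimal sketch under stated assumptions: `CatStore`, `Update`, `Read`, and `Root` are hypothetical names, `System.Text.Json` stands in for the question's unspecified `Storage` class, and the lock only protects threads inside one process (separate OS processes would need file-level locking instead).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Plain data object; properties (not fields) so System.Text.Json serializes them.
public class CatInfo
{
    public string Name { get; set; }
    public string FavoriteFood { get; set; }
    public List<string> Memories { get; set; } = new List<string>();
}

// Hypothetical helper, not part of the question's code.
public static class CatStore
{
    // One lock object per cat name, so different cats do not block each other.
    private static readonly ConcurrentDictionary<string, object> Locks =
        new ConcurrentDictionary<string, object>();

    // Root folder of the on-disk layout (assumed name).
    public static string Root = "Cats";

    // Load, modify, and save one cat as a single critical section.
    public static void Update(string name, Action<CatInfo> change)
    {
        lock (Locks.GetOrAdd(name, _ => new object()))
        {
            var dir = Path.Combine(Root, name);
            var path = Path.Combine(dir, "cat.json");
            var info = File.Exists(path)
                ? JsonSerializer.Deserialize<CatInfo>(File.ReadAllText(path))
                : new CatInfo { Name = name };
            change(info);
            Directory.CreateDirectory(dir);
            File.WriteAllText(path, JsonSerializer.Serialize(info));
        }
    }

    // Read the current state under the same lock; returns null if the cat is not stored.
    public static CatInfo Read(string name)
    {
        lock (Locks.GetOrAdd(name, _ => new object()))
        {
            var path = Path.Combine(Root, name, "cat.json");
            return File.Exists(path)
                ? JsonSerializer.Deserialize<CatInfo>(File.ReadAllText(path))
                : null;
        }
    }
}
```

With this shape, the two updates from the example become `CatStore.Update("Charles", c => c.FavoriteFood = "bread");` and `CatStore.Update("Charles", c => c.Memories.Add("eating_bread"));`. Each call re-reads the file inside the lock, so neither overwrites the other regardless of ordering, and the cost is one read and one write per logical change rather than per property.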

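1 Answer:

Answer 0: (score: 1)

I think one pattern that fits what you are doing well is the Repository pattern with UnitOfWork, which eases the problem of keeping data synchronized. Complementing it with Entity Framework and backing it with a database would give you a fully scalable solution for your needs, while taking a lot of the I/O work out of the application itself.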
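As a rough sketch of what the Repository and UnitOfWork combination could look like here, the code below uses an in-memory repository in place of Entity Framework or a database. All type and member names (`ICatRepository`, `InMemoryCatRepository`, `UnitOfWork`, `Register`, `Commit`) are hypothetical and chosen for illustration; the answer does not specify an implementation, and a real Entity Framework setup would look different.

```csharp
using System;
using System.Collections.Generic;

public class Cat
{
    public string Name { get; set; }
    public string FavoriteFood { get; set; }
    public List<string> Memories { get; set; } = new List<string>();
}

// The repository hides where cats are stored (files, a database, memory).
public interface ICatRepository
{
    Cat Find(string name);   // returns null when the cat is unknown
    void Save(Cat cat);
}

// Stand-in storage for the sketch; a real one might write cat.json files.
public class InMemoryCatRepository : ICatRepository
{
    private readonly Dictionary<string, Cat> cats = new Dictionary<string, Cat>();

    public Cat Find(string name) => cats.TryGetValue(name, out var cat) ? cat : null;

    public void Save(Cat cat) => cats[cat.Name] = cat;
}

// Collects pending changes and applies them together under one lock.
public class UnitOfWork
{
    // Shared by all units of work so that commits never interleave.
    private static readonly object CommitLock = new object();

    private readonly ICatRepository repository;
    private readonly List<(string Name, Action<Cat> Change)> pending =
        new List<(string Name, Action<Cat> Change)>();

    public UnitOfWork(ICatRepository repository)
    {
        this.repository = repository;
    }

    // Record a change; nothing is read or written until Commit().
    public void Register(string name, Action<Cat> change) => pending.Add((name, change));

    // Load the latest state, apply each recorded change, and save.
    public void Commit()
    {
        lock (CommitLock)
        {
            foreach (var (name, change) in pending)
            {
                var cat = repository.Find(name) ?? new Cat { Name = name };
                change(cat);
                repository.Save(cat);
            }
            pending.Clear();
        }
    }
}
```

The design point is that callers describe changes rather than holding a loaded copy of the cat, so each commit works from the latest stored state instead of overwriting it with a stale snapshot.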