Question

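Novice programmer here with a Python problem.

What I have:

A folder containing other folders (modules) and files (possibly .txt, .c, .h, .py, etc.)
An XML file that basically holds the configuration of that folder (module name, short name, but also an exclude list; files in the exclude list must not be considered)

What I intend to do:

Read the information from the XML file and save it in a form that helps me parse correctly
Parse all files in the given folder except the excluded ones

So far my code looks like this: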

<?xml version="1.0"?>
<Modules>
    <Module>
        <Name>MOD_Test1</Name>
        <Shortname>1</Shortname>
        <ExcludeList>
            <File>HeaderFile.h</File>
            <File>CFile.c</File>
        </ExcludeList>
    </Module>
    <Module>
        <Name>MOD_Test2</Name>
        <Shortname>2</Shortname>
        <ExcludeList>
            <File>TextFile.txt</File>
        </ExcludeList>
    </Module>
</Modules>

That is the XML file, obviously.

def GetExceptFiles(ListOfExceptFiles = []):
    tree = ET.ElementTree(file='Config.xml')
    Modules = tree.getroot()
    for Module in Modules:
        for Attribute in Module:
            if Attribute.tag=='Name':
                ModuleName = Attribute.text
            if Attribute.tag=='Shortname':
                ModuleShortName = Attribute.text
            for File in Attribute:
                ExceptFileName = File.text
                print ('In module {} we must exclude {}'.format(ModuleName, ExceptFileName))
        if ExceptFileName is not None:        
            ListOfExceptFiles.append(ExceptFileName)

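This reads the XML file and gives me the list of files that must be excluded. It works, but badly. Say two modules each have a file with exactly the same name, one excluded and one not. Both will be skipped.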

import os

def Parse(walk_dir):
    print('walk_dir = ' + walk_dir)
    for root, subdirs, files in os.walk(walk_dir):
        print('-------------------------------------------------------\nroot = ' + root)
        for filename in files:
            with open(os.path.join(root, filename), 'r') as src:
                Text = src.read()
                # Build the whole message before printing it.
                print('\nFile %s contains: \n' % filename + Text)

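As for the parsing, this is what I started with. I know it does not parse anything yet, but once I can read the contents of a file I can certainly do the rest.

As for removing the excepted files, all I did was add an if to the second for loop: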

for filename in files:
    if filename not in ListOfExceptFiles:
        with open(os.path.join(root, filename), 'r') as src:

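These are the two things that do not work right:

Files with the same name corrupt the output.
Having more than one excepted file in the XML (for one module) results in only the last one being skipped. (In my example, HeaderFile.h will not be skipped, but CFile.c will.)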

def ReadConfig(Tuples = []):
    tree = ET.ElementTree(file='Config.xml')
    Modules = tree.getroot()
    for Module in Modules:
        for Attribute in Module:
            if Attribute.tag=='Name':
                ModuleName = Attribute.text
            for File in Attribute:
                ExceptFileName = File.text
                Tuple = (ModuleName, ExceptFileName)
                Tuples.append(Tuple)

Is this a good way to approach it?

Answer 1

This works pretty well; only a few small adjustments are needed to fix the problems:

1) In your GetExceptFiles(ListOfExceptFiles = []), you add the file to the list after the for Attribute in Module loop has finished. As a result, only the last file of each module is added. If you move the check and the append inside the for File in Attribute loop, every excluded file is added to the list. (A few tabs/spaces are enough.)

Also, you assume that the tag of an attribute can only be Name, Shortname, or ExcludeList. While that will normally be the case, a malformed file would break your parsing. Consider checking the tag of every attribute and raising an error when an unexpected one appears.
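The fix from point 1, combined with the tag check, might look like the sketch below. It is an illustration under assumptions: the function name get_except_files, the ALLOWED_TAGS set, and the choice of ValueError are not from the question, and the XML is parsed from a string instead of Config.xml so that the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Sample config copied from the question, inlined so the sketch is self-contained.
CONFIG_XML = """<?xml version="1.0"?>
<Modules>
    <Module>
        <Name>MOD_Test1</Name>
        <Shortname>1</Shortname>
        <ExcludeList>
            <File>HeaderFile.h</File>
            <File>CFile.c</File>
        </ExcludeList>
    </Module>
    <Module>
        <Name>MOD_Test2</Name>
        <Shortname>2</Shortname>
        <ExcludeList>
            <File>TextFile.txt</File>
        </ExcludeList>
    </Module>
</Modules>"""

# Assumption: these three tags are the only valid children of a Module.
ALLOWED_TAGS = {"Name", "Shortname", "ExcludeList"}


def get_except_files(xml_text):
    """Return every excluded file name, rejecting unexpected child tags."""
    excluded = []
    for module in ET.fromstring(xml_text):
        for child in module:
            if child.tag not in ALLOWED_TAGS:
                raise ValueError("Unexpected tag: {}".format(child.tag))
            if child.tag == "ExcludeList":
                for file_elem in child:
                    # Append inside the inner loop so no entry is lost.
                    if file_elem.text is not None:
                        excluded.append(file_elem.text)
    return excluded


print(get_except_files(CONFIG_XML))
# prints ['HeaderFile.h', 'CFile.c', 'TextFile.txt']
```

Returning a fresh list, rather than appending to a default argument, also avoids sharing one list between calls.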

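2) I assume that files with the same name are actually the same file shared between modules, excluded in some modules but not in all. If that is the case, a flat list of excluded files loses the information about which module each exclusion belongs to. Consider using a map with the module name as key, so that each module has its own list of excluded files.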
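With a per-module mapping, the directory walk can skip a file only inside the module that excludes it. The sketch below is a rough illustration, assuming each module is a folder directly under the walked directory and is named after the module; files_to_parse and the temporary demo folders are hypothetical, not part of the question.

```python
import os
import tempfile


def files_to_parse(walk_dir, excludes_by_module):
    """Yield paths under walk_dir, skipping each module's own excluded files."""
    for root, _subdirs, files in os.walk(walk_dir):
        rel = os.path.relpath(root, walk_dir)
        # Assumption: the first path component below walk_dir is the module name.
        module = rel.split(os.sep)[0]
        excluded = excludes_by_module.get(module, [])
        for filename in files:
            if filename not in excluded:
                yield os.path.join(root, filename)


# Demo: both modules contain CFile.c, but only MOD_Test1 excludes it.
with tempfile.TemporaryDirectory() as base:
    for module_name in ("MOD_Test1", "MOD_Test2"):
        os.makedirs(os.path.join(base, module_name))
        with open(os.path.join(base, module_name, "CFile.c"), "w") as handle:
            handle.write("/* demo */\n")
    excludes = {"MOD_Test1": ["CFile.c"]}
    kept = sorted(os.path.relpath(path, base)
                  for path in files_to_parse(base, excludes))
    print(kept)
```

Only the copy inside MOD_Test2 is kept, so a shared file name no longer causes both files to be skipped.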

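EDIT: One way to do this with a dict (I am mainly Java-oriented; this structure is called a map in Java, but in Python it is a dict) might be:

import xml.etree.ElementTree as ET

def GetExceptFiles(DictOfExceptFiles=None):
    # Avoid a mutable default argument, which would be shared between calls.
    if DictOfExceptFiles is None:
        DictOfExceptFiles = {}
    tree = ET.ElementTree(file='Config.xml')
    Modules = tree.getroot()
    for Module in Modules:
        for Attribute in Module:
            if Attribute.tag == 'Name':
                ModuleName = Attribute.text
            if Attribute.tag == 'Shortname':
                ModuleShortName = Attribute.text
            for File in Attribute:
                ExceptFileName = File.text
                # Create the module's list the first time the module is seen.
                if ModuleName not in DictOfExceptFiles:
                    DictOfExceptFiles[ModuleName] = []
                DictOfExceptFiles[ModuleName].append(ExceptFileName)
                print('In module {} we must exclude {}'.format(ModuleName, ExceptFileName))
    return DictOfExceptFiles

Note that this assumes ModuleName has already been set before the first file is reached, which depends on the order of the child elements, something XML by itself does not enforce. To work around this, I would move the name and short name from child tags to XML attributes of the Module element.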
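Moving the name and short name into XML attributes of Module could look roughly like the sketch below. The restructured XML is a hypothetical layout, and read_excludes is an illustrative name; neither appears in the question. Because attributes belong to the Module element itself, the name is known before any child element is visited.

```python
import xml.etree.ElementTree as ET

# Hypothetical restructured config: Name and Shortname as attributes of Module.
CONFIG_XML = """<Modules>
    <Module Name="MOD_Test1" Shortname="1">
        <ExcludeList>
            <File>HeaderFile.h</File>
            <File>CFile.c</File>
        </ExcludeList>
    </Module>
    <Module Name="MOD_Test2" Shortname="2">
        <ExcludeList>
            <File>TextFile.txt</File>
        </ExcludeList>
    </Module>
</Modules>"""


def read_excludes(xml_text):
    """Map each module name to its own list of excluded file names."""
    excludes = {}
    for module in ET.fromstring(xml_text).findall("Module"):
        name = module.get("Name")  # does not depend on child element order
        files = [elem.text for elem in module.findall("ExcludeList/File")]
        excludes[name] = files
    return excludes


print(read_excludes(CONFIG_XML))
# prints {'MOD_Test1': ['HeaderFile.h', 'CFile.c'], 'MOD_Test2': ['TextFile.txt']}
```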

Parsing all files in a folder with Python, except those listed in an XML file
