Subversion diff for zipped XML files

Date: 2009-09-01 07:34:42

Tags: svn diff zip meld

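I am using MySQL Workbench to maintain an application's database schema. The .mwb file that Workbench uses is a zipped XML document, and it is kept in a Subversion repository.

Subversion treats the file as binary data, so I cannot use svn diff to show the changes, for example before a commit.

Since the data really is XML, I thought there might be some way to show the diff anyway, perhaps a script that unzips the file first, or some plugin for svn diff.

The ideal solution would be: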

$ svn diff db-model.mwb

Or even using Meld:

$ meld db-model.mwb

Can you think of a way to achieve this? Perhaps others have had the same problem of showing diffs for archived text files in Subversion.

2 Answers:

Answer 0: (score: 8)

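Subversion lets you use external differencing tools. What you can do is write a wrapper script and tell Subversion to use it as its "diff" command. Your wrapper parses the arguments it gets from Subversion to pick out the "left" and "right" file names, operates on them, and returns an exit code that Subversion interprets as success or failure. In your case, the wrapper can unzip the XML files and pass the unzipped results to "diff" or another tool of your choice.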
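
A minimal sketch of such a wrapper, written here in Python. It assumes that Subversion passes the two file paths as the last two arguments (after GNU diff style options such as -u and -L), that exit code 0 means "no differences" and 1 means "differences found", and that the model XML is stored in the archive under the member name document.mwb.xml. The helper names read_lines and main are made up for illustration.

```python
import difflib
import sys
import zipfile

# Assumption: the model XML lives under this member name inside the .mwb archive.
MWB_MEMBER = "document.mwb.xml"


def read_lines(path):
    """Return the text lines to compare for one file."""
    if zipfile.is_zipfile(path):
        # Zipped model: compare the XML inside the archive.
        with zipfile.ZipFile(path) as archive:
            data = archive.read(MWB_MEMBER)
    else:
        # Anything else: compare the file contents directly.
        with open(path, "rb") as handle:
            data = handle.read()
    return data.decode("utf-8", errors="replace").splitlines(keepends=True)


def main(argv):
    # Assumption: the left and right file paths are the last two arguments.
    left, right = argv[-2], argv[-1]
    diff = list(difflib.unified_diff(read_lines(left), read_lines(right),
                                     fromfile=left, tofile=right))
    sys.stdout.writelines(diff)
    # GNU diff convention: 0 = identical, 1 = differences found.
    return 1 if diff else 0


# Only act when at least two file paths were passed in.
if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(main(sys.argv))
```

Saved as an executable script (the name mwbdiff.py is hypothetical), it could be tried with something like `svn diff --diff-cmd /path/to/mwbdiff.py --force db-model.mwb`. A zip archive without the expected member would raise a KeyError here, which a real wrapper would want to handle.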

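Subversion will not run the diff on files that were detected as "binary" when they were checked in. The --force option lets you override this check, so your wrapper script is run even if the input files were checked in as binary.

Answer 1: (score: 2)

I wrote a diff script for Workbench files that integrates with TortoiseSVN and TortoiseGit. It does exactly what Jim Lewis suggests: it extracts the actual XML from the archive and diffs it.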

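The script also removes all the _ptr_ attribute noise from the diff. Merging is not possible and would be somewhat more complicated (finding out how the _ptr_ attributes behave, repacking the XML into the archive, working out what the other metadata in the archive is, ...).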
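
To illustrate the noise removal: the sketch below applies the same pattern the script uses to two XML fragments that differ only in their _ptr_ value. The fragments and the helper name strip_ptr are made up for illustration and are not taken from a real .mwb file.

```python
import re

# Same pattern as in the script: an 8-digit hexadecimal _ptr_ attribute.
PTR_ATTRIBUTE = re.compile(r'_ptr_="([0-9a-fA-F]{8})"')


def strip_ptr(xml):
    """Remove every _ptr_ attribute so only real changes remain."""
    return PTR_ATTRIBUTE.sub("", xml)


# Hypothetical fragments: the same node as written by two different saves.
old = '<value _ptr_="0A1B2C3D" type="object" id="x"/>'
new = '<value _ptr_="0F9E8D7C" type="object" id="x"/>'

print(old == new)                        # False
print(strip_ptr(old) == strip_ptr(new))  # True
```

If a Workbench build writes pointers in another width or format, the pattern would not match and would need adjusting.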

The Python script is available on pastebin under CC-BY 3.0:

http://pastebin.com/AcD7dBNH

# extensions: mwb
# TortoiseSVN Diff script for MySQL Workbench scheme files
# 2012 by Oliver Iking, Z-Software GmbH, oliverikingREPLACETHISWITHANATz-software.net, http://www.z-software.net/
# This work is licensed under a Creative Commons Attribution 3.0 Unported License - http://creativecommons.org/licenses/by/3.0/

# Will produce two diffable documents, which don't resemble the FULL MWB content, but the scheme relevant data. 
# Merging is not possible

# Open your TortoiseSVN (or TortoiseSomething) settings, go to the "Diff Viewer" tab and click on "Advanced". Add 
# a row with the extension ".mwb" and a command line of 
# "path\to\python.exe" "path\to\diff-mwb.py" %base %mine
# Apply changes and now you can diff mysql workbench scheme files

import sys
import zipfile
import os
import time
import tempfile
import re

# mysql workbench XML will have _ptr_ attributes which are modified on each save for almost each XML node. Remove the visual litter, 
# make actual changes stand out.
def sanitizeMwbXml( xml ):
    return re.sub('_ptr_="([0-9a-fA-F]{8})"', '', xml)

try:
    # argv[0] is the script itself, so two documents mean at least three entries
    if len(sys.argv) < 3:
        print("Not enough parameters, cannot diff documents!")
        sys.exit(1)

    docOld = sys.argv[1]
    docNew = sys.argv[2]

    if not os.path.exists(docOld) or not os.path.exists(docNew):
        print("Documents don't exist, cannot diff!")
        sys.exit(1)

    # Workbench files are actually zip archives
    zipA = zipfile.ZipFile( docOld, 'r' )
    zipB = zipfile.ZipFile( docNew, 'r' )

    # os.tempnam() is insecure and was removed in Python 3. mkdtemp() creates a
    # fresh, private directory, so the target files cannot exist beforehand.
    tempSubpath = tempfile.mkdtemp(prefix="mwbcompare")

    # docOld is %base and docNew is %mine, so name the extracted files accordingly
    docA = os.path.join( tempSubpath, "base.document.mwb.xml" )
    docB = os.path.join( tempSubpath, "mine.document.mwb.xml" )

    # Read, sanitize and write actual scheme XML contents to temporary files

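    # ZipFile.read() returns bytes: decode before applying the regex and encode
    # again when writing (assumes the document is UTF-8 encoded)
    docAText = sanitizeMwbXml(zipA.read("document.mwb.xml").decode("utf-8"))
    docBText = sanitizeMwbXml(zipB.read("document.mwb.xml").decode("utf-8"))

    # Binary mode avoids newline translation; "with" closes the files reliably
    with open(docA, "wb") as docAFile:
        docAFile.write(docAText.encode("utf-8"))
    with open(docB, "wb") as docBFile:
        docBFile.write(docBText.encode("utf-8"))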

    os.system("TortoiseProc /command:diff /path:\"" + docA + "\" /path2:\"" + docB + "\"")

    # TortoiseProc will spawn a subprocess so we can't delete the files. They're in the tempdir, so they
    # will be cleaned up eventually
    #os.unlink(docA)
    #os.unlink(docB)

    sys.exit(0)
except Exception as e:
    print(str(e))
    # Sleep, or the command window will close
    time.sleep(5)
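
The script above is tied to TortoiseProc. The same extract-and-sanitize steps can feed any two-file viewer, such as Meld from the question. The following is a sketch only, for Python 3: the function names extract_sanitized and view_diff are made up, and it assumes the viewer accepts two file paths and blocks until it is closed (if it returns immediately, the temporary files are gone before it can read them).

```python
import os
import re
import subprocess
import sys
import tempfile
import zipfile

PTR_ATTRIBUTE = re.compile(r'_ptr_="[0-9a-fA-F]{8}"')


def extract_sanitized(mwb_path, out_path):
    """Write the model XML of mwb_path, minus _ptr_ attributes, to out_path."""
    with zipfile.ZipFile(mwb_path) as archive:
        # Assumption: the document is UTF-8 encoded.
        xml = archive.read("document.mwb.xml").decode("utf-8")
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        out.write(PTR_ATTRIBUTE.sub("", xml))
    return out_path


def view_diff(old_mwb, new_mwb, tool="meld"):
    """Extract both models and show them in an external viewer."""
    # The directory is deleted when the block ends, i.e. after the viewer exits.
    with tempfile.TemporaryDirectory(prefix="mwbcompare") as workdir:
        old_xml = extract_sanitized(old_mwb, os.path.join(workdir, "old.xml"))
        new_xml = extract_sanitized(new_mwb, os.path.join(workdir, "new.xml"))
        return subprocess.call([tool, old_xml, new_xml])


# Only act when exactly two .mwb paths were given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(view_diff(sys.argv[1], sys.argv[2]))
```

This compares two files on disk, so it does not by itself provide `meld db-model.mwb`; for that, the base revision would first have to be fetched from the repository, for example with svn cat.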