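Edit XML with a batch file

Date: 2009-12-22 14:33:24

Tags: xml batch-file

I would like to know whether there is any way to create a batch file that can edit a line in an XML document. The line to edit would be identified by the line before it. The idea is as follows: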

If line == Csetting name="BaseDirectory" serializeAs="String">
    Next line = <value>User Input from beginning of batch</value>

Is something like this even possible, or am I dreaming? Thanks for your help and answers.

5 answers:

Answer 0: (score: 6)

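You can probably hack something together in a batch file that works somehow, but it will be very painful. First of all, I know of no way to reliably read lines into variables in a batch file and write them back to a file unaltered. You can escape most of the problematic characters (such as <, >, &, |, ...), but there are still problems I could not solve¹ (such as unmatched quotes) that make such attempts fail. Even then you would not be parsing XML; you would be doing crude text processing, which fails as soon as single quotes are used instead of double quotes. Or an extra space is thrown in somewhere. Or the line you are looking for is split across several lines. All of that is valid XML, but painful to handle without an XML parser.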
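
To make that fragility concrete, here is a small Python sketch. It is only an illustration: the sample XML strings and the helper names `line_match` and `parser_match` are assumptions, and it presumes that the `Csetting` in the question stands for a `<setting>` element. The three variants below are the same element written with double quotes, with single quotes, and split across two lines.

```python
import xml.etree.ElementTree as ET

# The exact line a line-based script would look for (assumed spelling).
TARGET_LINE = '<setting name="BaseDirectory" serializeAs="String">'

# Three equivalent spellings of the same element (hypothetical samples).
variants = [
    '<setting name="BaseDirectory" serializeAs="String">\n<value>x</value>\n</setting>',
    "<setting name='BaseDirectory' serializeAs='String'>\n<value>x</value>\n</setting>",
    '<setting name="BaseDirectory"\n    serializeAs="String" >\n<value>x</value>\n</setting>',
]


def line_match(xml_text):
    """Naive approach: is any single line exactly equal to the target line?"""
    return any(line.strip() == TARGET_LINE for line in xml_text.splitlines())


def parser_match(xml_text):
    """Parser approach: compare the parsed tag name and attributes."""
    elem = ET.fromstring(xml_text)
    return elem.tag == "setting" and elem.attrib == {
        "name": "BaseDirectory",
        "serializeAs": "String",
    }


line_results = [line_match(v) for v in variants]
parser_results = [parser_match(v) for v in variants]
print(line_results)
print(parser_results)
```

The exact line comparison only recognises the first variant, while the parser treats all three as the same element.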

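The batch file language is not really suited to this kind of task. Heck, it barely works for text processing, and XML is far beyond that. You will probably have more luck (and fun) with VBScript and MSXML, or even PowerShell (if applicable).

VBScript is probably the sanest choice, since you can count on it being available on almost any modern Windows machine.

You could also use XSLT and invoke it from the command line. There are plenty of XSLT processors available, and generating an XSLT file from a batch file is considerably simpler (though it still needs a fair amount of escaping).


¹ Mind you, I may be an advanced batch file user/programmer, but I am by no means authoritative. Maybe this is easily possible and I am just too stupid to see it.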

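Answer 1: (score: 6)

I actually have an answer for this. Yes, it is painful, but I had a similar problem and I do not actually know VBScript (although I am planning to learn it...). My problem came up when a coworker had a customer with 20,000 files that had been converted from an external data source. All of the files were XML, and all of them were missing the same second line of XML, the line that triggers a recompile of the documents we were importing.

I found another standard batch script on StackOverflow that let me split a file into two parts and then insert the code I wanted between them. My only remaining problem (probably due to laziness or my lack of knowledge/patience) was that I could not get around the < and > characters. The script kept thinking I was trying to write to an invalid file. I tried all sorts of ways to escape those characters, but I wanted the text to come from a variable. Needless to say, I got it working (well, even)...

Below is the README I gave my coworker, along with the code for each file.

README.txt

Problem: A large number of files are missing a string or a piece of code and need to be edited.

Solution: This tool splits each file, injects a string or a piece of code, and then writes the reassembled file to another location.

There are 4 files in total for this tool.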

    1 - _README.txt                 - This file describes how to use the script.
    2 - insert.txt                  - This file contains the text that will be inserted into the file you need edited.
    3 - InsertString.bat            - This file contains the actual script that loops to restructure the file. Here you will find all the variables that need to be set to make this work.
    4 - String_Insert_Launcher.bat  - This file is what you will launch to run the InsertString.bat file.

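What you need to do:

  1. Edit String_Insert_Launcher.bat and place it in the directory containing the files you want edited. NOTE: this file must be in the same folder as all of the files you want edited. You will need to edit the variable in this file to match your file system: batchfilepath

  2. Edit InsertString.bat and place it in the directory you set in the batchfilepath variable above. You will need to edit the variables in this file to match your file system: insertpath, destpath, top_last_line, insert_last_line, bot_last_line

  3. Edit insert.txt and place it in the directory you set as insertpath above. Put the string you want inserted into this text document.

  4. Check the log and make sure the number of files listed in "Modified_Filelist.txt" (located in %insertpath%) matches the number of files you started with.

  5. File breakdown: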


    * insert.txt *


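    In this file you put the text you want inserted into the target files. The reason for using a separate file is so that special characters (>, <, /, \, |, ^, %, etc.) are not treated as arguments in the batch file. This file must be in the location you set in InsertString.bat in the variable called "insertpath", referenced in the batch file as %insertpath%.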


    * InsertString.bat *


    In this file you will find the variables that need to be set for the script to work. The variables are:

        1. filelist - This sets the counter for counting how many files were edited. *This should not be edited.*
        2. insertpath - This sets the path of the insert.txt file containing the string you want to insert into the files that will be edited. If this location does not exist, the script will create it.
        3. destpath - This sets the path for the location of the files after they are edited. If this location does not exist, the script will create it.
        4. top_last_line - This sets the LAST GOOD LINE of the file that will be edited before insert.txt is added. In essence this splits the file into 2 parts and adds the contents of insert.txt between those 2 parts.
        5. insert_last_line - This sets the number of lines to add to the file from insert.txt (i.e. if insert_last_line=2 then the top two lines will be added after top_last_line).
        6. bot_last_line - This sets the last line of the original file (i.e. if there are 25 lines in the original file, bot_last_line should be 25). Always overestimate this, because if this number is less than the original line count, not all lines will be rewritten to the new file.

    This file must be in the location you set in String_Insert_Launcher.bat in the variable called "batchfilepath", referenced in the batch file as %batchfilepath%.


    * String_Insert_Launcher.bat *


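    This is the script you execute to edit all of the files. Launch this batch script from the folder containing the files you want edited. It collects all of the file names and runs InsertString.bat against each of them. In this file you will find one variable that needs to be set for the script to work. The variable is: batchfilepath - This is the location of the actual batch file that does all of the work. This location is JUST the file path, not including any file names.

    File #1: String_Insert_Launcher.bat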

    @ECHO off
    TITLE Insert String to XML Script Launch File
    COLOR 02
    
    set batchfilepath=C:\JHA\Synergy\insertpath
    REM This is the location of the actual batch file that does all of the work. This location is JUST the filepath, not including any filenames.
    IF NOT exist  %batchfilepath% md %batchfilepath% 
    IF NOT exist %batchfilepath%\InsertString.bat goto pause
    
    :run
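    for /f "delims=" %%f in ('dir /b /a-d-h-s') do call "%batchfilepath%\InsertString.bat" %%f
    REM This command string gets the names of all of the files in the directory it's in and then runs the InsertString.bat file against every file individually.
    REM CALL returns control to this script after each file; the GOTO below stops it from falling through into :pause when the loop is done.
    goto :eof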
    
    :pause
    cls
    echo.The file InsertString.bat is not in the correct directory.
    echo.Please put this file in the location listed below:
    echo.
    echo.-------------------------
    echo.%batchfilepath%
    echo.-------------------------
    echo.
    echo.When this file has been added press any key to continue running the script.
    pause
    goto run
    
    REM Insert String to XML Script
    REM Created by Trevor Giannetti
    REM An unpublished work
    

    File #2: InsertString.bat

    @ECHO off
    TITLE Insert String to XML Script
    COLOR 02
    SETLOCAL enabledelayedexpansion
    
    REM From Command Line:              for /f "delims=" %f in ('dir /b /a-d-h-s') do InsertString.bat %f
    
    REM ---------------------------
    REM   *** EDIT VARIABLES BELOW ***
    REM ---------------------------
    
    set insertpath=C:\JHA\Synergy\insertpath
    REM This sets the path of insert.txt file containing the string you want to insert into the files that will be edited. If this location does not exist it will create it.
    set destpath=C:\JHA\Synergy\destination
    REM This sets the path for the location of the files after they're edited. If this location does not exist it will create it.
    set top_last_line=1
    REM This sets the LAST GOOD LINE of the file to be edited before the insert.txt is added. In essence this will split the file into 2 parts and add the contents of " insert.txt " into the middle of those 2 parts.
    set insert_last_line=1
    REM This sets the number of lines to add to the file from insert.txt (i.e. if insert_last_line=2 then the top two lines will be added after top_last_line)
    set bot_last_line=25
    REM This sets the last line of the original file (i.e. if there are 25 lines in the original file bot_last_line should be 25 - always over esitimate this, because if this number is less than the original not all lines will be rewritten to the new file)
    
    REM ---------------------------
    REM  *** DO NOT EDIT BELOW ***
    REM ---------------------------
    
    set filelist=0
    REM This sets the counter for counting how many files were edited
    IF '%1'=='' goto usage
    
    IF NOT exist %insertpath% md %insertpath%
    IF NOT exist %destpath% md %destpath%
    
    :top_of_file
    IF EXIST %destpath%\%1 set done=T
    IF EXIST %destpath%\%1 goto exit
    IF '%1'=='InsertString.bat' goto exit
    IF '%1'=='insert.txt' goto exit
    IF '%1'=='Modified_Filelist.txt' goto exit
    IF '%1'=='String_Insert_Launcher.bat'  goto exit
    set /a FirstLineNumber = 1
    REM This is the first line in the file that you want edited
    set /a LastLineNumber = %top_last_line%
    REM This is the last line in the file that you want edited
    
    SET /a counter=1
    
    for /f "usebackq delims=" %%a in (%1) do (
        if !counter! GTR !LastLineNumber! goto next
        if !counter! GEQ !FirstLineNumber! echo %%a >>  %destpath%\%1
        set /a counter+=1
    )
    
    goto next
    
    :next
    REM echo TEXT TO BE INSERTED >> %destpath%\%1
    REM goto bottom_of_file
    REM The above can be substituted for the rest of :next if you don't have special characters in the text you need inserted
    
    set /a FirstLineNumber = 1
    REM This is the first line in the file with the text you need inserted in the file you want edited
    set /a LastLineNumber = %insert_last_line%
    REM This is the last line in the file with the text you need inserted in the file you want edited
    
    SET /a counter=1
    for /f "usebackq delims=" %%a in (%insertpath%\insert.txt) do (
        if !counter! GTR !LastLineNumber! goto bottom_of_file
        if !counter! GEQ !FirstLineNumber! echo %%a >>  %destpath%\%1
        set /a counter+=1
    )
    REM The %insertpath%\insert.txt is the name of the file with the text you want inserted into the file you want edited
    
    goto bottom_of_file
    
    :bottom_of_file
    set /a FirstLineNumber = 1+%top_last_line%
    REM This is the first line in the second part of the file with the text you need inserted in the file you want edited
    set /a LastLineNumber = %bot_last_line%
    REM This is the last line in the second part of the file with the text you need inserted in the file you want edited
    REM The above is the split, after the top_of_file. The rest of the contents of the original file will be added after the text you want inserted is appended to the file
    
    SET /a counter=1
    
    for /f "usebackq delims=" %%a in (%1) do (
        if !counter! GTR !LastLineNumber! goto logging
        if !counter! GEQ !FirstLineNumber! echo %%a >>  %destpath%\%1
        set /a counter+=1
    )
    
    goto logging
    
    :logging
    IF NOT EXIST %insertpath%\Modified_Filelist.txt echo Modified File List: > %insertpath%\Modified_Filelist.txt
    for /f "tokens=1 delims=[]" %%a in ('find /v /c "" ^< %insertpath%\Modified_Filelist.txt') do (
    echo %%a - %1 >> %insertpath%\Modified_Filelist.txt
    )
    
    goto exit
    
    :usage
    cls
    echo Usage: InsertString.bat FILENAME 
    echo You are missing the file name in your string
    
    :exit
    IF '%done%'=='T' echo %1 Already exists in folder!
    IF '%done%'=='T' echo Not modifying %1
    IF '%done%'=='T' echo Moving on to next file...
    IF EXIST %destpath%\InsertString.bat del %destpath%\InsertString.bat
    IF EXIST %destpath%\insert.txt del %destpath%\insert.txt
    
    REM Insert String to XML Script
    REM Created by Trevor Giannetti
    REM An unpublished work
    

    File #3: insert.txt

    <Vocabulary="Conv">
    

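    In your case, you could use 2 insert files... one containing <value> and the other containing </value> (I know this is sloppy, but it would work...). Then in my batch script InsertString.bat you would just run the :next loop 2x (once for each file), and between the two runs you would put: echo.%userInputFromBeginningofBatch% >> File.xml

    Like I said, I know this is messy and you could easily do it in VBScript, but for those of us who do not know VBScript, this is a solution that works.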
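
The split-and-insert idea from the README can be sketched in a few lines of Python. This is a minimal illustration, not a port of the batch script: the function name `insert_after_line` and the sample document are hypothetical, and only the parameter name `top_last_line` and the inserted text are taken from the README and insert.txt above.

```python
def insert_after_line(lines, insert_lines, top_last_line):
    """Keep the first `top_last_line` lines, add the insert, then the rest."""
    top = lines[:top_last_line]      # the "top" part of the file
    bottom = lines[top_last_line:]   # everything after the split point
    return top + insert_lines + bottom


# Hypothetical 4-line document; the inserted text is the insert.txt content.
original = [
    '<?xml version="1.0"?>',
    "<Document>",
    "",
    "</Document>",
]
result = insert_after_line(original, ['<Vocabulary="Conv">'], top_last_line=1)
print("\n".join(result))
```

Because the lines are sliced rather than counted in a loop, there is no need for an overestimated `bot_last_line`, and blank lines in the original are preserved.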

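Answer 2: (score: 2)

Excuse me. I apologize in advance for this post. I know this is a very old topic, but after reading the answers here I could not resist the temptation to post this one.

Processing an XML file with a batch program is not only simple and direct but, in my opinion, easier than any equivalent solution in VBScript, PowerShell, etc. Here it is: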

@echo off
setlocal EnableDelayedExpansion
set "greater=>"
set targetLine=Csetting name="BaseDirectory" serializeAs="String"!greater!
echo Enter the new line to insert below target lines:
set /P nextLine=
setlocal DisableDelayedExpansion

(for /F "delims=" %%a in (document.xml) do (
   set "line=%%a"
   setlocal EnableDelayedExpansion
   echo !line!
   if "!line!" equ "!targetLine!" echo !nextLine!
   endlocal
)) > newDocument.xml

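The only problem with the program above is that it removes empty lines from the XML file, but that detail can be fixed in a very simple way by adding a few more commands. The program can also be modified so that it does not check for the complete line (as the OP originally requested) but instead checks for three parts of it, the same way the VBScript example in this thread does: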

(for /F "delims=" %%a in (document.xml) do (
   set "line=%%a"
   setlocal EnableDelayedExpansion
   echo !line!
   set lineMatch=1
   if "!line:Csetting name=!" equ "!line!" set lineMatch=
   if "!line:BaseDirectory=!" equ "!line!" set lineMatch=
   if "!line:serializeAs=!" equ "!line!" set lineMatch=
   if defined lineMatch echo !nextLine!
   endlocal
)) > newDocument.xml
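
For comparison, the same three-substring test can be sketched in Python. This is an illustration under assumptions: the function name `insert_after_matches` and the sample lines are hypothetical, and the three substrings are the ones checked in the batch program above. It is still plain line-based text processing, not XML parsing.

```python
def insert_after_matches(lines, next_line,
                         parts=("Csetting name", "BaseDirectory", "serializeAs")):
    """Copy every line; after each line containing all `parts`, add `next_line`."""
    out = []
    for line in lines:
        out.append(line)  # blank lines are copied too
        if all(part in line for part in parts):
            out.append(next_line)
    return out


# Hypothetical input, including a blank line that must survive.
sample = [
    "<root>",
    "",
    'Csetting name="BaseDirectory" serializeAs="String">',
    "<value>old</value>",
    "</root>",
]
result = insert_after_matches(sample, "<value>new</value>")
print("\n".join(result))
```

Like the batch version, this adds a new line after the target line and leaves the existing following line in place.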

Answer 3: (score: 1)

Of course you can use batch, but I recommend that you learn and use VBScript:

Set objFS=CreateObject("Scripting.FileSystemObject")
strFile = WScript.Arguments.Item(0)
strUserValue= WScript.Arguments.Item(1)
Set objFile = objFS.OpenTextFile(strFile)
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    If  InStr(strLine,"Csetting name") >0 And _
        InStr(strLine,"BaseDirectory")> 0 And _
        InStr(strLine,"serializeAs=") > 0 Then      
        strLine=strLine & vbCrLf & "<value>" & strUserValue & "</value>"        
    End If 
    WScript.Echo strLine
Loop

Save the script as edit.vbs and call it from your batch file:

c:\test> cscript //nologo edit.vbs file "user value"
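
If you hate the idea of using other tools (such as gawk / sed / Python / Perl, or other XML parsers/writers), then VBScript is the best option you have besides crippled batch. Otherwise, you should consider using those better tools.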
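
As one example of the parser-based route, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The document layout is an assumption (a `<setting name="BaseDirectory">` element containing a `<value>` child, which is how the question's snippet reads if `Csetting` stands for `<setting`), and the function name `set_base_directory` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def set_base_directory(xml_text, new_value):
    """Set <value> inside every <setting name="BaseDirectory"> element."""
    root = ET.fromstring(xml_text)
    for setting in root.iter("setting"):
        if setting.get("name") == "BaseDirectory":
            value = setting.find("value")
            if value is None:
                # No <value> child yet: create one.
                value = ET.SubElement(setting, "value")
            value.text = new_value
    return ET.tostring(root, encoding="unicode")


# Assumed sample document; real settings files may be nested differently.
sample = """<settings>
  <setting name="BaseDirectory" serializeAs="String">
    <value>C:\\old</value>
  </setting>
  <setting name="Other" serializeAs="String">
    <value>keep</value>
  </setting>
</settings>"""

updated = set_base_directory(sample, r"D:\new & improved")
print(updated)
```

The serializer escapes special characters such as `&` on output, which is the part that is so awkward to get right in a batch file. To use it on a real file, `ET.parse(path)` and `tree.write(path)` would replace the string round trip.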

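Answer 4: (score: 1)

XML is not line-based, so assuming you can find something in the file by checking it line by line is either prone to problems or relies on assumptions beyond XML itself. (If you are getting the file from some piece of software, how do you know it will always produce its output lines in this particular way?)

That said, I would take a look at JSDB Javascript, which has E4X built in. E4X makes manipulating XML particularly simple, as long as you can read the whole document into memory; it is not a stream-based system. You could also use JSDB without E4X and handle file I/O with streams: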

var Sin = new Stream('file://c:/tmp/testin.xml');
var Sout = new Stream('file://c:/tmp/testout.xml','w');
while (!Sin.eof)
{
   var Lin = Sin.readLine();
   var Lout = some_magic_function(Lin); // do your processing here
   Sout.writeLine(Lout);
}
Sin.close(); Sout.close();