Batch edit multiple text files with a number sequence

Time: 2014-11-09 15:35:50

Tags: xml batch-file

I have 100 .xml files (for TV shows) named sequentially:

s07e01.xml
s07e02.xml
s07e03.xml
s07e04.xml

The seasons and the number of episodes per season vary.

Each file contains these two lines:

<ID></ID>
<EpisodeNumber></EpisodeNumber>

Is it possible to batch-edit these files so that the episode numbers are added to these two elements?

Thanks.

4 answers:

Answer 0 (score: 1)

Here is a bash script:

#! /bin/bash

for f in *.xml ; do
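    # Extract the episode number from the name: "s07e01.xml" -> "01"
    n=${f##*/s}; n=${n#*e}; n=${n%.xml}
    echo "File $f --> episode $n" >&2
    mv -f "$f" "$f.bak"
    # "|| [[ -n $line ]]" keeps a last line that has no trailing newline
    while IFS= read -r line || [[ -n "$line" ]]; do
        line=${line%$'\r'}   # drop an existing CR so CRLF is not doubled
        # ${line%%[^ ]*} is the leading indentation (spaces) of the line;
        # printf avoids the backslash interpretation done by "echo -e"
        if [[ "$line" == *"<ID>"*"</ID>"* ]]; then
            printf '%s<ID>%s</ID>\r\n' "${line%%[^ ]*}" "$n"
        elif [[ "$line" == *"<EpisodeNumber>"*"</EpisodeNumber>"* ]]; then
            printf '%s<EpisodeNumber>%s</EpisodeNumber>\r\n' "${line%%[^ ]*}" "$n"
        else
            printf '%s\r\n' "$line"
        fi
    done < "$f.bak" >| "$f"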
done
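
The script above writes the episode number into both elements. If the season is needed as well (the other answers put it into <ID>), the same parameter-expansion approach can extract both parts. A minimal sketch, assuming names of the form sNNeNN.xml; parse_episode_name is a hypothetical helper and not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above): split a name such as
# "s07e03.xml" into its season and episode parts with parameter expansion.
parse_episode_name() {
    local name=${1##*/}       # drop any leading directory
    name=${name%.xml}         # drop the extension    -> s07e03
    local season=${name#s}    # drop the leading "s"  -> 07e03
    season=${season%%e*}      # cut at the first "e"  -> 07
    local episode=${name#*e}  # keep text after "e"   -> 03
    printf '%s %s\n' "$season" "$episode"
}

parse_episode_name "s07e03.xml"   # prints: 07 03
```

Inside the loop, something like read -r season episode < <(parse_episode_name "$f") would make both values available.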

Answer 1 (score: 1)

@echo off
setlocal EnableDelayedExpansion

rem Process all .xml files
for %%f in (*.xml) do (
   rem Get season and episode in %%a and %%b
   for /F "tokens=1,2 delims=se." %%a in ("%%f") do (
      rem Get the numbers of both target lines
      set "repLines=/"
      for /F "delims=:" %%c in ('findstr /N "<ID> <EpisodeNumber>" "%%f"') do (
         set "repLines=!repLines!%%c/"
      )
      rem Initialize the (first) replacement string
      set "replace=<ID>%%a</ID>"
      rem Process the file, replace values, create new file
      (for /F "tokens=1* delims=:" %%c in ('findstr /N "^" "%%f"') do (
         rem If this is a target line
         if "!repLines:/%%c/=!" neq "!repLines!" (
            rem Do the replacement
            echo !replace!
            rem And change to next (second) replacement string
            set "replace=<EpisodeNumber>%%b</EpisodeNumber>"
         ) else (
            rem Output the line unchanged
            setlocal DisableDelayedExpansion
            set "line=%%d"
            setlocal EnableDelayedExpansion
            echo(!line!
            endlocal & endlocal
         )
      )) > "%%~Nf.tmp"
   )
)

rem Update files
del *.xml
ren *.tmp *.xml

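The solution above assumes that each file contains exactly two target lines, <ID></ID> and <EpisodeNumber></EpisodeNumber>, in that order. If that is not the case, a small modification is required.

Answer 2 (score: 0)

A simple batch script: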

@echo off

REM rename all files with matching patterns to tmp-files:
ren s??e??.xml *.tmp

REM for all tmp-files do:
for /f "tokens=*" %%f in ('dir /b *.tmp') do (
  REM get season and episode:
  for /f "tokens=1,2 delims=SsEe." %%i in ("%%~nf") do (
    REM write a new xml file that contains only these two elements:
    >"%%~dpnf.xml" echo ^<ID^>%%i^</ID^>
    >>"%%~dpnf.xml" echo ^<EpisodeNumber^>%%j^</EpisodeNumber^>
  )
)
REM delete tmp files:
del *.tmp

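Answer 3 (score: 0)

There is a very efficient and elegant solution using REPL.BAT, a hybrid JScript/batch utility that performs a regex search and replace on stdin and writes the result to stdout. REPL.BAT is pure script that runs natively on any Windows machine from XP onward. Full documentation is embedded within the script.

I use REPL.BAT twice. The first call transforms the output of DIR /B, filtering out lines that do not match the name template and extracting the season and episode values. The result is processed by FOR /F. Then, for each file, a second REPL.BAT call modifies the actual file content and writes it to a temporary file. Finally, the temporary file is MOVEd to the original file name. The second REPL call performs both replacements in one pass: the replacement value is a JScript expression that decides which value to insert based on the matched tag name.

This script processes all matching files in the current folder: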

@echo off
for /f "delims=: tokens=1,2*" %%A in (
  'dir /b /a-d s??e*.xml^|repl "^s(\d\d)e(\d\d)" "$1:$2:$&" ia'
) do (
  type "%%C"|repl "(<(ID|EpisodeNumber)>).*?(</\2>)" "$1+($2=='ID'?'%%A':'%%B')+$3" j >"%%C.new"
  move /y "%%C.new" "%%C" >nul
)

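This second version processes an entire folder hierarchy. It only requires slight modifications to the DIR command and the initial REPL search string: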

for /f "delims=: tokens=1,2*" %%A in (
  'dir /b /s /a-d s??e*.xml^|repl "^.*\\s(\d\d)e(\d\d)" "$1:$2:$&" ia'
) do (
  type "%%C"|repl "(<(ID|EpisodeNumber)>).*?(</\2>)" "$1+($2=='ID'?'%%A':'%%B')+$3" j >"%%C.new"
  move /y "%%C.new" "%%C" >nul
)
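
For comparison, the same one-pass idea (both substitutions applied to each file, chosen by tag name) might be sketched in bash with sed. This is only an illustration under stated assumptions: fill_episode_tags is a hypothetical helper, sed is assumed to support -E (GNU and BSD sed do), and the inserted values are assumed to be plain digits with no |, & or backslash characters.

```shell
#!/bin/bash
# Hypothetical helper: rewrite <ID> and <EpisodeNumber> in a single sed pass.
# Usage: fill_episode_tags FILE SEASON EPISODE
fill_episode_tags() {
    local file=$1 season=$2 episode=$3
    # Each expression keeps the opening and closing tag (\1, \2) and
    # replaces whatever text sits between them.
    sed -E \
        -e "s|(<ID>)[^<]*(</ID>)|\1${season}\2|" \
        -e "s|(<EpisodeNumber>)[^<]*(</EpisodeNumber>)|\1${episode}\2|" \
        "$file" > "$file.new" && mv -f "$file.new" "$file"
}

# Small demonstration on a throwaway file.
demo=$(mktemp)
printf '<Item>\n  <ID></ID>\n  <EpisodeNumber></EpisodeNumber>\n</Item>\n' > "$demo"
fill_episode_tags "$demo" 07 03
cat "$demo"
rm -f "$demo"
```

As in the REPL.BAT version, the result is written to a temporary .new file first and only then moved over the original.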