Split a text file based on text in the file

Time: 2019-01-31 19:09:53

Tags: powershell

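I have a large text (.txt) file that contains several documents, each of which needs to be stored separately.

Each document begins with a header that can be used to identify where it starts.

At that point I want to start a new file and name it with an incrementing number.

BONUS POINTS: parse the file that was just split off and use some text from it, for example "Doc No. 1", as the file name.

I tried this and several other suggestions, but had no luck. https://forums.windowssecrets.com/showthread.php/174836-Powershell-Split-a-Text-File-Output-With-Delimiter-As-File-Name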

  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         1

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX





  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         2

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX






  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         3

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX






  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         4

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX

1 answer:

Answer 0 (score: 0)

Given the file SplitText.txt in the current folder:

> Get-Content .\SplitText.txt
xxx FirstFile zzz
FirstFile line 1
FirstFile line 2
FirstFile line 3
FirstFile line 4
FirstFile line 5
FirstFile line 6
xxx SecondFile zzz
SecondFile line A
SecondFile line B
SecondFile line C
SecondFile line D

This script splits it at each delimiter line (xxx ... zzz) into numbered parts, with the number appended to the BaseName:

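## Q:\Test\2019\01\31\SO_54467665.ps1
$File = Get-Item ".\SplitText.txt"
$i = 0
# Read the whole file as one string, split it on the delimiter lines,
# and drop the empty string that precedes the first delimiter.
(Get-Content $File.FullName -Raw) -split 'xxx .*? zzz\r?\n' -ne '' | ForEach-Object {
    $i++
    # Build <folder>\<BaseName>_<n><Extension>, e.g. SplitText_1.txt
    $target = Join-Path $File.DirectoryName ("{0}_{1}{2}" -f $File.BaseName, $i, $File.Extension)
    $_ | Set-Content -Path $target
}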

> Get-Content .\SplitText_1.txt
FirstFile line 1
FirstFile line 2
FirstFile line 3
FirstFile line 4
FirstFile line 5
FirstFile line 6

> Get-Content .\SplitText_2.txt
SecondFile line A
SecondFile line B
SecondFile line C
SecondFile line D
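
The same idea can probably be adapted to the layout shown in the question, where the header lines are part of each document and the name should come from the "Document No." line. The following is only a minimal sketch under some assumptions: the function names Split-DocumentText and Save-SplitDocument are made up for illustration, the default header pattern '^\s*HEADER\b' matches only the placeholder text in the sample (the real header text will differ), and every document is assumed to contain one "Document No. <n>" line after its header block. A new document starts at the first header line seen after a document number has already been found.

```powershell
# Sketch: split report lines into documents and label each by its "Document No." value.
# Assumption: each document starts with lines matching $HeaderPattern and contains
# one "Document No. <n>" line somewhere after that header block.
function Split-DocumentText {
    param(
        [string[]] $Lines,
        [string]   $HeaderPattern = '^\s*HEADER\b',            # placeholder from the sample data
        [string]   $NumberPattern = 'Document No\.\s+(\d+)'
    )

    $current = [System.Collections.Generic.List[string]]::new()
    $number  = $null

    foreach ($line in $Lines) {
        # A header line after a document number was seen starts the next document.
        if ($line -match $HeaderPattern -and $null -ne $number) {
            [pscustomobject]@{ Number = $number; Lines = $current.ToArray() }
            $current.Clear()
            $number = $null
        }
        # Remember the first document number found in the current document.
        if ($null -eq $number -and $line -match $NumberPattern) {
            $number = $Matches[1]
        }
        $current.Add($line)
    }

    # Emit whatever remains as the last document.
    if ($current.Count -gt 0) {
        [pscustomobject]@{ Number = $number; Lines = $current.ToArray() }
    }
}

# Hypothetical helper: writes each document next to the source file
# as "Doc No. <n><Extension>" and returns the paths it wrote.
function Save-SplitDocument {
    param(
        [string] $Path,
        [string] $HeaderPattern = '^\s*HEADER\b'
    )

    $source = Get-Item -LiteralPath $Path
    $index  = 0
    Split-DocumentText -Lines (Get-Content -LiteralPath $source.FullName) -HeaderPattern $HeaderPattern |
        ForEach-Object {
            $index++
            # Fall back to the running counter when no document number was found.
            $label  = if ($null -ne $_.Number) { $_.Number } else { $index }
            $target = Join-Path $source.DirectoryName ('Doc No. {0}{1}' -f $label, $source.Extension)
            $_.Lines | Set-Content -LiteralPath $target
            $target
        }
}
```

Calling Save-SplitDocument -Path .\SplitText.txt -HeaderPattern '<your header regex>' would then write one file per document. Note that two documents with the same number would overwrite each other, and that Get-Content without -Raw reads line by line, which may be slow on very large files.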