Question

I need to convert this:

01:05:01:11 --> 01:05:04:07,so you may continue to support us,|bring us health,
$Italic = True
01:05:04:15 --> 01:05:07:09,well-being,
$Italic = False
01:05:07:21 --> 01:05:13:01,and help us to be one big family|and continue working as a team.

into basically this:

1
01:05:01:11 --> 01:05:04:07,so you may continue to support us,|bring us health,
$Italic = True
2
01:05:04:15 --> 01:05:07:09,well-being,
$Italic = False
3
01:05:07:21 --> 01:05:13:01,and help us to be one big family|and continue working as a team.

EDIT_1: This means I need to match:

' --> '

and count its occurrences.

EDIT_2: So, for example, I only need to match lines that contain something like:

01:05:04:15 --> 01:05:07:09,

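Before every such line, I need to insert its occurrence number within the file (1 for the first match, 2 for the second, and so on).

I have come up with this short shell script, which uses the 'sed' command, but it takes a very long time on larger files (e.g. more than 60 lines).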

# Define the number of the special chars - so you can calculate the number of the subtitle lines
special_chars_no="$(grep -o ' --> ' Output_File | wc -l)"

# Add numbering before every subtitle line
for ((i=1;i<=${special_chars_no};i++)) ;
do
sed -i '/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\) -->/{:1 ; /\(.*\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\) -->\)\{'"${i}"'\}/!{N;b1} ; s/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\) -->/'"${i}"'\n\1:\2:\3:\4 -->/'"${i}"' ; :2 ; n ; $!b2}' Output_File  
done

Can we make it usable (faster)?

Answer 1

$ awk '/-->/{print ++cnt} 1' file
1
01:05:01:11 --> 01:05:04:07,so you may continue to support us,|bring us health,
$Italic = True
2
01:05:04:15 --> 01:05:07:09,well-being,
$Italic = False
3
01:05:07:21 --> 01:05:13:01,and help us to be one big family|and continue working as a team.

Answer 2

sed is not suitable when arithmetic is needed, and using a shell loop to process text is not advisable.

$ cat ip.txt 
01:05:01:11 --> 01:05:04:07,so you may continue
$Italic = True
01:05:04:15 --> 01:05:07:09,well-being,
$Italic = False
01:05:07:21 --> 01:05:13:01,and help us to be

$ awk '/-->/{$0 = ++i RS $0} 1' ip.txt
1
01:05:01:11 --> 01:05:04:07,so you may continue
$Italic = True
2
01:05:04:15 --> 01:05:07:09,well-being,
$Italic = False
3
01:05:07:21 --> 01:05:13:01,and help us to be

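/-->/ : if the line matches this regexp
- $0 = ++i RS $0 : prefix the input record with the running count, separated by the value of RS, which is a newline by default
- i : the variable defaults to 0 in numeric context, so ++i yields the incremented value each time the regexp matches
1 : idiomatic way to print the input record $0
See also: awk save modifications in place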
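For saving the result back into the file, one portable option is to write to a temporary file and copy it back. This is only a minimal sketch: the function name number_in_place is made up for illustration, and it assumes mktemp is available.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): number every line containing
# "-->" and write the result back to the same file via a temporary file.
number_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # Same awk program as above: prepend the running count and a newline.
    if awk '/-->/{$0 = ++i RS $0} 1' "$file" > "$tmp"; then
        # Copy back instead of renaming so the original permissions are kept.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage: number_in_place ip.txt
```

Copying the content back, rather than moving the temporary file over the original, keeps the file's owner and mode unchanged, at the cost of not being atomic.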

You can also use perl:

# use perl -i -pe for inplace editing
perl -pe 's/^/++$i . "\n"/e if /-->/' ip.txt
# or, borrowing Ed Morton's simplicity
perl -lpe 'print ++$i if /-->/' ip.txt

Answer 3

This might work for you (GNU sed):

sed -r '/-->/{x;:a;s/9(_*)$/_\1/;ta;s/^_*$/0&/;s/$/\n0123456789/;s/([^_])(_*)\n.*\1(.).*/\3\2/;y/_/0/;G;p;s/\n.*//;x;d}' file

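On encountering a line that contains -->, swap to the hold space (HS), which holds the counter, and replace any trailing 9's with _'s. If this is the first match, or all characters are _'s, prepend a 0. Increment the last digit and then replace all _'s with 0's. Append the current line (now in the HS) and print the counter followed by the current line. Remove the current line, leaving the counter ready for the next match, and swap back to the pattern space (PS). Finally delete the PS. Non-matching lines are printed as normal.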
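To watch the counter carry over from 9 to 10, the command can be run on generated input. This is a minimal sketch that assumes GNU sed, as the answer does; the function name number_with_sed and the sample lines are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed command above (requires GNU sed).
number_with_sed() {
    sed -r '/-->/{x;:a;s/9(_*)$/_\1/;ta;s/^_*$/0&/;s/$/\n0123456789/;s/([^_])(_*)\n.*\1(.).*/\3\2/;y/_/0/;G;p;s/\n.*//;x;d}' "$@"
}

# Generate 11 matching lines, then keep only the inserted counter lines.
i=1
while [ "$i" -le 11 ]; do
    echo "start --> end,text $i"
    i=$((i + 1))
done | number_with_sed | grep -v -e '-->'
```

The filtered output should be the counters alone, one per line, including the step from 9 to 10 where the trailing 9 is temporarily turned into _ before the leading digit is incremented.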

Answer 4

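Your question is not that clear, but judging from your expected output, the following awk may help you. (I have an old awk, so I added --re-interval; with a recent awk you can remove it.) I am assuming you want to look for a specific string in a line and print its line number.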

awk --re-interval '/[0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]{2}/{print FNR ORS $0}'  Input_file

If you want the number on the same line, in front of the text, change ORS to OFS in the code above.
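The difference between the two separators can be seen on a single line. This is a small sketch using the simpler pattern /-->/ and one sample line from the question; ORS defaults to a newline and OFS to a single space.

```shell
#!/bin/sh
line='01:05:04:15 --> 01:05:07:09,well-being,'

# ORS (newline by default): the number goes on its own line above the text.
with_ors=$(printf '%s\n' "$line" | awk '/-->/{print FNR ORS $0}')

# OFS (space by default): the number is prefixed on the same line.
with_ofs=$(printf '%s\n' "$line" | awk '/-->/{print FNR OFS $0}')

printf '%s\n' "$with_ors" "$with_ofs"
```

Note that FNR is the line number within the input file, so on a real subtitle file it will not match the occurrence count unless every line matches.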

If you need to save the output back into Input_file itself, the following may also help you.

awk --re-interval '/[0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]{2}/{print FNR ORS $0}'  Input_file > temp_file && mv temp_file  Input_file

EDIT: If you simply want to print the line number before every line, the following may help you.

awk '{print FNR ORS $0 ORS}'  Input_file

Place numbering before every matching string pattern
