Search-and-replace pattern for this kind of case

Time: 2014-11-16 15:02:20

Tags: regex

I am looking for a solution to the following case:

Original text:

= This is the first line =
= This is the first line again
== This is the second line ==
=== This is the third line ===
==== This is the forth line ====
==== The tailing '='s are optional, but if they're present, should be removed

Expected result:

h1. This is the first line
h1. This is the first line again
h2. This is the second line
h3. This is the third line
h4. This is the forth line
h4. The tailing '='s are optional, but if they're present, should be removed

Any language is fine (Python, Perl, or Bash preferred).

1 answer:

Answer 0 (score: 0):

Using Python:

import re

with open('input.txt', 'r') as f:                               # Open the text file for reading.
    for line in f:                                              # Iterate over every line.
        if line.startswith('='):                                # Only rewrite lines that start with `=`.
            level = len(line) - len(line.lstrip('='))           # The number of leading `=` symbols is the heading level.
            line = re.sub(r'[ \t]*=+[ \t]*$', '', line)         # Remove trailing `=` symbols, and the spaces around them, if present.
            line = re.sub(r'^=+[ \t]*', 'h%d. ' % level, line)  # Replace the leading `=` run with `h`, the level, and a dot.
        print(line, end='')                                     # Print every line; lines without a leading `=` pass through unchanged.

Output:

h1. This is the first line
h1. This is the first line again
h2. This is the second line
h3. This is the third line
h4. This is the forth line
h4. The tailing '='s are optional, but if they're present, should be removed
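
The same conversion can also be sketched as a single substitution over the whole text instead of a line-by-line loop. This is only a minimal sketch, not part of the original answer: the names `HEADING` and `convert_headings` are made up for illustration, and it assumes headings always begin at the start of a line and that lines end with a plain `\n`.

```python
import re

# Hypothetical names, for illustration only.
# Group 1: the leading run of '='. Group 2: the title text, without the
# surrounding spaces or the optional trailing run of '='.
HEADING = re.compile(r'^(=+)[ \t]*(.*?)[ \t]*=*[ \t]*$', re.MULTILINE)


def convert_headings(text):
    """Rewrite '== Title ==' style lines as 'h2. Title'."""
    def repl(match):
        level = len(match.group(1))  # number of leading '=' symbols
        return 'h{}. {}'.format(level, match.group(2))
    return HEADING.sub(repl, text)


sample = """= This is the first line =
= This is the first line again
== This is the second line ==
=== This is the third line ===
==== This is the forth line ====
==== The tailing '='s are optional, but if they're present, should be removed
"""

print(convert_headings(sample), end='')
```

Because the title group is non-greedy and the pattern is anchored at the end of the line, a `=` inside the title (as in `'='s`) is kept, and only a trailing run of `=` is dropped. Lines that do not start with `=` are left untouched.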