Grabbing words between boundaries

Time: 2014-02-17 03:48:03

Tags: regex tcl words

Question: How do I write a regular expression that grabs the words between two boundaries? The code below doesn't work:

regexp -- {/b/{(.+)/}}/b} $outputline8 - filtered
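
A note on the line above, offered as a sketch rather than a definitive diagnosis: in Tcl regular expressions `/b` is just a slash followed by the letter b (word boundaries are written `\y`, `\m`, or `\M`), and the braces in `{/b/{(.+)/}}/b}` close before the final `/b}`, so Tcl is likely to reject the command before the pattern is even tried. Assuming the continuation lines have already been joined into one string (the variable names `joined` and `pins` are illustrative, not from the post), the text between each pair of braces can be collected like this:

```tcl
# Illustrative input: one already-joined set_false_path command.
set joined {set_false_path -from [get_ports {AAA/BBB[1] CCC/DDD}] -through [get_pins {XXX_1[1]}]}

set pins {}
# \{ and \} match literal braces; [^{}]+ captures everything between them.
# With -all -inline the result alternates: whole match, captured group, ...
foreach {whole inner} [regexp -all -inline -- {\{([^{}]+)\}} $joined] {
    # \S+ picks out each whitespace-separated pin name
    foreach pin [regexp -all -inline -- {\S+} $inner] {
        lappend pins $pin
    }
}

# Print one pin per line
puts [join $pins "\n"]
```

Because `[^{}]+` cannot cross a brace, each `{...}` group is matched separately. A greedy `.+` would instead run from the first `{` to the last `}`.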

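Goals

  1. Grab all pin names of the form xxx/xxx[x] that come after set_false_path and sit between { }.
  2. set_false_path may carry another option, such as "-through". I still want to grab the pins that follow those options and write them to the output file as described below.
  3. Here is my input file: input_file.txt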

    set_false_path -from [get_ports {AAAcc/BBB/CCC[1] \
    BBB_1/CCC[1] CCC/DDD[1] \
    DDD/EEE EEE/FFF[1] \
    FFF/GGG[1]}] -through\
    [get_pins {GGG/HHH[1] HHH/III[1] \
    XXX/YYY[1] YYY/XXX[1] \
    AAA/ZZZ[1]}]
    set_timing_derate -cell_sdada [get_cells \
    {NONO[1]}
    set_false_path -from [get_ports {AAA/DDD[2]}]
    

    This is the output file (in the format I expect): output_file.txt

    AAAcc/BBB/CCC[1]
    BBB_1/CCC[1]
    CCC/DDD[1]
    DDD/EEE
    EEE/FFF[1]
    FFF/GGG[1]
    GGG/HHH[1]
    HHH/III[1]
    XXX/YYY[1]
    YYY/XXX[1]
    AAA/ZZZ[1]
    AAA/DDD[2]
    

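    In general, these pin names follow no common pattern, so the only way is to grab everything between { }.

    From the input file above, we can see that the set_ commands (from input_file.txt) are not written on a single line. So the code I wrote grabs only the content of set_false_path and joins those lines. Here is my code: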

    set inputfile [open "input_file.txt" r]
    set outputfile [open "output_file.txt" w]
    
    set first_word ""
    set outputline1 ""
    set filtered ""
    
    while { [gets $inputfile line] != 1} {
     set first_word [lindex [split $line ""] 0]
     set re2 {^set_+?}
     #match any "set_ " command
     if { [regexp $re2 $first_word matched] } {
      #if the "set_ " command is found and the outputline1 is not empty, then it's 
      # the end of the last set_ command
      if {$outputline1 != ""} {
       #do the splitting here and put into the outputfile later on
       regexp -- {/b/{(.+)/}}/b} $outputline8 - filtered
       puts "$filtered:$filtered"
       set outputline1 ""
      }
    
      # grab content if part of set_false_path
      if{ [regexp "set_false_path" $first_word] } {
       # if it's the expected command set, put "command_set" flag on which will be used on 
       # the next elseif
       set command_set 1
       lappend outputline1 $line
       regsub -all {\\\[} $outputline1 "\[" outputline2
       regsub -all {\\\]} $outputline2 "\]" outputline3
       regsub -all {\\\{} $outputline3 "\{" outputline4
       regsub -all {\\\}} $outputline4 "\}" outputline5
       regsub -all {\\\\} $outputline5 "\\" outputline6
       regsub -all {\\ +} $outputline6 " " outputline7
       regsub -all {\s+} $outputline7 " " outputline8
      } else {
       set command_set 0
       # if the line isn't started with set_false_path but it's part of set_false_path command
      } elseif {$command_set} {
       lappend outputline1 $line
       regsub -all {\\\[} $outputline1 "\[" outputline2
       regsub -all {\\\]} $outputline2 "\]" outputline3
       regsub -all {\\\{} $outputline3 "\{" outputline4
       regsub -all {\\\}} $outputline4 "\}" outputline5
       regsub -all {\\\\} $outputline5 "\\" outputline6
       regsub -all {\\ +} $outputline6 " " outputline7
       regsub -all {\s+} $outputline7 " " outputline8
      } else {
      }
     }
    }
    
    puts "outputline:outputline8"
    #do the splitting here and put into the file later on for the last grabbed line!
    
    close $inputfile
    close $outputfile
    

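    Code discussion in depth:

    • I noticed that after the lines are accumulated in outputline1, I get unexpected output with extra spaces and backslashes: set_false_path\ -from\ \[get_ports\ \{AAA/BBB\[1\] \ .. and so on.

      This output has a backslash (\) in front of every special character (for example {, space, and so on). So I added many regsub calls to remove all of these unwanted additions. The final joined result is in $outputline8.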

      Result of $outputline8:
      set_false_path -from [get_ports {AAAcc/BBB/CCC[1] BBB_1/CCC[1] CCC/DDD[1] DDD/EEE EEE/FFF[1] FFF/GGG[1]}] -through [get_pins {GGG/HHH[1] HHH/III[1] XXX/YYY[1] YYY/XXX[1] AAA/ZZZ[1]}]
      set_false_path -from [get_ports {AAA/DDD[2]}]
      
    • I plan to grab the pins inside { } in outputline8 and split them.

    Reference: process multiple lines text file to print in single line
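
One plausible cause of the backslashes described above is `lappend`: it treats `outputline1` as a Tcl list, and when a list is later used as a string, elements with unbalanced braces or a trailing backslash are quoted with backslashes. The sketch below contrasts that with plain string `append`; the sample lines and the variable names `asList` and `asString` are illustrative only.

```tcl
# Two sample input lines; the first ends with a continuation backslash.
set line1 "set_false_path -from \[get_ports \{AAA/BBB\[1\] \\"
set line2 "CCC/DDD\}\]"

# List-style accumulation: the string form of the list is quoted,
# which matches the backslash-laden output described in the post.
set asList {}
lappend asList $line1
lappend asList $line2
puts "list form:   $asList"

# String-style accumulation: strip the trailing backslash and blanks,
# then join the pieces with a single space. No quoting is introduced.
set asString ""
foreach line [list $line1 $line2] {
    append asString [string trimright $line "\\ \t"] " "
}
set asString [string trim $asString]
puts "string form: $asString"
```

With the string form, the chain of `regsub` calls that undoes the quoting should no longer be needed.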

    • This is the latest update

      If the input file is:

      set_false_path -from [get_ports {AAAcc/BBB/CCC[1] BBB_1/CCC[1] DDD/EEE}] -through [get_pins {XXX_1[1]}]
      

      I want the output file to be:

      AAAcc/BBB/CCC[1]
      BBB_1/CCC[1]
      DDD/EEE
      XXX_1[1]
      

    Thanks!

    Note: I'm new to Tcl and to this forum, and any suggestions are much appreciated!

1 answer:

Answer 0 (score: 0)

Try the following script. I added explanations in the code comments:

set inputfile [open "input_file.txt" r]
set outputfile [open "output_file.txt" w]

# This is a temp variable to store the partial lines
set buffer ""

while { [gets $inputfile line] != -1} {
  # Append the current line, minus any trailing backslash, to the buffer
  set buffer "$buffer[regsub -- {\\[[:blank:]]*$} $line ""]"

  # If there is no ending \ then stop adding and process the elements to extract
  if {![regexp -- {\\[[:blank:]]*$} $line]} {
    # Skip line if not "set_false_path"
    if {[lindex [split $buffer " "] 0] ne "set_false_path"} {
      set buffer ""
      continue
    }

    # Grab each element with regexp into a list and print each to outputfile
    # m contains whole match, groups contains sub-matches
    foreach {m groups} [regexp -all -inline -- {\{([^\}]+)\}} $buffer] {
      foreach out [split $groups] {
        puts $outputfile $out
      }
    }

    # Clear the temp variable
    set buffer ""
  }
}

close $inputfile
close $outputfile
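
The same approach can be wrapped in a procedure, which makes it easier to try on a string without touching any files. This is a sketch under a few assumptions: the name `extract_false_path_pins` is hypothetical, Tcl 8.5 or later is assumed (for `{*}` and for `regsub` returning its result), and a space is added between joined lines so that names on either side of a continuation cannot run together.

```tcl
# Hypothetical helper (not part of the original answer): returns the pin
# names found inside { } in every set_false_path command of $text.
proc extract_false_path_pins {text} {
    set pins {}
    set buffer ""
    foreach line [split $text "\n"] {
        # A trailing backslash (optionally followed by blanks) marks a continuation
        set continued [regexp -- {\\[[:blank:]]*$} $line]
        # Drop the continuation backslash and join pieces with a space
        append buffer [regsub -- {\\[[:blank:]]*$} $line ""] " "
        if {$continued} { continue }

        # The command is complete: keep it only if it is set_false_path
        if {[regexp -- {^\s*set_false_path\M} $buffer]} {
            foreach {whole inner} [regexp -all -inline -- {\{([^{}]+)\}} $buffer] {
                # \S+ skips runs of whitespace, so no empty names are produced
                lappend pins {*}[regexp -all -inline -- {\S+} $inner]
            }
        }
        set buffer ""
    }
    return $pins
}

# Sample text built line by line; "\\" is a literal continuation backslash.
set text [join [list \
    "set_false_path -from \[get_ports \{AAA/BBB\[1\] \\" \
    "CCC/DDD\}\] -through\\" \
    "\[get_pins \{XXX_1\[1\]\}\]" \
    "set_timing_derate -cell \[get_cells \{NONO\[1\]\}\]" \
    "set_false_path -from \[get_ports \{AAA/DDD\[2\]\}\]"] "\n"]

# Print one pin per line
puts [join [extract_false_path_pins $text] "\n"]
```

To use it with files, read the input with `read` and write `[join $pins "\n"]` to the output channel, as in the script above.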