How do I replace only the first n matches with a Tcl regexp?

Time: 2015-05-26 06:21:18

Tags: regex tcl

I need to replace the first 50 occurrences of abc with bcd. I tried the following, but it does not work.

set a "1 abc 2 abc 3 abc 4 abc......... 100 abc"
regsub -all "(.*?(abc).*)(50)" $a "bcd \1" b
puts $b

The numbers in the string are for demonstration only. The string can be arbitrary:

set a "hh abc cc abc hh abc cc abc dd abc hh abc......... hh abc"

2 Answers:

Answer 0: (score: 2)

You can do this with a custom procedure that performs the substitution through a function call:

set a "1 abc 2 abc 3 abc 4 abc......... 100 abc"

proc rangeSub {a first last string sub} {
  # This variable keeps the count of matches
  set count 0
  proc re_sub {str first last rep} {
    upvar count count
    incr count
    # If match number within wanted range, replace with rep, else return original string
    if {$count >= $first && $count <= $last} {
      return $rep
    } else {
      return $str
    }
  }

  set cmd {[re_sub "\0" $first $last $sub]}
  set b [subst [regsub -all "\\y$string\\y" $a $cmd]]

  return $b
}

# Here replacing the 1st to 3rd occurrences of abc
puts [rangeSub $a 1 3 "abc" "bcd"]
# => 1 bcd 2 bcd 3 bcd 4 abc......... 100 abc
puts [rangeSub $a 2 3 "abc" "bcd"]
# => 1 abc 2 bcd 3 bcd 4 abc......... 100 abc

Change the call to rangeSub $a 1 50 "abc" "bcd" to replace the first 50 occurrences.
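For comparison, here is a minimal sketch of the same idea that avoids subst, which may matter if the input can contain brackets or dollar signs. The helper name replaceFirstN is hypothetical and not part of the answer above. It assumes the pattern has no capturing groups and treats the replacement as a literal string. It needs Tcl 8.5 or later for lassign.

```tcl
# Hypothetical helper (not from the answer above): replace only the first n
# matches of pattern in text with a literal replacement string.
# Assumes pattern has no capturing groups, so that
# [regexp -all -inline -indices] yields exactly one {start end} pair per match.
proc replaceFirstN {text pattern replacement n} {
    # Keep only the index pairs of the first n matches
    set pairs [lrange [regexp -all -inline -indices -- $pattern $text] 0 [expr {$n - 1}]]
    set result ""
    set pos 0
    foreach pair $pairs {
        lassign $pair s e
        # Copy the text before this match, then the replacement
        append result [string range $text $pos [expr {$s - 1}]] $replacement
        set pos [expr {$e + 1}]
    }
    # Copy whatever follows the last replaced match
    append result [string range $text $pos end]
    return $result
}

# Build a demonstration string with 100 occurrences of abc.
set demo ""
for {set i 1} {$i <= 100} {incr i} {
    append demo "$i abc "
}

set out [replaceFirstN $demo {\yabc\y} bcd 50]
puts [regexp -all {\ybcd\y} $out]   ;# number of replaced occurrences
puts [regexp -all {\yabc\y} $out]   ;# number of untouched occurrences
```

With the 100-occurrence demo string, both counts should come out as 50.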


An alternative using indices and string range:

set a "1 abc 2 abc 3 abc 4 abc......... 100 abc"

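proc rangeSub {a first last string sub} {
  # Index pairs {start end} of every whole-word match of $string
  set idx [regexp -all -inline -indices -- "\\y$string\\y" $a]
  # Start of the first wanted match and end of the last wanted match
  set start [lindex $idx $first-1 0]
  set end [lindex $idx $last-1 1]
  # Replace every match inside that slice, then stitch the string back together
  regsub -all -- "\\y$string\\y" [string range $a $start $end] $sub result
  return [string range $a 0 $start-1]$result[string range $a $end+1 end]
}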

puts [rangeSub $a 1 3 abc bcd]

Answer 1: (score: 2)

I think this can be done by combining regexp and regsub.

%
% set count 0
0
% # Don't bother about this 'for' loop. It is just for input generation
% for { set i 65} {$i < 123} {incr i} {
        if {$count == 101} break
        if { $i >= 90 && $i <=96} {
                continue
        }
        for { set j 65 } {$j < 123} {incr j} {
                if {$count == 101} break
                if { $j >= 90 && $j <=96} {
                        continue
                }
                incr count
                append input "[format %c%c $i $j] abc "


        }
}
%
% # The 'input' value below is what gets processed
% # So concentrate only on what follows from here :D
% set input
AA abc AB abc AC abc AD abc AE abc AF abc AG abc AH abc AI abc AJ abc AK abc AL abc AM abc AN abc AO abc AP abc AQ abc AR abc AS abc AT abc AU abc AV abc AW abc AX abc AY abc Aa abc Ab abc Ac abc Ad abc Ae abc Af abc Ag abc Ah abc Ai abc Aj abc Ak abc Al abc Am abc An abc Ao abc Ap abc Aq abc Ar abc As abc At abc Au abc Av abc Aw abc Ax abc Ay abc Az abc BA abc BB abc BC abc BD abc BE abc BF abc BG abc BH abc BI abc BJ abc BK abc BL abc BM abc BN abc BO abc BP abc BQ abc BR abc BS abc BT abc BU abc BV abc BW abc BX abc BY abc Ba abc Bb abc Bc abc Bd abc Be abc Bf abc Bg abc Bh abc Bi abc Bj abc Bk abc Bl abc Bm abc Bn abc Bo abc Bp abc Bq abc Br abc Bs abc Bt abc Bu abc Bv abc Bw abc Bx abc By abc
%
% regexp "(.*?abc.*?){50}" $input match; # First, match up to the 50th occurrence
1
% regsub -all "((.*?)abc.*?)" $match "\\2bcd" replaceText; #Replacing the 'abc' with 'bcd'
50
% set replaceText
% regsub $match $input $replaceText output; #At last, replace this content from the main input
1
% 
% set output
AA bcd AB bcd AC bcd AD bcd AE bcd AF bcd AG bcd AH bcd AI bcd AJ bcd AK bcd AL bcd AM bcd AN bcd AO bcd AP bcd AQ bcd AR bcd AS bcd AT bcd AU bcd AV bcd AW bcd AX bcd AY bcd Aa bcd Ab bcd Ac bcd Ad bcd Ae bcd Af bcd Ag bcd Ah bcd Ai bcd Aj bcd Ak bcd Al bcd Am bcd An bcd Ao bcd Ap bcd Aq bcd Ar bcd As bcd At bcd Au bcd Av bcd Aw bcd Ax bcd Ay bcd Az abc BA abc BB abc BC abc BD abc BE abc BF abc BG abc BH abc BI abc BJ abc BK abc BL abc BM abc BN abc BO abc BP abc BQ abc BR abc BS abc BT abc BU abc BV abc BW abc BX abc BY abc Ba abc Bb abc Bc abc Bd abc Be abc Bf abc Bg abc Bh abc Bi abc Bj abc Bk abc Bl abc Bm abc Bn abc Bo abc Bp abc Bq abc Br abc Bs abc Bt abc Bu abc Bv abc Bw abc Bx abc By abc
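One caveat about the last step: it passes $match to regsub as a regular expression, which works here because the input contains no regex metacharacters. A minimal sketch that splices by length instead is shown below. The procedure name replaceLeadingAbc is hypothetical, abc and bcd are hard-coded as in the session above, and returning the input unchanged when there are fewer than n occurrences is an assumption of this sketch.

```tcl
# Hypothetical helper: replace the first n occurrences of abc with bcd
# without using the matched text as a regular expression.
proc replaceLeadingAbc {input n} {
    # Shortest prefix of input that contains n occurrences of abc
    if {![regexp -- "^(?:.*?abc){$n}" $input prefix]} {
        # Fewer than n occurrences: leave the input unchanged (an assumption)
        return $input
    }
    # Replace every abc inside that prefix only
    regsub -all -- {abc} $prefix {bcd} replaced
    # Re-attach the untouched remainder, located by length rather than by pattern
    return $replaced[string range $input [string length $prefix] end]
}

puts [replaceLeadingAbc "x abc y abc z abc" 2]
# => x bcd y bcd z abc
```

Because the remainder is located with string length, characters such as ( or + in the input should not affect the splice.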

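Note: I noticed that you used \1 to refer to the first capture group, but using it inside double quotes is wrong. Inside braces it would be fine, but inside double quotes the backslash itself has to be escaped, as in \\1.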
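A small sketch of the quoting difference, using a made-up input string and made-up variable names. Inside double quotes Tcl performs backslash substitution first, so \2 and \1 become control characters before regsub ever sees them.

```tcl
set s "foo bar"

# Braces: regsub receives the back-references \2 and \1 untouched
regsub {(\w+) (\w+)} $s {\2 \1} braced

# Double quotes with escaped backslashes: regsub again receives \2 and \1
regsub {(\w+) (\w+)} $s "\\2 \\1" escaped

# Double quotes without escaping: Tcl turns \2 and \1 into the characters
# U+0002 and U+0001, so regsub inserts those instead of the captured groups
regsub {(\w+) (\w+)} $s "\2 \1" unescaped

puts $braced    ;# bar foo
puts $escaped   ;# bar foo
puts [string equal $unescaped "\u0002 \u0001"]   ;# 1
```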