Python regex - find a string in a file somewhere before another string?

Time: 2011-08-19 10:32:15

Tags: python regex

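My programming knowledge is very limited, and I would really appreciate any help with this possibly obvious question!

Suppose I have a text file that somewhere contains the text: "I own two (Some text in between...) bicycles."

How can I change "two" to "three"? That is, I need a function that finds the string "bicycle", then looks to the left until it finds the string "two" and changes it.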

2 answers:

Answer 0 (score: 1)

You can do this with a regular expression:

>>> import re
>>> s = 'I own two (Some text in between...) bicycles and two dogs.'
>>> re.sub('two(.*bicycles)', 'three\\1', s)
'I own three (Some text in between...) bicycles and two dogs.'
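One thing to keep in mind about the pattern above: when several "two" appear before "bicycles", it replaces the first one, not the one closest to "bicycles". The sketch below contrasts the two behaviours on a made-up sentence (the sample string and the variable names are assumptions for illustration only). Putting a greedy group in front of "two" is one possible way to push the match to the last "two" before "bicycles".

```python
import re

# Hypothetical sample with two occurrences of "two" before "bicycles".
s = 'I own two cars and two (Some text in between...) bicycles.'

# Pattern from the answer: matches the FIRST "two" that has "bicycles" after it.
first = re.sub(r'two(.*bicycles)', r'three\1', s, count=1)

# A leading greedy group consumes as much as possible, so the "two" that
# gets matched is the LAST one that still has "bicycles" after it.
nearest = re.sub(r'(.*)two(.*bicycles)', r'\1three\2', s, count=1, flags=re.DOTALL)

print(first)    # I own three cars and two (Some text in between...) bicycles.
print(nearest)  # I own two cars and three (Some text in between...) bicycles.
```

The `re.DOTALL` flag is only there so that `.` also crosses line breaks, which matters once the text comes from a multi-line file.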

With regular string functions:

>>> try:
...   p = s.rindex('two', 0, s.index('bicycles'))
...   s[:p] + 'three' + s[p+len('two'):]
... except ValueError:
...   pass # No bicycles or no two
...
'I own three (Some text in between...) bicycles and two dogs.'
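Since the question is about a text file, here is a minimal sketch of how the string-function approach could be applied to a file's contents. The helper name `replace_nearest_before` and the temporary file are assumptions made for the demo; for a real file you would use its actual path, and this version reads the whole file into memory.

```python
import os
import tempfile


def replace_nearest_before(text, old, new, anchor):
    """Replace the last `old` that occurs before the first `anchor`."""
    try:
        anchor_pos = text.index(anchor)
        old_pos = text.rindex(old, 0, anchor_pos)
    except ValueError:
        return text  # anchor or old word not found: leave the text unchanged
    return text[:old_pos] + new + text[old_pos + len(old):]


# Create a throwaway file so the example is self-contained (hypothetical content).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('I own two (Some text in between...) bicycles and two dogs.\n')
    path = tmp.name

# Read the whole file, patch the text, and write it back.
with open(path, encoding='utf-8') as f:
    content = f.read()

with open(path, 'w', encoding='utf-8') as f:
    f.write(replace_nearest_before(content, 'two', 'three', 'bicycles'))

with open(path, encoding='utf-8') as f:
    result = f.read()

os.remove(path)
print(result)
```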

Answer 1 (score: 0)

Using regular expressions:

import re

line = '-------------------------------------------------------------\n'

ss = ('I gave two similar things to my two twin sons: '
      'two spankings, two nice BICYCLES of 300 dollars each, '
      'yes 600 dollars for two horridly nice BICYCLES, '
      'two times 300 dollars for two sons receiving two BICYCLES !, '
      'two dollars too, but never two dogs')
print(ss, '\n\n')


# 1) Only the "two" closest to (and before) the first "BICYCLES".
#    The trailing (.+) consumes the rest so that a single replacement is made.
print(line + '1) Replacing the more at right before the first "BICYCLES":\n')
reg = re.compile(r'two(?=(?:.(?!two))*?BICYCLES)(.+)')
print(reg.sub(r'@@@@\1', ss))


# 2) Only the "two" closest to (and before) the last "BICYCLES".
print(line + '2) Replacing the more at right before the last "BICYCLES":\n')
reg = re.compile(r'two(?=(?:.(?!two))*?BICYCLES(?!.*?BICYCLES))')
print(reg.sub('@@@@', ss))


# 3) Every "two" before the first "BICYCLES"; the second alternative matches
#    everything from "BICYCLES" onward and is put back unchanged.
print(line + '3) Replacing all before the first "BICYCLES":\n')
reg = re.compile(r'(two)|BICYCLES.+')
print(reg.sub(lambda mat: '@@@@' if mat.group(1) else mat.group(), ss))


# 4) Every "two" before the last "BICYCLES".
print(line + '4) Replacing all before the last "BICYCLES":\n')
reg = re.compile(r'(two)|BICYCLES(?!.*?BICYCLES).+')
print(reg.sub(lambda mat: '@@@@' if mat.group(1) else mat.group(), ss))

Result:

I gave two similar things to my two twin sons: two spankings, two nice BICYCLES of 300 dollars each, yes 600 dollars for two horridly nice BICYCLES, two times 300 dollars for two sons receiving two BICYCLES !, two dollars too, but never two dogs 


-------------------------------------------------------------
1) Replacing the more at right before the first "BICYCLES":

I gave two similar things to my two twin sons: two spankings, @@@@ nice BICYCLES of 300 dollars each, yes 600 dollars for two horridly nice BICYCLES, two times 300 dollars for two sons receiving two BICYCLES !, two dollars too, but never two dogs
-------------------------------------------------------------
2) Replacing the more at right before the last "BICYCLES":

I gave two similar things to my two twin sons: two spankings, two nice BICYCLES of 300 dollars each, yes 600 dollars for two horridly nice BICYCLES, two times 300 dollars for two sons receiving @@@@ BICYCLES !, two dollars too, but never two dogs
-------------------------------------------------------------
3) Replacing all before the first "BICYCLES":

I gave @@@@ similar things to my @@@@ twin sons: @@@@ spankings, @@@@ nice BICYCLES of 300 dollars each, yes 600 dollars for two horridly nice BICYCLES, two times 300 dollars for two sons receiving two BICYCLES !, two dollars too, but never two dogs
-------------------------------------------------------------
4) Replacing all before the last "BICYCLES":

I gave @@@@ similar things to my @@@@ twin sons: @@@@ spankings, @@@@ nice BICYCLES of 300 dollars each, yes 600 dollars for @@@@ horridly nice BICYCLES, @@@@ times 300 dollars for @@@@ sons receiving @@@@ BICYCLES !, two dollars too, but never two dogs
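The lookahead in pattern 1 is dense, so here is the same pattern spelled out with `re.VERBOSE` and comments. It is only an annotated sketch of the expression used above; the short `sample` string is made up for the demo.

```python
import re

# Compact form, as used in case 1 above.
compact = re.compile(r'two(?=(?:.(?!two))*?BICYCLES)(.+)')

# Same pattern, annotated. In VERBOSE mode whitespace and # comments are ignored.
verbose = re.compile(r'''
    two                  # candidate word to replace
    (?=                  # lookahead: checks what follows without consuming it
        (?: . (?!two) )*?    # characters that are not directly followed by another "two"
        BICYCLES         # until the anchor word is reached
    )
    (.+)                 # swallow the rest of the line so only one replacement is made
''', re.VERBOSE)

sample = 'two apples, two BICYCLES, two more BICYCLES'
out_compact = compact.sub(r'@@@@\1', sample)
out_verbose = verbose.sub(r'@@@@\1', sample)

print(out_verbose)  # two apples, @@@@ BICYCLES, two more BICYCLES
```

The first "two" is rejected because reaching "BICYCLES" from it would mean stepping over another "two"; the second one qualifies, and the captured remainder is written back unchanged.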

It can also be done without regular expressions:

line = '-------------------------------------------------------------\n'

ss = ('I gave two similar things to my two twin sons: '
      'two spankings, two nice BICYCLES of 300 dollars each, '
      'yes 600 dollars for two horridly nice BICYCLES, '
      'two times 300 dollars for two sons receiving two BICYCLES !, '
      'two dollars too, but never two dogs')
print(ss, '\n\n')


# find()/rfind() return -1 when "BICYCLES" is absent; the text is then left as is.
print(line + '1) Replacing the more at right before the first "BICYCLES":\n')
fb = ss.find('BICYCLES')
print('@@@@'.join(ss[0:fb].rsplit('two', 1)) + ss[fb:] if fb != -1 else ss)


print(line + '2) Replacing the more at right before the last "BICYCLES":\n')
fb = ss.rfind('BICYCLES')
print('@@@@'.join(ss[0:fb].rsplit('two', 1)) + ss[fb:] if fb != -1 else ss)


print(line + '3) Replacing all before the first "BICYCLES":\n')
fb = ss.find('BICYCLES')
print(ss[0:fb].replace('two', '@@@@') + ss[fb:] if fb != -1 else ss)


print(line + '4) Replacing all before the last "BICYCLES":\n')
fb = ss.rfind('BICYCLES')
print(ss[0:fb].replace('two', '@@@@') + ss[fb:] if fb != -1 else ss)

The results are the same.
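The four string-method cases differ only in which "BICYCLES" is used as the anchor and in whether one or all occurrences get replaced, so they could be folded into a single helper. This is a sketch under that reading; the function name `replace_before` and its keyword parameters are hypothetical, not something from the answer.

```python
def replace_before(text, old, new, anchor, last_anchor=False, all_occurrences=False):
    """Replace `old` with `new` in the part of `text` that precedes `anchor`.

    last_anchor:     use the last occurrence of `anchor` instead of the first.
    all_occurrences: replace every `old` before the anchor instead of only
                     the one closest to it.
    """
    pos = text.rfind(anchor) if last_anchor else text.find(anchor)
    if pos == -1:
        return text  # no anchor: nothing to do
    head, tail = text[:pos], text[pos:]
    if all_occurrences:
        head = head.replace(old, new)
    else:
        # rsplit with maxsplit=1 cuts at the right-most `old` only.
        head = new.join(head.rsplit(old, 1))
    return head + tail


sample = 'two a two BICYCLES two b BICYCLES two'
case1 = replace_before(sample, 'two', '@@@@', 'BICYCLES')
case2 = replace_before(sample, 'two', '@@@@', 'BICYCLES', last_anchor=True)
case3 = replace_before(sample, 'two', '@@@@', 'BICYCLES', all_occurrences=True)
case4 = replace_before(sample, 'two', '@@@@', 'BICYCLES',
                       last_anchor=True, all_occurrences=True)

for result in (case1, case2, case3, case4):
    print(result)
```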

But regular expressions offer more possibilities, for example matching "two" only as a whole word:

import re

ss = ('Mr Dotwo bought two gifts for his two sons, two hours ago: two BICYCLES '
      'because his two sons wanted only two BICYCLES')
print(ss, '\n\n')


# \b marks a word boundary, so the "two" inside "Dotwo" is not matched.
print('Replacing all "two" before the first "BICYCLES":\n')
reg = re.compile(r'(\btwo\b)|BICYCLES.+')
print(reg.sub(lambda mat: '@@@@' if mat.group(1) else mat.group(), ss))

Result:

Mr Dotwo bought two gifts for his two sons, two hours ago: two BICYCLES because his two sons wanted only two BICYCLES 


Replacing all "two" before the first "BICYCLES":

Mr Dotwo bought @@@@ gifts for his @@@@ sons, @@@@ hours ago: @@@@ BICYCLES because his two sons wanted only two BICYCLES
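Two further options that might matter when the text comes from a real file are case-insensitive matching and line breaks. By default `.` does not match a newline, so `BICYCLES.+` would stop protecting the text at the end of the line (or fail outright if "BICYCLES" ends the line). The sketch below, on a made-up multi-line string, shows the difference `re.DOTALL` makes; `re.IGNORECASE` is added so that "Two" and "bicycles" are recognised too.

```python
import re

# Hypothetical multi-line text, as it might come from f.read().
text = 'Mr Dotwo bought Two gifts:\ntwo bicycles\nfor his two sons'


def keep_or_mask(mat):
    # Group 1 is set only when the whole word "two" matched.
    return '@@@@' if mat.group(1) else mat.group()


with_dotall = re.compile(r'(\btwo\b)|bicycles.+', re.IGNORECASE | re.DOTALL)
without_dotall = re.compile(r'(\btwo\b)|bicycles.+', re.IGNORECASE)

masked = with_dotall.sub(keep_or_mask, text)
leaky = without_dotall.sub(keep_or_mask, text)

print(repr(masked))  # the "two" after "bicycles" is left alone
print(repr(leaky))   # without DOTALL the last "two" is replaced as well
```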