re.sub does not substitute back into the whole string

Asked: 2017-12-28 06:15:23

Tags: regex python-3.x

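I tried to solve this on my own, but I'm getting frustrated, so I'm reaching out to StackXers. I'm a beginner Python developer learning regular expressions through the Automate the Boring Stuff Udemy course.

As for my problem, I'm trying to use a regular expression to produce this target string: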

target_string = ('12 drummers drumming, 11 pipers piping, 10 lords a leaping, '
                 '9 ladies dancing, 8 maids a milking, 7 swans a swimming, 6 geese a laying, 5 '
                 'golden rings, 4 calling birds, 3 french hens, 2 turtle doves, and a '
                 'partridge in a pear tree')

The original string (copied from metrolyrics) is

original_string = '''12 Drummers Drumming 11 Pipers Piping 10 Lords a 
Leaping 9 Ladies Dancing 8 Maids a Milking 7 Swans a Swimming 6 Geese a 
Laying 5 Golden Rings 4 Calling Birds 3 French Hens 2 Turtle Doves and a 
Partridge in a Pear Tree'''

My code is as follows

import re
strings = '''12 Drummers Drumming 11 Pipers Piping 10 Lords a Leaping 9 
Ladies Dancing 8 Maids a Milking 7 Swans a Swimming 6 Geese a Laying 5 
Golden Rings 4 Calling Birds 3 French Hens 2 Turtle Doves and a Partridge in 
a Pear Tree'''
lyrics = strings.split()
xmasRegex = re.compile(r'\d+\s\D+\s([a-zA-Z]+)')
xmasRegex.sub(r'\1,', strings)

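This only returns the rhyming words, each followed by a comma (unintentionally including "Tree" and excluding "Doves"). What I want is to replace those words (including "Doves") and put them back into the string, as shown in the target string. This could be done with a for loop and some tinkering, but I'd like to do it the regex way.

What am I doing wrong with the sub method and/or the regex object?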
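For reference, a small sketch of what the pattern in the question captures. It assumes the call was made through the compiled object (`sub` on the pattern) and uses hypothetical snake_case names; the lyrics are joined onto one line for readability. `sub` replaces each whole match, so anything matched outside group 1 is dropped.

```python
import re

# Lyrics from the question, joined onto one line.
strings = ("12 Drummers Drumming 11 Pipers Piping 10 Lords a Leaping "
           "9 Ladies Dancing 8 Maids a Milking 7 Swans a Swimming "
           "6 Geese a Laying 5 Golden Rings 4 Calling Birds 3 French Hens "
           "2 Turtle Doves and a Partridge in a Pear Tree")

xmas_regex = re.compile(r'\d+\s\D+\s([a-zA-Z]+)')

# findall returns only what group 1 captured in each match.
captured = xmas_regex.findall(strings)

# sub replaces each *whole* match (number, earlier words, and last word)
# with the replacement, so only group 1 plus a comma survives.
replaced = xmas_regex.sub(r'\1,', strings)

print(captured)
print(replaced)
```

In the last match, `\D+` is greedy and runs from "Turtle" to "Pear" because there are no more digits to stop it, which is why "Tree" is captured and "Doves" is not.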

2 Answers:

Answer 0: (score: 2)

This reproduces the entire target string in a single pass, including the comma before "and".

In [34]: target_string
Out[34]: '12 drummers drumming, 11 pipers piping, 10 lords a leaping, 9 ladies dancing, 8 maids a milking, 7 swans a swimming, 6 geese a laying, 5 golden rings, 4 calling birds, 3 french hens, 2 turtle doves, and a partridge in a pear tree'

In [35]: original_strings
Out[35]: '12 Drummers Drumming 11 Pipers Piping 10 Lords a Leaping 9 Ladies Dancing 8 Maids a Milking 7 Swans a Swimming 6 Geese a Laying 5 Golden Rings 4 Calling Birds 3 French Hens 2 Turtle Doves and a Partridge in a Pear Tree'

In [36]: replaced_strings = re.sub(r'(\s\d+|\sand)', r',\1', original_strings).lower()

In [37]: target_string == replaced_strings
Out[37]: True
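The session above can be condensed into a standalone script. This is only a sketch: it assumes the line breaks in the pasted lyrics are just wrapping, so it collapses all whitespace runs into single spaces before substituting. The names `one_line` and `result` are illustrative.

```python
import re

# Lyrics as pasted in the question, including the hard line breaks.
original_string = '''12 Drummers Drumming 11 Pipers Piping 10 Lords a
Leaping 9 Ladies Dancing 8 Maids a Milking 7 Swans a Swimming 6 Geese a
Laying 5 Golden Rings 4 Calling Birds 3 French Hens 2 Turtle Doves and a
Partridge in a Pear Tree'''

# Assumption: the line breaks are only wrapping, so collapse whitespace.
one_line = ' '.join(original_string.split())

# Insert a comma before every " <number>" and before " and", then lowercase.
result = re.sub(r'(\s\d+|\sand)', r',\1', one_line).lower()

print(result)
```

The alternative `\sand` would also match the start of any longer word beginning with "and"; that does not occur in these lyrics, but `\sand\b` would be a tighter choice for other input.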

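Answer 1: (score: 1)

You can do it in 2 passes:

1) Use this regex to find the numbers preceded by either whitespace or nothing (start of string): r'((\s|^)\d+)', and replace each match with a comma followed by a backreference to the first group: ',\1'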

Tested at https://regex101.com/r/7yxSaj/1/

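2) Use this regex to find the capital letter at the start of each word and convert it to lowercase: r'\b([A-Z])', with the replacement string '\L\1'. Note that \L is a regex101-style substitution escape; Python's re module does not support it in replacement strings.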

Tested at https://regex101.com/r/dNuYhG/1/
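The two passes can be sketched in Python roughly as follows. The first pass uses the pattern from step 1 unchanged; the second passes a function as the replacement, which is one way to lowercase a captured group with `re.sub`. The shortened `lyrics` sample and the variable names are illustrative.

```python
import re

# Shortened sample of the lyrics, on a single line.
lyrics = ("12 Drummers Drumming 11 Pipers Piping "
          "2 Turtle Doves and a Partridge in a Pear Tree")

# Pass 1: pattern from step 1, unchanged. Group 1 is the number together
# with the whitespace (or the empty start-of-string match) in front of it.
with_commas = re.sub(r'((\s|^)\d+)', r',\1', lyrics)

# Pass 2: a replacement function lowercases the captured capital letter,
# standing in for the '\L\1' replacement string from step 2.
lowered = re.sub(r'\b([A-Z])', lambda m: m.group(1).lower(), with_commas)

print(lowered)
```

With the pattern as written, the `^` alternative also places a comma in front of the leading "12", and no comma is added before "and". Dropping `|^` from the first pattern avoids the leading comma; the "and" case would need its own alternative.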