Extracting substrings in Python

Time: 2015-06-18 11:23:31

Tags: python regex string-formatting

I want to parse a string and extract all the substrings enclosed in curly braces:

'The value of x is {x}, and the list is {y} of len {}'

should produce:

(x, y)

Then I want to format the string so that the original string is printed with the values filled in:

str.format('The value of x is {x}, and the list is {y} of len {}', x, y, len(y))

How can I do this?

Example usage:
def somefunc():
    x = 123
    y = ['a', 'b']
    MyFormat('The value of x is {x}, and the list is {y} of len {}',len(y))

output:
    The value of x is 123, and the list is ['a', 'b'] of len 2

3 answers:

Answer 0 (score: 6)

You can use string.Formatter.parse. From the documentation:

  

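Loop over the format_string and return an iterable of tuples (literal_text, field_name, format_spec, conversion). This is used by vformat() to break the string into either literal text or replacement fields.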

     

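The values in the tuple conceptually represent a span of literal text followed by a single replacement field. If there is no literal text (which can happen if two replacement fields occur consecutively), then literal_text will be a zero-length string. If there is no replacement field, then the values of field_name, format_spec and conversion will be None.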
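The tuple structure described above can be inspected directly. The following is a minimal sketch that prints every tuple for the string from the question; the variable name `parsed` is only illustrative. Note that the empty `{}` field comes back with an empty-string field_name, which is why a truthiness check on the field name drops it.

```python
from string import Formatter

s = 'The value of x is {x}, and the list is {y} of len {}'

# Each item is (literal_text, field_name, format_spec, conversion).
parsed = list(Formatter().parse(s))

for literal_text, field_name, format_spec, conversion in parsed:
    print(repr(literal_text), repr(field_name), repr(format_spec), repr(conversion))

# Prints:
# 'The value of x is ' 'x' '' None
# ', and the list is ' 'y' '' None
# ' of len ' '' '' None
```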

from string import Formatter

s = 'The value of x is {x}, and the list is {y} of len {}'

print([t[1] for t in Formatter().parse(s) if t[1]])
['x', 'y']

I'm not sure how this helps with what you are trying to do, though, since inside your function you can simply pass x and y to str.format, or use **locals():

def somefunc():
    x = 123
    y = ['a', 'b']
    print('The value of x is {x}, and the list is {y} of len {}'.format(len(y),**locals()))

If you also want to print the named args, you can add the Formatter output (s is the string defined above):

def somefunc():
    x = 123
    y = ['a', 'b']
    print("The named args are {}".format( [t[1] for t in Formatter().parse(s) if t[1]]))
    print('The value of x is {x}, and the list is {y} of len {}'.format(len(y), **locals()))

Which will output:

The named args are ['x', 'y']
The value of x is 123, and the list is ['a', 'b'] of len 2
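For the MyFormat call shown in the question, which does not receive x and y explicitly, one possible approach is to look the names up in the caller's frame. This is only a sketch under several assumptions: `my_format` is a hypothetical helper name, the field names are plain identifiers (no `{y[0]}` or `{x.attr}`), and the interpreter supports `inspect.currentframe()` (CPython does; other implementations may return None).

```python
import inspect
from string import Formatter


def my_format(template, *args):
    """Hypothetical helper: fill named fields from the caller's locals."""
    # f_back is the frame of the function that called my_format.
    caller_locals = inspect.currentframe().f_back.f_locals
    # Collect the named fields; the empty '{}' field has field_name ''
    # and is skipped, so it is filled from the positional args instead.
    names = [name for _, name, _, _ in Formatter().parse(template) if name]
    # Assumes every name is a plain identifier present in the caller.
    values = {name: caller_locals[name] for name in names}
    return template.format(*args, **values)


def somefunc():
    x = 123
    y = ['a', 'b']
    return my_format('The value of x is {x}, and the list is {y} of len {}', len(y))


result = somefunc()
print(result)  # The value of x is 123, and the list is ['a', 'b'] of len 2
```

Reaching into the caller's frame is implicit and can be surprising, so passing the values explicitly (or using **locals() at the call site, as above) is usually easier to follow.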

Answer 1 (score: 0)

You can use re.findall:

>>> import re
>>> s = 'The value of x is {x}, and the list is {y} of len {}'
>>> re.findall(r'\{([^{}]+)\}', s)
['x', 'y']
>>> tuple(re.findall(r'\{([^{}]+)\}', s))
('x', 'y')
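One caveat worth checking before relying on the regex: it does not know the format-string grammar, so escaped braces and format specs are treated as part of a name. The sketch below compares it with string.Formatter.parse on a made-up test string; the names `regex_names` and `parsed_names` are illustrative only.

```python
import re
from string import Formatter

# '{{literal}}' is an escaped pair of braces, '{x:>5}' carries a format spec.
s = 'Set {{literal}} and {x:>5}'

# The regex sees anything between a '{' and the next '}'.
regex_names = re.findall(r'\{([^{}]+)\}', s)

# Formatter.parse follows the str.format grammar.
parsed_names = [name for _, name, _, _ in Formatter().parse(s) if name]

print(regex_names)   # ['literal', 'x:>5']
print(parsed_names)  # ['x']
```

For simple templates such as the one in the question, both approaches give the same result.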

Answer 2 (score: 0)

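What do you plan to do with the values after extracting them? To extract them, you can use a non-greedy regex: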

import re
st = "The value of x is {x}, and the list is {y} of len {}"
exp = re.compile(r"\{(.+?)\}")

print(tuple(exp.findall(st)))

Output:

 ('x', 'y')