Python - Matching words in a string against a list of strings

Time: 2015-05-06 07:09:25

Tags: python string list

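I'm new to Python and would like to know how to do the following string comparison.

Suppose I have a list of strings containing state names, such as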

states = ["New York", "California", "Nebraska", "Idaho"]

I also have another string containing an address, such as

postal_addr = "1234 1st E St San Jose California 95112"

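How do I parse this address string and find a match against the items in the states list? In the example above, California would be a match. Then, once it has matched, how do I extract "California" and store it as a separate string?

6 Answers:

Answer 0: (score: 1)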

>>> states = ["New York", "California", "Nebraska", "Idaho"]
>>> postal_addr = "1234 1st E St San Jose California 95112"
>>> first_match = next(state for state in states if state in postal_addr)
>>> first_match
'California'

However, if you need to match on word boundaries, a regular expression is the better choice.

Answer 1: (score: 1)

I would do:
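A minimal sketch of what the word-boundary version could look like. The variable names follow the snippet above; wrapping each name in `re.escape` and joining them with `|` is my own assumption about how to build the pattern, not something taken from the answer.

```python
import re

states = ["New York", "California", "Nebraska", "Idaho"]
postal_addr = "1234 1st E St San Jose California 95112"

# re.escape guards against regex metacharacters in a name;
# \b anchors the alternation at word boundaries on both sides.
pattern = re.compile(r"\b(?:" + "|".join(re.escape(s) for s in states) + r")\b")

match = pattern.search(postal_addr)
# Unlike next(...) without a default, this yields None when nothing matches.
first_match = match.group(0) if match else None
print(first_match)
```

With this pattern, a word such as "Idahoba" is no longer reported as a match for "Idaho".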

matches = [ s for s in states if s in postal_addr ]

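Then, if you want to get the string out of the postal address:

import re
if matches:
    # re.escape keeps any regex metacharacters in the name literal
    extracted = re.findall(re.escape(matches[0]), postal_addr)[0]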

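Edit: ...but this does not work for city/state combinations where the city name contains a different state, for example

postal_addr = '1 Arrowhead Dr, Kansas City, Missouri 64129'
states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]

and so on. In that case: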

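import re
if matches:
    # Pair each match with the position where it first occurs in the address
    extracted = [(re.search(re.escape(m), postal_addr).start(), m) for m in matches]
    # Keep the match that starts furthest to the right (the state follows the city)
    extracted = max(extracted)[1]

Answer 2: (score: 0)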

states = ["New York", "California", "Nebraska", "Idaho"]
postal_addr = "1234 1st E St San Jose California 95112"

result = None
for state in states:
    if state in postal_addr:
        result = state

print(result)

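Unfortunately, this will also match words that merely contain a state name, such as Idahoba.

Answer 3: (score: 0)

Here is another alternative answer, using a regular expression: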
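One way around that without a regex is to compare space-delimited text, sketched below. The helper name `find_state` is hypothetical, and the sketch assumes the words in the address are separated by whitespace only (a state name followed directly by a comma would not be found).

```python
states = ["New York", "California", "Nebraska", "Idaho"]

def find_state(addr, states):
    """Return the first state that appears in addr as whole words, else None."""
    # Collapse runs of whitespace, then pad with spaces so that every
    # word in the address is delimited by a space on both sides.
    padded = " " + " ".join(addr.split()) + " "
    for state in states:
        if " " + state + " " in padded:
            return state
    return None

print(find_state("1234 1st E St San Jose California 95112", states))
print(find_state("77 Idahoba Ln", states))
```

The first call finds "California"; the second returns None because "Idahoba" is not the whole word "Idaho".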

import re

states = ["New York", "California", "Nebraska", "Idaho"]
pattern = re.compile(r'.*(' + r'|'.join(states) + ').*')

postal_addr = "1234 1st E St San Jose California 95112"
match = pattern.match(postal_addr)

if match:
    state = match.group(1)
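A point worth knowing about this pattern: the leading `.*` is greedy, so the group ends up capturing the rightmost state name in the string, whereas a plain `re.search` on the alternation returns the leftmost one. The sketch below illustrates the difference on the Kansas City address from Answer 1; the use of `re.escape` and the variable names are my own additions.

```python
import re

states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
postal_addr = "1 Arrowhead Dr, Kansas City, Missouri 64129"

alternation = "|".join(re.escape(s) for s in states)

# The greedy leading .* consumes as much as possible and then backtracks,
# so the group captures the rightmost state name in the string.
greedy = re.match(r".*(" + alternation + r").*", postal_addr)
rightmost = greedy.group(1) if greedy else None

# search() without the leading .* returns the leftmost state name instead.
plain = re.search(alternation, postal_addr)
leftmost = plain.group(0) if plain else None

print(rightmost, leftmost)
```

For addresses where the state comes after the city, the greedy form happens to give the more useful answer.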

Answer 4: (score: 0)

You can try it like this:

DataEvent

Answer 5: (score: -1)

To find all matches in the string, you can do the following:

matches = [m for m in postal_addr.split() if m in states]
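Note that splitting on whitespace only yields single words, so a multi-word name such as "New York" can never be equal to one of them. A rough sketch of one way to handle that is to test runs of consecutive words against the list; the helper name `find_states` is hypothetical, and the sketch assumes words are separated by whitespace with no attached punctuation.

```python
states = ["New York", "California", "Nebraska", "Idaho"]

def find_states(addr, states):
    """Return state names found in addr as runs of whole words, in order of appearance."""
    wanted = set(states)
    words = addr.split()
    # Longest state name, measured in words, bounds the run length to try.
    max_len = max((len(s.split()) for s in states), default=0)
    found = []
    for i in range(len(words)):
        for n in range(1, max_len + 1):
            if i + n > len(words):
                break  # the run would extend past the end of the address
            candidate = " ".join(words[i:i + n])
            if candidate in wanted:
                found.append(candidate)
    return found

print(find_states("350 5th Ave New York 10118", states))
print(find_states("1234 1st E St San Jose California 95112", states))
```

Because whole words are compared, "Idahoba" is not reported as "Idaho" either.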