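How do I extract the strings that appear before and after certain specific strings? And how do I extract only the 12-digit number that represents the roll number?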
import re

input_file = """my bday is on 04/01/1997 and
frnd bday on 28/12/2018,
account no is A000142116 and
valid for 30 days for me and
for my frnd only 4 DAYS.my roll no is 130302101786
and register number is 1600523941. Admission number is
181212001103"""

# Split into lines first; iterating over the string itself yields single characters.
for line in input_file.splitlines():
    m1 = re.findall(r"[\d]{1,2}/[\d]{1,2}/[\d]{4}", line)  # dates
    m2 = re.findall(r"A(\d+)", line)      # account number (captures digits only, without the "A")
    m3 = re.findall(r"(\d+)days", line)   # allows no space before "days"
    m4 = re.findall(r"(\d+)DAYS", line)   # allows no space before "DAYS"
    # The next three patterns are identical and match every run of digits;
    # this is the part I cannot get right.
    m5 = re.findall(r"(\d+)", line)
    m6 = re.findall(r"(\d+)", line)
    m7 = re.findall(r"(\d+)", line)
    for date_n in m1:
        print(date_n)
    for account_no in m2:
        print(account_no)
    for valid_days in m3:
        print(valid_days)
    for frnd_DAYS in m4:
        print(frnd_DAYS)
    for roll_no in m5:
        print(roll_no)
    for register_no in m6:
        print(register_no)
    for admission_no in m7:
        print(admission_no)
Expected output:
04/01/1997
28/12/2018
A000142116
30 days
4 DAYS
130302101786
1600523941
181212001103
Answer 0 (score: 1)
Answer 1 (score: 0)
I would use a single regex pattern that covers all the possible matches with alternation:
\d{2}/\d{2}/\d{4}|\d+ days|[A-Z0-9]{10,}
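This matches a date, a number followed by "days", or an account number. For the account number, I assume a length of 10 or more characters, consisting only of letters and digits.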
import re

input_file = """my bday is on 04/01/1997 and
frnd bday on 28/12/2018,
account no is A000142116 and
valid for 30 days for me and
for my frnd only 4 DAYS.my roll no is 130302101786
and register number is 1600523941. Admission number is
181212001103"""
results = re.findall(r'\d{2}/\d{2}/\d{4}|\d+ days|[A-Z0-9]{10,}', input_file, flags=re.IGNORECASE)
print(results)
['04/01/1997', '28/12/2018', 'A000142116', '30 days', '4 DAYS', '130302101786',
'1600523941', '181212001103']
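The question also asks how to pick out only the 12-digit numbers, and how to tie a value to the text that comes before it. The following is a minimal sketch of one way to do that, not part of the answer above. It assumes the labels appear exactly as written in the sample text ("roll no is", "Admission number is"); the variable names are chosen here for illustration.

```python
import re

text = """my bday is on 04/01/1997 and
frnd bday on 28/12/2018,
account no is A000142116 and
valid for 30 days for me and
for my frnd only 4 DAYS.my roll no is 130302101786
and register number is 1600523941. Admission number is
181212001103"""

# Exactly 12 digits: the lookarounds reject matches that are part of a
# longer run of digits, so the 10-digit register number is skipped.
twelve_digit = re.findall(r"(?<!\d)\d{12}(?!\d)", text)

# Anchor on the label that precedes the value. \s+ also spans the line
# break between "Admission number is" and its value.
roll_no = re.search(r"roll no is\s+(\d{12})", text)
admission_no = re.search(r"Admission number is\s+(\d{12})", text)

print(twelve_digit)           # ['130302101786', '181212001103']
print(roll_no.group(1))       # 130302101786
print(admission_no.group(1))  # 181212001103
```

Anchoring on the label is stricter than the generic alternation: with re.IGNORECASE, the [A-Z0-9]{10,} branch would also match any ordinary word of ten or more letters, whereas these patterns only accept digits in the expected position.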