Regular expressions and logical OR

Time: 2016-01-23 10:18:51

Tags: python regex

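How can I do an 'or' in a regular expression? I read that I simply need to put the various expressions in parentheses, but the following findall does not work when I try to get whatever comes after either 'Total:' or 'Price for 1 night'.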

p = re.findall(r'(Total: (.*))(Price for 1 night: (.*))',s)

To give more context:

prices1=[]

soup = bs(content, 'lxml')
s=soup.prettify()
p = re.findall(r'(Total: (.*))|(Price for 1 night: (.*))',s)
for x in p:
    if '£' in x:
        num=int(x.replace('£',''))
        prices1.append(num)

Source:

http://www.booking.com/searchresults.en-gb.html?label=gen173nr-17CAEoggJCAlhYSDNiBW5vcmVmaFCIAQGYAS64AQTIAQTYAQHoAQH4AQs&sid=1a43e0952558ac0ad0061d5b6523a7bc&dcid=1&checkin_monthday=23;checkin_year_month=2016-1;checkout_monthday=24;checkout_year_month=2016-1;&city=-2601889&class_interval=1&csflt=%7B%7D&dtdisc=0&group_adults=7&group_children=0&highlighted_hotels=1192837&hlrd=0&hp_sbox=1&hyb_red=0&inac=0&label_click=undef&nflt=ht_id%3D201%3B&nha_red=0&no_rooms=1&redirected_from_city=0&redirected_from_landmark=0&redirected_from_region=0&review_score_group=empty&room1=A%2CA%2CA%2CA%2CA%2CA%2CA&sb_price_type=total&score_min=0&si=ai%2Cco%2Cci%2Cre%2Cdi&ss=London&ss_all=0&ssafas=1&ssb=empty&sshis=0&ssne=London&ssne_untouched=London&order=price_for_two

Sample values:

<strong class="price scarcity_color sr_gs_rackrate_price
 anim_rack_rate  
" title="Price for 1 night £69">
<b>
<span class="sr_gs_rackrate_total">Total: </span>
£69
</b>
</strong>
<td class="totalPrice" colspan="3">
<div data-component="track" data-hash="OLNYSRfCbdWGffSRe" data-stage="1" data-track="view"></div>
Total: £145
</td>

1 Answer:

Answer 0: (score: 1)

First, you should clean the input by removing every HTML tag, replacing each match of this regex with an empty string: </?[^>]*>
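A minimal sketch of that cleanup step in Python, using re.sub with the tag regex above. The html string is a shortened copy of the sample values from the question; collapsing the leftover whitespace with split/join is an extra step assumed here so the result sits on one line, not something the answer spells out.

```python
import re

# Shortened copy of the sample markup from the question
html = '''<b>
<span class="sr_gs_rackrate_total">Total: </span>
£69
</b>
<td class="totalPrice" colspan="3">
Total: £145
</td>'''

# Replace every opening or closing tag with an empty string
text = re.sub(r'</?[^>]*>', '', html)

# Assumption: collapse runs of whitespace (including newlines) into single spaces
text = ' '.join(text.split())

print(text)  # Total: £69 Total: £145
```

Note that anything stored inside a tag, such as the title="Price for 1 night £69" attribute in the sample, is removed together with the tag.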

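Then you will have something like Total: £69 Total: £145. Since you do not want the match to run on and capture £69 Total: £145 instead of the actual price, you have to change . to [^\s] (anything except whitespace).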
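The difference between the two character classes can be seen in a short sketch. The variable names are only illustrative, and the input is the cleaned string mentioned above.

```python
import re

text = 'Total: £69 Total: £145'

# Greedy .* runs to the end of the line and swallows the second price
greedy = re.findall(r'Total: (.*)', text)

# [^\s]* (equivalent to \S*) stops at the first whitespace character
bounded = re.findall(r'Total: ([^\s]*)', text)

print(greedy)   # ['£69 Total: £145']
print(bounded)  # ['£69', '£145']
```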

Then you just add | between the alternatives:

Total: ([^\s]*)|Price for 1 night: ([^\s]*)
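As a rough sketch of how this pattern could be used with re.findall: when a pattern contains more than one capturing group, findall returns one tuple per match, with an empty string for the group that did not take part. That is also why a test like '£' in x on the whole tuple never succeeds, since it compares entire elements rather than searching inside them. The input string below is made up for illustration (the 'Price for 1 night: £72' part does not come from the sample page), and the names total, per_night and prices are hypothetical.

```python
import re

# Hypothetical cleaned text; the 'Price for 1 night: £72' part is invented
text = 'Total: £69 Price for 1 night: £72 Total: £145'

pattern = r'Total: ([^\s]*)|Price for 1 night: ([^\s]*)'

prices = []
# Each match is a 2-tuple; exactly one of the two groups is non-empty
for total, per_night in re.findall(pattern, text):
    raw = total or per_night
    if raw.startswith('£'):
        # Assumes whole-pound amounts without thousands separators
        prices.append(int(raw.replace('£', '')))

print(prices)  # [69, 72, 145]
```

If only the captured price matters, a single group with a non-capturing alternation, such as (?:Total|Price for 1 night): ([^\s]*), would make findall return plain strings instead of tuples.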
