Get latitude and longitude from a URL

Time: 2018-10-31 13:09:10

Tags: web-scraping scrapy screen-scraping

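Using XPath, I can get the URL that contains the latitude and longitude, but I need to display the two values separately, like this:

Latitude = -34.552654847695510 Longitude = -58.457549057672110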

<div class="article-map" id="article-map">
<img id="static-map" src="//maps.google.com/maps/api/staticmap?center=-34.552654847695510,-58.457549057672110&amp;zoom=16&amp;markers=-34.552654847695510,-58.457549057672110&amp;channel=ZP&amp;size=780x456&amp;sensor=true&amp;scale=2&amp;key=AIzaSyDuxqN04nAj6aHygffqUpehsbMFbxEZX90&amp;signature=W-cOkT98ssMPpXbZbU3jil5xNes=" class="static-map">
</div>    


response.xpath('//div[@id="article-map"]/img').extract()

['<img id="static-map" src="//maps.google.com/maps/api/staticmap?center=-34.552654847695510,-58.457549057672110&amp;zoom=16&amp;markers=-34.552654847695510,-58.457549057672110&amp;channel=ZP&amp;size=780x456&amp;sensor=true&amp;scale=2&amp;key=AIzaSyDuxqN04nAj6aHygffqUpehsbMFbxEZX90&amp;signature=W-cOkT98ssMPpXbZbU3jil5xNes=" class="static-map">']

2 answers:

Answer 0: (score: 0)

Try this, for example: response.css('#article-map img::attr(src)').re(r'markers=([-\d\.]+),([-\d\.]+)')

Or:
a. Get the URL, e.g. response.css('#article-map img::attr(src)').get()
b. Extract the markers or center parameter with url_query_parameter (from w3lib.url import url_query_parameter), then apply a regular expression to the value.

But the first variant looks shorter and easier.
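
The regular expression from the first variant can be tried outside Scrapy with the standard re module. This is only a sketch: the src string is a shortened copy of the one in the question (with &amp; already decoded to &, as a selector would return it), and the helper name extract_lat_lon is made up for illustration.

```python
import re

# Shortened copy of the img src from the question (HTML entities already decoded).
src = ("//maps.google.com/maps/api/staticmap"
       "?center=-34.552654847695510,-58.457549057672110"
       "&zoom=16&markers=-34.552654847695510,-58.457549057672110")


def extract_lat_lon(url):
    """Return (latitude, longitude) as strings from the markers parameter, or None."""
    match = re.search(r'markers=([-\d\.]+),([-\d\.]+)', url)
    return match.groups() if match else None


latitude, longitude = extract_lat_lon(src)
print(f"Latitude = {latitude} Longitude = {longitude}")
```

The values are kept as strings so that the digits are shown exactly as they appear in the URL; call float() on them if numbers are needed.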

Answer 1: (score: 0)

Using a URL-parsing module is convenient and accurate:

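from urllib.parse import urlparse, parse_qs

from scrapy.selector import Selector

# body is the page HTML as a string (inside a spider, response.text)
img_url_string = Selector(text=body).xpath('//img[@id="static-map"]/@src').extract_first()

# the src is protocol-relative (starts with //), so supply a default scheme
url_data = urlparse(img_url_string, scheme='https')

qs = url_data.query
parse_qs(qs)['center']
# output ['-34.552654847695510,-58.457549057672110']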
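
The center value is still a single "lat,lon" string, so one more split is needed to show the two values separately as the question asks. A minimal sketch using only the standard library follows. The function name center_from_src and the shortened src string are assumptions for illustration, not part of the original answer.

```python
from urllib.parse import urlparse, parse_qs


def center_from_src(src):
    """Split the center query parameter of a static-map URL into (lat, lon) strings."""
    query = urlparse(src, scheme='https').query
    center = parse_qs(query)['center'][0]  # e.g. '-34.55...,-58.45...'
    lat, lon = center.split(',')
    return lat, lon


# Shortened copy of the img src from the question (HTML entities already decoded).
src = ("//maps.google.com/maps/api/staticmap"
       "?center=-34.552654847695510,-58.457549057672110&zoom=16")

lat, lon = center_from_src(src)
print(f"Latitude = {lat} Longitude = {lon}")
```

This raises KeyError if the URL has no center parameter; a caller that may see other URLs would probably want to handle that case.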