What is a good way to extract a URL from within a URL in Ruby?

Time: 2014-04-18 18:48:23

Tags: ruby regex

Given url = 'http://www.foo.com/bar?u=http://example.com/yyy/zzz.jpg&aaa=bbb&ccc=ddd'

What is a good way to extract http://example.com/yyy/zzz.jpg?

Edit: I want to extract the second URL.

3 answers:

Answer 0: (score: 3)

I would do it like this:

require 'uri'

url = 'http://www.foo.com/bar?u=http://example.com/yyy/zzz.jpg&aaa=bbb&ccc=ddd'

uri = URI(url)
URI.decode_www_form(uri.query).select { |_,b| b[/^http(s)?/] }.map(&:last)
# => ["http://example.com/yyy/zzz.jpg"]
# or something like
Hash[URI.decode_www_form(uri.query)]['u'] # => "http://example.com/yyy/zzz.jpg"
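
The snippet above assumes the outer URL always has a query string. As a minimal sketch of the same idea, the select-and-map step could be wrapped in a helper that also copes with a missing query and with percent-encoded values. The name embedded_urls is hypothetical, not part of the uri library, and match? needs Ruby 2.4 or later.

```ruby
require 'uri'

# Hypothetical helper (not part of the uri library): returns every
# query-parameter value that looks like an http(s) URL.
def embedded_urls(url)
  query = URI(url).query
  return [] if query.nil? # no query string, so nothing to extract

  URI.decode_www_form(query)                          # [[key, value], ...]
     .map(&:last)                                     # keep only the values
     .select { |value| value.match?(%r{\Ahttps?://}) } # values that start like a URL
end

plain   = 'http://www.foo.com/bar?u=http://example.com/yyy/zzz.jpg&aaa=bbb&ccc=ddd'
encoded = 'http://www.foo.com/bar?u=http%3A%2F%2Fexample.com%2Fyyy%2Fzzz.jpg&aaa=bbb'

p embedded_urls(plain)
p embedded_urls(encoded)
p embedded_urls('http://www.foo.com/bar')
```

decode_www_form already undoes percent-encoding, so the encoded form of the inner URL should come back in the same shape as the plain one.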

Answer 1: (score: 3)

require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
# => ["http://foo.example.org/bla", "mailto:test@example.com"]

http://www.ruby-doc.org/stdlib-2.1.1/libdoc/uri/rdoc/URI.html

Answer 2: (score: 0)

With Ruby 2.1+ (Array#to_h was added in 2.1):

require 'uri'

url = 'http://www.foo.com/bar?u=http://example.com/yyy/zzz.jpg&aaa=bbb&ccc=ddd'
uri = URI.parse(url)
URI.decode_www_form(uri.query).to_h['u'] # => "http://example.com/yyy/zzz.jpg"

For Ruby < 2.1:

require 'uri'

url = 'http://www.foo.com/bar?u=http://example.com/yyy/zzz.jpg&aaa=bbb&ccc=ddd'
uri = URI.parse(url)
Hash[URI.decode_www_form(uri.query)]['u'] # => "http://example.com/yyy/zzz.jpg"
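
One caveat with converting the pairs to a Hash: if the same key appears more than once, only the last value survives. A rough sketch of the difference, assuming a made-up URL in which u is repeated (the a.jpg and b.jpg paths are illustrative only):

```ruby
require 'uri'

# Assumed example input: the parameter "u" appears twice.
url   = 'http://www.foo.com/bar?u=http://example.com/a.jpg&u=http://example.com/b.jpg'
pairs = URI.decode_www_form(URI.parse(url).query)

# Building a Hash keeps only the last value for a repeated key.
last_only = Hash[pairs]['u']

# Filtering the pairs directly keeps every value, in order.
all_values = pairs.select { |key, _| key == 'u' }.map(&:last)

p last_only
p all_values
```

If the parameter is known to occur only once, the Hash lookup is the simpler choice.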

The Addressable gem is very full-featured and follows the spec more closely than the standard library's URI. The same thing can be done with:

require 'addressable/uri'

url = 'http://www.foo.com/bar?u=http://example.com/yyy/zzz.jpg&aaa=bbb&ccc=ddd'
uri = Addressable::URI.parse(url)
uri.query_values['u'] # => "http://example.com/yyy/zzz.jpg"