如何使用Xidel从具有@srcset属性的图像中提取所有@srcset宽度字符串?

时间:2019-03-17 15:26:54

标签: html bash srcset xidel

使用Xidel,我需要在包含通用模式的@srcset属性中提取所有图像大小: “ (\d+)w

./xidel "url_with_images" -e '?'

查看此图片示例

<img ... @srcset="https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-696x457.jpg 696w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-220x144.jpg 220w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-300x197.jpg 300w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-768x504.jpg 768w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-475x312.jpg 475w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-741x486.jpg 741w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-640x420.jpg 640w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu.jpg 800w" />

Xidel预期输出:

696w
220w
300w
768w
475w
741w
650w
800w

1 个答案:

答案 0 :(得分:2)

http://www.benibela.de/documentation/internettools/xpath-functions.html#x-extract

  

如果标志包含*,则返回所有匹配项。

cat <<EOF | xidel -s - -e 'extract(//@srcset,"(\d+w)",1,"*")'
<img srcset="https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-696x457.jpg 696w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-220x144.jpg 220w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-300x197.jpg 300w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-768x504.jpg 768w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-475x312.jpg 475w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-741x486.jpg 741w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu-640x420.jpg 640w, https://www.jewishpress.com/wp-content/uploads/Billionaire-Arnon-Milchan-and-PM-Benjamin-Netanyahu.jpg 800w" />
EOF

696w
220w
300w
768w
475w
741w
640w
800w

-e 'tokenize(//@srcset,",") ! substring-after(.,"jpg ")'也可以。