Question

I need to download the XML content at a URL into a file, say file.xml. For example, given the URL http://www.pistonheads.co.uk/xml/news091.asp?c=26, I want to save its XML content to file.xml:

<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="0.91">
<channel>
<title>PistonHeads (Motoring News)</title>
<link>http://www.pistonheads.com/news/</link>
<description>Motoring News</description>

<item>
<title>Bowler Nemesis Joins Spyker At CPP</title>
<description>Plans confired for Nemesis EXR road car to be built in Coventry</description>
</item>
</channel>
</rss>

I tried wget "url" -o file.xml, but when I open file.xml it just contains:

http://www.pistonheads.co.uk/xml/news091.asp?c=26
           => `news091.asp?c=26'
Resolving www.pistonheads.co.uk... done.
Connecting to www.pistonheads.co.uk[xx.xxx.xxx.xx]... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5,016 [text/xml]

    0K ....                100% 445.31 KB/s

13:37:13 (445.31 KB/s) - `news091.asp?c=26' saved [5016/5016]

Is there another way to do this?

Answer 1

If you want this as the output:

PistonHeads (Motoring News) http://www.pistonheads.com/news/ Motoring News

then this will do the trick:

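# Print the feed to stdout, keep the first three title/link/description
# lines, strip the tags, and join the results on one line.
wget -q -O - 'http://www.pistonheads.co.uk/xml/news091.asp?c=26' \
  | grep -E '(title>|link>|description>)' | head -n 3 \
  | sed -e 's/.*>\([^>]*\)<.*/\1/' | tr '\n' ' '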
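
To see what each stage of that pipeline contributes without touching the network, here is a rough sketch that runs the same filters over a local copy of the feed from the question. The temporary file and the function name channel_summary are made up for illustration, and paste is swapped in for tr only so the result has no trailing space.

```shell
#!/bin/sh
# Assumed stand-in for the downloaded feed: a temporary local copy of the
# XML shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="0.91">
<channel>
<title>PistonHeads (Motoring News)</title>
<link>http://www.pistonheads.com/news/</link>
<description>Motoring News</description>

<item>
<title>Bowler Nemesis Joins Spyker At CPP</title>
<description>Plans confired for Nemesis EXR road car to be built in Coventry</description>
</item>
</channel>
</rss>
EOF

# channel_summary (hypothetical helper): reads RSS on stdin and prints the
# first title, link and description on a single line.
channel_summary() {
  grep -E '(title>|link>|description>)' |  # keep only lines with these tags
    head -n 3 |                            # the channel-level entries come first
    sed -e 's/.*>\([^>]*\)<.*/\1/' |       # keep the text between the tags
    paste -s -d ' ' -                      # join the three lines with spaces
}

summary=$(channel_summary < "$sample")
printf '%s\n' "$summary"
# prints: PistonHeads (Motoring News) http://www.pistonheads.com/news/ Motoring News

rm -f "$sample"
```

This only works because each element sits on its own line and the channel-level elements come before the first item; it is line-based text matching, not real XML parsing.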

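However, if you just want to write the content behind the link to a file, use:

wget -O file.xml 'http://www.pistonheads.co.uk/xml/news091.asp?c=26'

Note the capital letter O for the write-to-file option. The lowercase -o used in the question writes wget's log messages to file.xml instead, which is why the file contained the transfer log while the XML itself was saved as news091.asp?c=26.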
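
Since the mistake in the question only shows up when the file is opened, a small wrapper with a sanity check may help. This is a minimal sketch, not a definitive solution: fetch_xml and looks_like_xml are hypothetical helper names, the curl fallback is an addition that is not part of the original answer, and the check assumes the feed starts with an XML declaration, as the sample in the question does.

```shell
#!/bin/sh
# fetch_xml URL OUTFILE (hypothetical helper): save the response body to OUTFILE.
fetch_xml() {
  if command -v wget >/dev/null 2>&1; then
    wget -q -O "$2" "$1"      # capital -O: the document goes to OUTFILE
  else
    curl -fsSL -o "$2" "$1"   # in curl, lowercase -o names the output file
  fi
}

# looks_like_xml FILE (hypothetical helper): crude check that the first line
# starts with an XML declaration. Assumes the feed has one; XML in general
# does not require it.
looks_like_xml() {
  [ -r "$1" ] || return 1
  first_line=''
  # read fails on a last line without a newline but still fills the variable
  IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
  case $first_line in
    '<?xml'*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example use (needs network access, so it is left commented out):
# fetch_xml 'http://www.pistonheads.co.uk/xml/news091.asp?c=26' file.xml &&
#   looks_like_xml file.xml && echo "file.xml looks like XML"
```

A file that begins with wget's log output, as in the question, fails the check, so the -o versus -O mix-up is caught right away.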
