Getting book details from an ISBN with shell scripting

Date: 2015-10-09 17:19:51

Tags: bash shell sed grep

Suppose I have a list of ISBN values:

9781887902694
9780072227109
9780672323843
9780782121797
9781565924031
9780735713338
9780735713338
...

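How can I use a shell script/bash to retrieve the title, publication date, author, and publisher for each one (from a site such as bookfinder4u.com)? I'm a beginner, so I don't know how to proceed.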

2 answers:

Answer 0 (score: 1)

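#!/bin/bash
# Require an ISBN as the first argument.
if [ -z "$1" ] ; then echo "Usage: $0 <ISBN number>" >&2 ; exit 1 ; fi
# Fetch the result page silently (-s), following redirects (-L); the argument stays quoted.
curl -sL "http://www.bookfinder4u.com/IsbnSearch.aspx?isbn=${1}&mode=direct"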

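That will get you the page, but parsing the response with grep and sed looks really messy. If you know of an API that returns JSON or XML, it would be much easier.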
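To apply the same request to the whole list from the question, something like the following might work. It is only a sketch: the helper names `normalize_isbns`, `isbn_url`, and `fetch_all`, the `<ISBN>.html` output files, and the one-second pause are assumptions made for illustration, and the URL is simply the one used in the script above.

```shell
#!/bin/bash
# Sketch only: the helper names below are hypothetical, not part of the answer.

# Read ISBNs on stdin, strip stray whitespace, keep only 13-digit lines,
# and drop duplicates while preserving the original order.
normalize_isbns() {
    tr -d ' \t\r' | grep -E '^[0-9]{13}$' | awk '!seen[$0]++'
}

# Build the lookup URL (same pattern as the script above) for one ISBN.
isbn_url() {
    printf 'http://www.bookfinder4u.com/IsbnSearch.aspx?isbn=%s&mode=direct\n' "$1"
}

# Download the result page for every ISBN read on stdin into <ISBN>.html.
fetch_all() {
    local isbn
    while IFS= read -r isbn; do
        curl -sL "$(isbn_url "$isbn")" > "${isbn}.html"
        sleep 1   # assumed courtesy delay between requests
    done
}
```

With the list saved in a file such as isbn.txt, running `normalize_isbns < isbn.txt | fetch_all` would save one page per distinct ISBN. Extracting the fields from those pages is still the messy part.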

Answer 1 (score: 0)

If you are able to run PHP, you can use:

bookDetails.php

<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
//you may want to un-comment the code below to make sure your script doesn't timeout
//set_time_limit(0); 
//ignore_user_abort(true);
libxml_use_internal_errors(true);

//get all the isbn numbers from a txt file
$isbns = file("isbn.txt", FILE_IGNORE_NEW_LINES);

//loop over all the ISBNs
foreach($isbns as $isbn){

    $html = file_get_contents("http://www.bookfinder4u.com/IsbnSearch.aspx?isbn=$isbn");
    $dom = new DomDocument();
    $dom->loadHtml($html);
    $xpath = new DomXpath($dom);

    $title =  $xpath->query('//*[@class="t9"]')->item(1)->nodeValue;
    $author =  $xpath->query('//*[@class="t9"]')->item(2)->nodeValue;
    // c14n() serializes <br> as <br></br>, so splitting on </br> separates publisher and format
    $publisherFormat =  $xpath->query('//*[@id="format_pub_listprice"]')->item(0)->c14n();
    $matches = preg_split('%</br>%', $publisherFormat);
    $publisher = strip_tags($matches[0]);
    $format = strip_tags($matches[1]);
    $price =  $xpath->query('//*[@class="t8"]')->item(1)->nodeValue;
    preg_match_all('/List price:\s*?(.*?[\d\.]+)/', $price, $price, PREG_PATTERN_ORDER);
    $price = $price[1][0];

    echo $title."\n";
    echo $author."\n";
    echo $publisher."\n";
    echo $format."\n";
    echo $price."\n\n";

}

Assuming isbn.txt contains

9781887902694
9780072227109
9780672323843

the output will be:

Javascript: Concepts & Techniques; Programming Interactive Web Sites
By: Tina Spain McDuffie
Publisher: Franklin Beedle & Assoc - 2003-01
Format: Paperback
EUR 48.32

J2ME: The Complete Reference
By: James Keogh
Publisher: McGraw-Hill - 2003-02-27
Format: Paperback
EUR 57.94

Sams Teach Yourself J2ee in 21 Days with CDROM (Sams Teach Yourself...in 21 Days)
By: Martin Bond Dan Haywood Peter Roxburgh
Publisher: Sams - 2002-04
Format: Paperback
EUR 43.92

To run it from the shell:

php bookDetails.php
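Since the script prints one blank-line-separated block of five lines per book, its output can be reshaped from the shell. The sketch below assumes exactly that layout (title, "By: ", "Publisher: ", "Format: ", price); the function name `records_to_tsv` is hypothetical.

```shell
#!/bin/bash
# Sketch only: assumes five lines per book and a blank line between books.

# Turn each blank-line-separated record into one tab-separated line,
# dropping the "By: ", "Publisher: " and "Format: " labels.
records_to_tsv() {
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "\t" }
         {
             sub(/^By: /, "", $2)
             sub(/^Publisher: /, "", $3)
             sub(/^Format: /, "", $4)
             print $1, $2, $3, $4, $5
         }'
}
```

For example, `php bookDetails.php | records_to_tsv > books.tsv` would write one line per book, which is easier to process further with cut, sort, or a spreadsheet.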