How to get all HTML links from a DOMDocument?

Time: 2016-11-14 12:49:11

Tags: php dom xpath

I am trying to get all HTML links from a document using the native DOM extension:

$items = $xpath->query('//div[@class="cards"]/div[@class="card"]/div/a[@class="card-click-target"]');

[The HTML snippet from the original post was not preserved.]

But it gives me an empty object. How do I do this correctly?

1 answer:

Answer 0 (score: 2)

If you want to get the a nodes that have an href attribute, use the //a[@href] XPath expression, for example:

$r = $xpath->evaluate('//a[@href]');
foreach ($r as $n) {
  printf("%s: %s\n", $n->textContent, $n->getAttribute('href'));
}

Sample output

Link: http://domain.com/page
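
The snippet above assumes that $xpath already exists. As a minimal end-to-end sketch, the following builds it from a DOMDocument. The markup in $html is hypothetical (the HTML from the question is not available), chosen only so that the result matches the sample output above; the second anchor has no href and is therefore skipped by //a[@href].

```php
<?php
// Hypothetical markup: the question's real HTML is unknown, so this structure is an assumption.
$html = <<<'HTML'
<div class="cards">
  <div class="card">
    <div><a class="card-click-target" href="http://domain.com/page">Link</a></div>
    <div><a name="anchor-without-href">No href</a></div>
  </div>
</div>
HTML;

// Collect libxml parse warnings internally instead of emitting them,
// which is useful for imperfect real-world HTML.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Gather every <a> element that carries an href attribute.
$links = [];
foreach ($xpath->evaluate('//a[@href]') as $node) {
    $links[] = [
        'text' => trim($node->textContent),
        'href' => $node->getAttribute('href'),
    ];
}

foreach ($links as $link) {
    printf("%s: %s\n", $link['text'], $link['href']);
}
```

Collecting the results into an array first is only a design choice for this sketch; iterating the returned node list directly, as in the answer, works equally well.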

However, if you only want the href attribute values, use the //a/@href selector:

$r = $xpath->evaluate('//a/@href');
foreach ($r as $n) {
  var_dump($n->value);
}
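
With //a/@href, each result is an attribute node rather than an element, so the value is read from its value property. A small self-contained sketch, using hypothetical markup that is not taken from the question:

```php
<?php
// Hypothetical markup for illustration only.
$html = '<p><a href="http://domain.com/page">Link</a> <a href="/about">About</a> <a>No href</a></p>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Each item is a DOMAttr; anchors without an href contribute nothing.
$hrefs = [];
foreach ($xpath->evaluate('//a/@href') as $attr) {
    $hrefs[] = $attr->value;
}

print_r($hrefs);
```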

Example code that gets all a elements whose class attribute value equals card-click-target:

$r = $xpath->evaluate('//a[@class = "card-click-target" and @href]');
foreach ($r as $n) {
  printf("%s: %s\n", $n->textContent, $n->getAttribute('href'));
}
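
One caveat worth keeping in mind: the comparison @class = "card-click-target" matches only when the whole attribute is exactly that string. If an element carries several classes, a common XPath 1.0 idiom is to match a single space-delimited token instead. Whether this is what caused the empty result in the question cannot be determined without the original HTML, so the markup below is purely hypothetical.

```php
<?php
// Hypothetical markup: the link carries two classes, so an exact comparison will not match.
$html = <<<'HTML'
<div class="cards">
  <div class="card featured">
    <div><a class="card-click-target wide" href="http://domain.com/page">Link</a></div>
  </div>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Exact string comparison against the whole class attribute.
$exact = $xpath->evaluate('//a[@class = "card-click-target" and @href]');

// Token match: pad the normalized class list with spaces and look for " card-click-target ".
$token = $xpath->evaluate(
    '//a[contains(concat(" ", normalize-space(@class), " "), " card-click-target ") and @href]'
);

printf("exact matches: %d, token matches: %d\n", $exact->length, $token->length);
```

In this sketch the exact comparison finds no nodes while the token match finds the link.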