Regex to add target="_blank" to all links, but exclude links that already have target="_blank" or that are <a name="..."> or <a href="#..."> anchors

Time: 2018-08-01 23:33:18

Tags: html regex

I have this regex that I designed, but I can't seem to exclude links on a page that already have target="_blank", or links of the form <a name="..."> or <a href="#...">. How would I exclude links that already have target="_blank", and avoid adding target="_blank" to anchor links?

Find: <a href=(".*)|([^#][^"]*)\\s>(\w.*)(</a>)
Replace: <a href=$1 target="_blank"$2$3

1 answer:

Answer 0 (score: 0)

Regex is notoriously the wrong tool for this job.

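HTML is structured data that regex does not understand, which means you run into exactly the kind of problem you are having: for anything non-trivial, the many variations allowed in HTML markup make it very difficult to parse with string manipulation techniques.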
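To make the point about variations concrete, here is a minimal sketch. The pattern `naive` is a hypothetical simplification written for this illustration (it is not the asker's regex); it assumes href is the first attribute and is double-quoted. All five strings below are ways of writing the same link, yet only the first one matches.

```javascript
// Hypothetical naive pattern (an assumption for illustration only):
// expects href to be the first attribute, double-quoted, not starting with '#'.
const naive = /<a href="[^"#][^"]*">/;

const variants = [
  '<a href="http://example.com">',            // the form the pattern expects
  "<a href='http://example.com'>",            // single-quoted value
  '<a class="x" href="http://example.com">',  // another attribute comes first
  '<A HREF="http://example.com">',            // uppercase tag and attribute name
  '<a\n  href="http://example.com">'          // line break inside the tag
];

// Test each variant against the pattern.
const results = variants.map(function (html) {
  return naive.test(html);
});

console.log(results); // [ true, false, false, false, false ]
```

Each miss could be patched with more regex, but every patch makes the pattern harder to read and still leaves other legal forms uncovered.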

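DOM methods are designed to work with this kind of data, so use them instead. The following loops through every <a> tag in the document, skips those that have no href attribute, an href that starts with '#', or a name attribute, and sets the 'target' attribute on the rest.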

Array.from(document.getElementsByTagName('a')).forEach(function(a) { 
  if (
    a.getAttribute("href") &&
    a.getAttribute("href").indexOf('#') !==0 &&
    a.getAttribute("name") === null
  ) {
    a.setAttribute('target', '_blank'); // a no-op for links that already have target="_blank"
  }
});

// Just to confirm:
console.log(document.getElementById('container').innerHTML);

<div id="container">
  <a href="http://example.com">test</a>
  <a href="#foo">test2</a>
  <a href="http://example.com" target="_blank">test3</a>
  <a name="foo">test4</a>
</div>
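
The same filter can also be pulled out into a small predicate, which makes the rules easy to check without a browser. This is only a sketch: `shouldOpenInNewTab` and `addBlankTargets` are hypothetical names, not part of the answer above. It additionally assumes that any link which already has a target attribute (not just target="_blank") should be left untouched, since the question asks for those links to be excluded.

```javascript
// Hypothetical helper (not from the original answer): decides whether a link
// should get target="_blank", given its raw attribute values (null if absent).
function shouldOpenInNewTab(href, name, target) {
  if (!href) return false;                    // no href, or an empty one
  if (href.indexOf('#') === 0) return false;  // in-page anchor link such as "#foo"
  if (name !== null) return false;            // named anchor such as <a name="foo">
  if (target !== null) return false;          // assumption: keep any existing target
  return true;
}

// Hypothetical wrapper: applies the predicate to every <a> element under `root`.
function addBlankTargets(root) {
  Array.from(root.getElementsByTagName('a')).forEach(function (a) {
    if (shouldOpenInNewTab(
      a.getAttribute('href'),
      a.getAttribute('name'),
      a.getAttribute('target')
    )) {
      a.setAttribute('target', '_blank');
    }
  });
}

// Only touch the DOM when one exists (for example, in a browser).
if (typeof document !== 'undefined') {
  addBlankTargets(document);
}
```

With the sample markup above, only the first link (test) would be modified: test2 is an in-page anchor, test3 already has a target, and test4 is a named anchor with no href.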