Question

I found a regular expression that captures URLs, but it fails to capture some of them.

$("#links").change(function() {

    //var matches = new array();
    var linksStr = $("#links").val();
    var pattern = new RegExp("^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$","g");
    var matches = linksStr.match(pattern);

    for(var i = 0; i < matches.length; i++) {
      alert(matches[i]);
    }

})

It does not capture this URL (which I need):

http://www.wupload.com/file/63075291/LlMlTL355-EN6-SU8S.rar

But it does capture this one:

http://www.wupload.com

Answer 1

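A few things:

The main reason it doesn't work is that, when the pattern is passed to RegExp() as a string, the backslashes themselves have to be escaped. So this: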

"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$"

should be:

"^(https?:\/\/)?([\\da-z\\.-]+)\\.([a-z\\.]{2,6})([\/\\w \\.-]*)*\/?$"

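As a minimal sketch of why the doubling matters, the path part of the pattern can be tested on its own. The variable names below are illustrative and not part of the original code.

```javascript
var path = "/file/63075291/LlMlTL355-EN6-SU8S.rar";

// Single backslashes: the string literal turns "\w" into "w" and "\." into ".",
// so the class only allows "/", "w", space, "." and "-".
var brokenPath = new RegExp("^[\/\w \.-]*$");

// Doubled backslashes survive the string literal and reach the regex engine.
var fixedPath = new RegExp("^[\/\\w \\.-]*$");

// A regex literal needs no doubling at all.
var literalPath = /^[\/\w \.-]*$/;

console.log(brokenPath.test(path));  // false
console.log(fixedPath.test(path));   // true
console.log(literalPath.test(path)); // true
```

This is also why http://www.wupload.com still matched: the host part happens to consist only of lowercase letters and dots, which the damaged class still accepts.
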
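Next, you said that Firefox reports "regular expression too complex". That suggests linksStr holds several lines of URL candidates, so you also need to pass the m flag to RegExp().
The existing regex rejects legitimate values such as "HTTP://STACKOVERFLOW.COM", so pass the i flag to RegExp() as well.
Whitespace always creeps in, especially in multiline values. Use a leading \s* and $.trim() to deal with it.
Are relative links, such as /file/63075291/LlMlTL355-EN6-SU8S.rar, not allowed?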
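
The effect of the m and i flags, and of trimming, can be checked with a much simpler stand-in pattern. The pattern, the sample input and the helper findAll below are assumptions made for illustration, and the native trim() is used in place of $.trim() so the sketch runs without jQuery.

```javascript
// Simplified stand-in (not the full URL regex): one URL candidate per line.
var source = "^\\s*https?://\\S+\\s*$";
var input  = "http://a.example/x\n  HTTP://B.EXAMPLE/Y  ";

// Hypothetical helper: run the pattern with the given flags and trim each hit.
function findAll(flags) {
    var found = input.match(new RegExp(source, flags)) || [];
    return found.map(function (s) { return s.trim(); });
}

// Without m, ^ and $ only anchor at the start and end of the whole string.
console.log(findAll("g"));   // []

// With m, each line is anchored separately, but matching is case-sensitive.
console.log(findAll("gm"));  // [ 'http://a.example/x' ]

// Adding i accepts the upper-case line as well.
console.log(findAll("gim")); // [ 'http://a.example/x', 'HTTP://B.EXAMPLE/Y' ]
```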

Putting it all together (except for the last point, relative links), it becomes:

var linksStr    = "http://www.wupload.com/file/63075291/LlMlTL355-EN6-SU8S.rar  \n"
                + "  http://XXXupload.co.uk/fun.exe \n "
                + " WWW.Yupload.mil ";
var pattern     = new RegExp (
                    "^\\s*(https?:\/\/)?([\\da-z\\.-]+)\\.([a-z\\.]{2,6})([\/\\w \\.-]*)*\/?$"
                    , "img"
                );

// match() returns null when nothing matches, so fall back to an empty array.
var matches     = linksStr.match(pattern) || [];
for (var J = 0, L = matches.length;  J < L;  J++) {
    console.log ( $.trim (matches[J]) );
}

Which yields:

http://www.wupload.com/file/63075291/LlMlTL355-EN6-SU8S.rar
http://XXXupload.co.uk/fun.exe
WWW.Yupload.mil

Answer 2

Why not just do: URLS = str.match(/https?:[^\s]+/ig);
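
A short sketch of how that one-liner behaves on free text follows. The function name extractUrls, the sample sentence and the trailing-punctuation cleanup are assumptions added for illustration and are not part of the answer.

```javascript
// Hypothetical wrapper around the one-liner above.
function extractUrls(text) {
    // match() returns null when there are no hits, so default to [].
    var hits = text.match(/https?:[^\s]+/ig) || [];
    // [^\s]+ also swallows punctuation that follows a URL in prose,
    // so strip a trailing run of common punctuation marks.
    return hits.map(function (u) { return u.replace(/[.,;:!?)\]]+$/, ""); });
}

var sample = "See http://www.wupload.com/file/63075291/LlMlTL355-EN6-SU8S.rar " +
             "and HTTPS://example.org/a?b=1, thanks.";

console.log(extractUrls(sample));
// [ 'http://www.wupload.com/file/63075291/LlMlTL355-EN6-SU8S.rar',
//   'HTTPS://example.org/a?b=1' ]

console.log(extractUrls("no links here")); // []
```

Note that this approach requires the http or https scheme, so a bare host such as WWW.Yupload.mil is not picked up.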

Answer 3

(https?:\/\/)([a-zA-Z0-9\/._%&=-]*)

This will find URLs that start with http:// or https:// anywhere in a piece of text.
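
As a rough sketch, the pattern can be tried in JavaScript with the hyphen placed last in the character class, so that it is read as a literal hyphen rather than as the start of a range. The sample text and variable names are assumptions for illustration.

```javascript
// The pattern from this answer, with "-" at the end of the class so it is a literal.
var urlPattern = /(https?:\/\/)([a-zA-Z0-9\/._%&=-]*)/g;

var text = "Mirror: http://www.wupload.com/file/63075291/LlMlTL355-EN6-SU8S.rar " +
           "(see also https://example.org/a?b=1)";

// With the g flag, match() returns the full matches, or null if there are none.
var found = text.match(urlPattern) || [];

console.log(found);
// [ 'http://www.wupload.com/file/63075291/LlMlTL355-EN6-SU8S.rar',
//   'https://example.org/a' ]
```

The second result stops at the "?" because that character is not in the class, so query strings are cut off unless the class is widened.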
