Remove redundant <br> tags from a string

Asked: 2013-12-13 14:41:57

Tags: javascript regex

I want to remove redundant <br>, <br/> and <br /> tags from a string that may contain many of them in a row. For example:

This is a string with <br><br/> many <br/><br> newlines. <br><br><br /><br> But it's ok <br> to have one!

I want it to become:

This is a string with <br> many <br> newlines. <br> But it's ok <br> to have one!

2 answers:

Answer 0 (score: 3)

You can do this:

str = str.replace(/<\/?br\s*\/?>\s*(<\/?br\s*\/?>)+/ig, '<br>');
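
One limitation of the pattern above: it allows whitespace only after the first tag, so a run such as `<br> <br> <br>` collapses to two tags rather than one. The following is a minimal sketch of a variant that tolerates whitespace before every further tag in the run. The helper name `collapseBrs` is hypothetical and not part of the answer.

```javascript
// Hypothetical helper: collapse any run of two or more <br> variants
// (optionally separated by whitespace) into a single <br>.
function collapseBrs(str) {
    // <br\s*\/?>          the first tag: <br>, <br/> or <br />
    // (?:\s*<br\s*\/?>)+  one or more further tags, each optionally preceded by whitespace
    return str.replace(/<br\s*\/?>(?:\s*<br\s*\/?>)+/gi, '<br>');
}

var input = "This is a string with <br><br/> many <br/><br> newlines. <br><br><br /><br> But it's ok <br> to have one!";
console.log(collapseBrs(input));
```

A lone `<br>` is left untouched because the pattern requires at least one further tag after the first.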

Answer 1 (score: 2)

var str = "This is a string with <br><br/> many <br/><br> newlines. <br><br><br /><br> But it's ok <br> to have one!";
var cleaned = str.replace(/(<br\s?\/?>\s?)+/g,"<br/> ");

Now the bad news: matching HTML tags with a regular expression is really a bad idea. See this question: RegEx match open tags except XHTML self-contained tags
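
To make that warning concrete, here is a small sketch of two inputs on which the pattern above misbehaves. The sample strings are made up for illustration.

```javascript
// The same pattern as in the snippet above.
var pattern = /(<br\s?\/?>\s?)+/g;

// 1. A <br> carrying an attribute is not recognised as a tag at all,
//    so two consecutive line breaks survive the replacement.
var withAttr = 'one<br class="gap"><br>two';
var attrResult = withAttr.replace(pattern, "<br/> ");
console.log(attrResult);

// 2. Text that merely looks like a tag, here inside an attribute value,
//    is rewritten even though it is not markup.
var inAttr = '<img alt="a<br><br>b">';
var inAttrResult = inAttr.replace(pattern, "<br/> ");
console.log(inAttrResult);
```

A regular expression sees only characters, not document structure, which is why both cases slip through.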

Another approach is to set the string as the content of an element, loop over its child nodes, and remove the duplicate tags.

Quick and dirty code; I have not tested it much:

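// Removes each <br> element that directly follows another <br> element,
// so that every run of adjacent <br> elements is reduced to its first one.
function removeDupeBrs (elem) {
    var childrenNodes = elem.childNodes;
    // Iterate backwards so that removing a node does not shift the indexes still to visit.
    for (var i = childrenNodes.length-1; i>=0 ; i--) {
        var cn = childrenNodes[i];
        if (cn.nodeType===1 && cn.tagName==="BR") {
            // The node just before cn in document order.
            var prev = childrenNodes[i-1];
            if (prev && prev.nodeType===1 && prev.tagName==="BR") {
                elem.removeChild(cn);
            }
        }
    }
}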

var str = "This is a string with <br><br/> many <br/><br> newlines. <br><br><br /><br> But it's ok <br> to have one!";
var div = document.createElement("div");
div.innerHTML = str;
removeDupeBrs(div);
console.log(div.innerHTML);
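
Note that removeDupeBrs only treats two BR elements as duplicates when they are directly adjacent nodes. In `<br> <br>` a whitespace text node sits between them, so both are kept. Below is a rough sketch of a variant that also skips whitespace-only text nodes. The names `isBr`, `isBlankText` and `removeDupeBrsLoose` are hypothetical, and like the code above this has not been heavily tested.

```javascript
// True for <br> elements (nodeType 1 is an element node).
function isBr(node) {
    return !!node && node.nodeType === 1 && node.tagName === "BR";
}

// True for text nodes (nodeType 3) that contain only whitespace.
function isBlankText(node) {
    return !!node && node.nodeType === 3 && /^\s*$/.test(node.nodeValue);
}

// Hypothetical variant of removeDupeBrs: a <br> is removed when the nearest
// preceding node that is not whitespace-only text is also a <br>.
function removeDupeBrsLoose(elem) {
    var nodes = elem.childNodes;
    // Walk backwards so removals do not shift the indexes still to be visited.
    for (var i = nodes.length - 1; i >= 0; i--) {
        if (!isBr(nodes[i])) continue;
        var j = i - 1;
        while (j >= 0 && isBlankText(nodes[j])) j--;
        if (j >= 0 && isBr(nodes[j])) {
            elem.removeChild(nodes[i]);
        }
    }
}

// Usage in a browser (guarded so the snippet also loads where no DOM exists).
if (typeof document !== "undefined") {
    var looseDiv = document.createElement("div");
    looseDiv.innerHTML = "one <br> <br/>\n<br> two";
    removeDupeBrsLoose(looseDiv);
    console.log(looseDiv.innerHTML);
}
```

The whitespace text nodes themselves are left in place; only the extra BR elements are removed.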