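Recursively stepping through the HTML DOM and printing attributes

Date: 2016-03-19 05:03:28

Tags: javascript html dom recursion

I'm trying to recursively step through the DOM of an HTML document and print each node's name, using indentation to show which nodes are children. I modified a w3schools code example and ended up with: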

<!DOCTYPE html>
<html>
<head>
<title> Test Example </title>
</head>
<body>

<p>Click the button to get the node names of the body element's child nodes.</p>

<button onclick="myFunction(document.documentElement)">Try it</button>

<p><strong>Note:</strong> Whitespace inside elements is considered as #text, and text is considered as nodes.</p>

<!-- My personal comment goes here..  -->

<div><strong>Note:</strong> Comments in the document are considered as #comments, and comments are also considered as nodes.</div>

<p id="demo"></p>

<script>
function myFunction(node) {
indent = "";
txt = node.nodeName + "<br>";
document.write(txt);
var c = node.childNodes;
var i;
for (i = 0; i < c.length; i++) {
    myFunction(c[i], indent);
}    
}

function myFunction(node, indent) {
var indent = indent + "   ";
txt = indent + node.nodeName + "<br>";
document.write(txt);
var c = node.childNodes;
var i;
for (i = 0; i < c.length; i++) {
    txt = c[i].nodeName + "<br>"; 
    myFunction(c[i], indent);
}  
}
</script>

</body>
</html>

The output I get is:

undefined HTML
undefined HEAD
undefined #text
undefined TITLE
undefined #text
undefined #text
undefined #text
undefined BODY
undefined #text
undefined P
undefined #text
undefined #text
undefined BUTTON
undefined #text
undefined #text
undefined P
undefined STRONG
undefined #text
undefined #text
undefined #text
undefined #comment
undefined #text
undefined DIV
undefined STRONG
undefined #text
undefined #text
undefined #text
undefined P
undefined #text
undefined SCRIPT
undefined #text
undefined #text

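So I have a couple of questions: 1) Why is undefined printed instead of the indent value alongside the node name? 2) If I'm only asking for node names, why are the #text and #comment lines also printed?

I'm new to HTML and JavaScript, so any insight would help.

Edit

I solved my second question, but I'm still having trouble with the first. My script now looks like: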

<script>
function myFunction(node) {
    var indent = "";
    txt = node.nodeName + "<br>";
    document.write(txt);
    var c = node.children;
    var i;
    for (i = 0; i < c.length; i++) {
        myFunction(c[i], indent);
    }    
}

function myFunction(node, indent) {
    indent = indent + "    ";
    txt = indent + node.nodeName + "<br>";
    txt = txt.replace(/ /g, '&nbsp;');
    document.write(txt);
    var c = node.children;
    var i;
    for (i = 0; i < c.length; i++) {
        txt = c[i].nodeName + "<br>"; 
        myFunction(c[i], indent);
    }  
}
</script>

and my output looks like:

undefined    HTML
undefined        HEAD
undefined            TITLE
undefined        BODY
undefined            P
undefined            BUTTON
undefined            P
undefined                STRONG
undefined            DIV
undefined                STRONG
undefined            P
undefined            SCRIPT

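I'm clearly still missing something, but I can't figure out what.

Edit 2

Thanks to the help below, I found my problem: I was calling the wrong function in the loop of the second myFunction (now renamed myFunction1).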

<!DOCTYPE html>
<html>
<head>
<title> Test Example </title>
</head>
<body>

<p>Click the button to get the node names of the body element's child nodes.</p>

<button onclick="myFunction(document.documentElement)">Try it</button>

<p><strong>Note:</strong> Whitespace inside elements is considered as #text, and text is considered as nodes.</p>

<!-- My personal comment goes here..  -->

<div><strong>Note:</strong> Comments in the document are considered as #comments, and comments are also considered as nodes.</div>

<p id="demo"></p>

<script>
function myFunction(node) {
    var indent = "";
    var txt = node.nodeName + "<br>";
    document.write(txt);
    var c = node.children;
    var i;
    for (i = 0; i < c.length; i++) {
        myFunction1(c[i], indent);
    }    
}

function myFunction1(node, indent) {
    // Each level adds six spaces, converted to &nbsp; so the browser keeps them.
    indent = indent + "      ";
    var txt = indent + node.nodeName + "<br>";
    txt = txt.replace(/ /g, '&nbsp;');
    document.write(txt);
    var c = node.children;
    var i;
    for (i = 0; i < c.length; i++) {
        myFunction1(c[i], indent);
    }
}
</script>

</body>
</html>

The output is now:

HTML
      HEAD
            TITLE
      BODY
            P
            BUTTON
            P
                  STRONG
            DIV
                  STRONG
            P
            SCRIPT

2 Answers:

Answer 0: (score: 1)

  

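1) Why is undefined printed instead of the indent value alongside the node name?

You have two functions with the same name; make sure the right one runs and receives the right arguments. Judging by your updated output, you could use code like this to initialize indent to an empty string on the first call.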

function myFunction(node, indent) {
    indent = (indent || "") + "    ";  // "" when the caller omits indent
    // ...rest of the function unchanged
}
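Folding that default into a single recursive function might look like the minimal sketch below. It collects lines in an array instead of calling document.write so that it can also run outside a browser. The names describeTree, lines, and fakeRoot are illustrative assumptions, not part of the original code, and unlike the fragment above it leaves the root unindented.

```javascript
// Sketch: one recursive function; indent defaults to "" on the first call.
// `describeTree` and `fakeRoot` are hypothetical names used for illustration.
function describeTree(node, indent, lines) {
  indent = indent || "";          // the first call passes no indent
  lines = lines || [];            // accumulator shared by the whole recursion
  lines.push(indent + node.nodeName);
  var c = node.children || [];    // element children only
  for (var i = 0; i < c.length; i++) {
    describeTree(c[i], indent + "    ", lines);
  }
  return lines;
}

// Hand-built stand-in for a tiny DOM; in a browser, pass document.documentElement.
var fakeRoot = {
  nodeName: "HTML",
  children: [
    { nodeName: "HEAD", children: [{ nodeName: "TITLE", children: [] }] },
    { nodeName: "BODY", children: [{ nodeName: "P", children: [] }] }
  ]
};

console.log(describeTree(fakeRoot).join("\n"));
```

Because there is only one function, there is no second declaration to shadow it, and the button's one-argument call still works.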
  

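2) If I'm only asking for node names, why are the #text and #comment lines also printed?

Because those are nodes too (and they have names), and your code doesn't filter them out. If you want to exclude them, consider recursing through elements only, using .children, rather than through all DOM nodes with .childNodes.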

Answer 1: (score: 1)

  

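1) Why is undefined printed instead of the indent value alongside the node name?

You have two functions with the same name, which means the second one overrides the first. So you're actually calling the second version without passing anything for the indent parameter.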
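The effect can be reproduced in isolation with a small sketch. The names demoOverride and label are hypothetical and stand in for the two myFunction declarations; the duplicate declarations are wrapped in a function only to keep the example self-contained.

```javascript
// Sketch of the pitfall: two declarations share one name, so the later one wins.
// `demoOverride` and `label` are hypothetical names used for illustration.
function demoOverride() {
  function label(name) {
    return name;
  }
  function label(name, indent) {   // replaces the declaration above
    return indent + "   " + name;  // indent is undefined when not passed
  }
  return label("HTML");            // one argument, like the onclick handler
}

console.log(demoOverride()); // prints "undefined   HTML"
```

Concatenating undefined with a string converts it to the text "undefined", which is where the prefix in the output comes from.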

  

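2) If I'm only asking for node names, why are the #text and #comment lines also printed?

Every node has a .nodeName, so you get the name of every node. If you only want Elements, use an if statement and check that node.nodeType === 1.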
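As a rough sketch of that check, the function below still walks childNodes but only records nodes whose nodeType is 1. The names elementNames, ELEMENT_NODE, and fakeBody are assumptions made for illustration, and the hand-built object stands in for a real DOM so the example can run outside a browser.

```javascript
// Sketch: walk childNodes but keep only element nodes (nodeType 1).
// `elementNames` and `fakeBody` are hypothetical names used for illustration.
var ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in browsers

function elementNames(node, indent, lines) {
  indent = indent || "";
  lines = lines || [];
  if (node.nodeType === ELEMENT_NODE) {
    lines.push(indent + node.nodeName);
    var c = node.childNodes || [];
    for (var i = 0; i < c.length; i++) {
      elementNames(c[i], indent + "    ", lines);
    }
  }
  return lines;
}

// Stand-in tree: text nodes have nodeType 3, comment nodes have nodeType 8.
var fakeBody = {
  nodeName: "BODY", nodeType: 1,
  childNodes: [
    { nodeName: "#text", nodeType: 3, childNodes: [] },
    { nodeName: "P", nodeType: 1, childNodes: [
      { nodeName: "STRONG", nodeType: 1, childNodes: [] },
      { nodeName: "#text", nodeType: 3, childNodes: [] }
    ] },
    { nodeName: "#comment", nodeType: 8, childNodes: [] }
  ]
};

console.log(elementNames(fakeBody).join("\n"));
```

In a browser, calling elementNames(document.documentElement) should list the same elements that a walk over .children would.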