Calling a recursive function that contains JavaScript promises

Date: 2018-09-04 10:11:44

Tags: javascript parsing recursion promise cheerio

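I am trying to build a JSON tree by parsing XML files. These files may contain references to other XML files. All the files I want to parse have names like toc\d.js. The output tree should have the following form: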

{
  name: 'name of element',
  url: 'xml_referenced.xml',
  children: [
    {
      name: '.....',
      url: '.....',
      children: [...]
    }
  ]
}

The XML that should produce this result might look like the following (toc.xml):

<?xml version="1.0" encoding="utf-8" ?>
<data src="toc.js" name="Using and Customizing the Application" url="DA_UsingAndCustomizing.htm">
  <item name="Adapted user interface" url="DA_AdaptedUserInterface.htm" />
  <item name="Show or hide the windows" url="3402556939.htm" />
  <book src="toc2.js" name="Work with layouts" url="9007202657330059.htm" />
  <book src="toc3.js" name="Adjust table views" url="3402653835.htm" />
  <item name="Use the keyboard to access the ribbon" url="9007202657380875.htm" />
  <item name="Keyboard shortcuts" url="27021601196225675.htm" />
  <item name="Lock or unlock the Data Analysis session" url="27021601166795787.htm" />
  <item name="Reset all user settings" url="3402736267.htm" />
  <item name="Find status information" url="9007203112007179.htm" />
  <item name="Navigation pane" url="18014401941480331.htm" />
  <item name="PDF Viewer" url="OL_PDFViewer.htm" />
  <item name="Review mode" url="DA_ReviewMode.htm" />
  <item name="Customize reports and results" url="DA_CustomizeReportsAndResults.htm" />
  <book src="toc4.js" name="Interfaces" url="DA_Interfaces.htm" />
</data>

As you can see, it contains elements that reference other "toc" files (they are converted to XML because they are stored as JS):

<book src="toc2.js" name="Work with layouts" url="9007202657330059.htm" />

The function I use for parsing is the following:

var loadedPaths = []
var buildTOC = function(xml, srcPath){
    const parseToc = function(toc){
        var obj = {}
        var children
        if (toc.children.length){
            children = toc.children   // THESE ITEMS ARE INCLUDED IN THE RESULT
        }
        else {
            children = []
        }
        var path = toc.attribs.src
        if (path && loadedPaths.indexOf(path)<0){
            loadedPaths.push(path)
            lib.getXml(srcPath + '/' + toc.attribs.src).then(x => { // RETURNS XML
                children = lib.buildTOC(x, srcPath)  // THESE ITEMS ARE NOT INCLUDED 

            })
        }
        else {
            obj.url = toc.attribs.url
            obj.name = toc.attribs.name
            obj.children = children.map(x => {return parseToc(x)})
        }
        return obj
    }
    var $ = this.buildDom(xml, {xmlMode: true})  // RETURNS A CHEERIO DOM
    console.log([parseToc($('data')[0])])
    return [parseToc($('data')[0])]


}

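The result contains only the <item> and <book> elements from the original toc.xml file. I expected the <book> elements to contain children as well, namely the children of the <data> tag in toc2.js, toc3.js, and so on.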

Can someone help me figure out what is going wrong here? Thanks.

1 Answer:

Answer 0 (score: 0)
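
For context on why the question's result is missing the nested children: `parseToc` assigns `children` inside a `.then` callback, but it returns `obj` before that callback runs, and in that branch `obj` never receives `url`, `name`, or `children`. The following is a minimal sketch of the timing issue only; `fakeGetXml` is a hypothetical stand-in for `lib.getXml` and is not part of the original code.

```javascript
// Hypothetical stand-in for lib.getXml: resolves asynchronously.
const fakeGetXml = () => Promise.resolve(['child from toc2.js']);

function brokenParse() {
    let children = [];
    // The callback runs only after the current synchronous code has finished.
    fakeGetXml().then(x => { children = x; });
    return children; // still the empty array at this point
}

function fixedParse() {
    // Return the promise so the caller can wait for the loaded children.
    return fakeGetXml().then(x => x);
}

const brokenResult = brokenParse();
console.log(brokenResult.length); // 0
fixedParse().then(children => console.log(children.length)); // 1
```

Any value that depends on a promise has to be returned as a promise as well, which is what the approach in this answer does.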

Here is how you can do this kind of thing with promises:

var buildTOC = function(xml, srcPath){
    const parseToc = function(toc){
        const children = ...; // create the children url list here
        // load each child's XML and build its subtree; buildTOC is called once per loaded file
        const childrenToc = children.length===0 ? Promise.resolve([]) : Promise.all(children.map(childUrl => load(childUrl).then(childXml => buildTOC(childXml, srcPath))));
        // here, load would be your lib.getXml function
        return childrenToc.then(childrenTocsList => ({
            url: toc.attribs.url,
            name: toc.attribs.name,
            children: childrenTocsList
        }));
    }
    var $ = this.buildDom(xml, {xmlMode: true})  // RETURNS A CHEERIO DOM
    return parseToc($('data')[0]);
}

//Usage:
buildTOC(myXML, mySrcPath).then(result => console.log(result));

You just need to create the children list and the load function to suit your needs.
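
One way the children list and the load function might be filled in is sketched below. It is an illustration under stated assumptions, not a drop-in implementation: the nodes are plain objects shaped like the `{ attribs, children }` elements used in the question, `files` is an in-memory stand-in for the toc files, and `loadRoot` is a hypothetical loader that would, in real code, call `lib.getXml` and return the parsed `<data>` element. The `loaded` set plays the role of `loadedPaths` in the question.

```javascript
// In-memory stand-in for the toc files (assumption for this sketch).
const files = {
    'toc.js': { attribs: { src: 'toc.js', name: 'Root', url: 'root.htm' }, children: [
        { attribs: { name: 'Item A', url: 'a.htm' }, children: [] },
        { attribs: { src: 'toc2.js', name: 'Book B', url: 'b.htm' }, children: [] },
    ] },
    'toc2.js': { attribs: { src: 'toc2.js', name: 'Book B', url: 'b.htm' }, children: [
        { attribs: { name: 'Item B1', url: 'b1.htm' }, children: [] },
    ] },
};

// Hypothetical loader: resolves with the root <data> node of the referenced file.
const loadRoot = path => Promise.resolve(files[path]);

// Returns a promise for { name, url, children }.
function parseToc(node, loaded) {
    const src = node.attribs.src;
    let childNodes;
    if (src && !loaded.has(src)) {
        // Unseen reference: take the children from the referenced file.
        loaded.add(src);
        childNodes = loadRoot(src).then(root => root.children);
    } else {
        // No reference (or already loaded): use the inline children.
        childNodes = Promise.resolve(node.children);
    }
    return childNodes
        // Skip nodes without attributes (e.g. text nodes), then recurse in parallel.
        .then(nodes => Promise.all(nodes.filter(n => n.attribs).map(n => parseToc(n, loaded))))
        .then(children => ({ name: node.attribs.name, url: node.attribs.url, children }));
}

// The root path is marked as loaded so the root's own src is not fetched again.
const buildTOC = rootPath =>
    loadRoot(rootPath).then(root => parseToc(root, new Set([rootPath])));

buildTOC('toc.js').then(tree => console.log(JSON.stringify(tree, null, 2)));
```

Because every call returns a promise and the recursion goes through `Promise.all`, the outer promise resolves only after all referenced files have been loaded. Note that with the `loaded` guard, a file referenced a second time is not expanded again.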