将多个XML文件中的某些数据提取到excel中

时间:2016-05-15 17:49:59

标签: xml excel vba

我有超过1000个XML文件,所有文件都有不同的数据,但标签相同

<Title>
    <LocalTitle>12 Years a Slave</LocalTitle>
    <OriginalTitle>12 Years a Slave</OriginalTitle>
    <ProductionYear>2013</ProductionYear>
    <Added>23/05/2015 15:42:03</Added>
    <IMDBrating>8,1</IMDBrating>
    <ContentRating>R</ContentRating>
    <MPAARating>R</MPAARating>
    <IMDB>tt2024544</IMDB>
    <IMDbId>tt2024544</IMDbId>
    <TMDbId>76203</TMDbId>
    <Budget>20000000</Budget>
    <Revenue>187</Revenue>
    <Language>English</Language>
    <LanguageCode>en</LanguageCode>
    <Country>GB</Country>
    <RunningTime>134</RunningTime>
    <Overview>In the antebellum United States,</Overview>
    <Genres>
        <Genre>Biography</Genre>
        <Genre>Drama</Genre>
        <Genre>History</Genre>
    </Genres>
</Title>

我只需要将所有XML文件中的某些标记提取到excel文件中(例如只有LocalTitleProductionYearIMDBRatingIMDBId)我该怎么办?这样做而不花费数小时复制和粘贴?

1 个答案:

答案 0 :(得分:0)

重新考虑XSLT。 Excel VBA的MSXML对象维护着一个XSLT 1.0处理器。此外,Excel可以使用OpenXML方法将转换后的XML直接导入工作簿:

XSLT (外部保存为.xsl以便在下面调用;仅返回所需的标签)

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>

  <xsl:template match="Title">
    <xsl:copy>
      <xsl:apply-templates select="LocalTitle|ProductionYear|IMDBrating|IMDbId"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:transform>

<强> VBA

Public Sub XSLTransform()
On Error GoTo ErrHandle
    ' ADD MSXML v6.0 REFERENCE '
    Dim xmldoc As New MSXML2.DOMDocument60
    Dim xslDoc As New MSXML2.DOMDocument60 
    Dim newDoc As New MSXML2.DOMDocument60
    Dim newwkb As Workbook

    ' LOAD XML AND XSL FILES '
    xslDoc.async = False
    xmldoc.Load "C:\Path\To\Input.xml"

    xslDoc.async = False
    xslDoc.Load "C:\Path\To\XSLTScript.xsl"

    ' TRANSFORM XML '
    xmldoc.transformNodeToObject xslDoc, newDoc
    newDoc.Save "C:\Path\To\Output.xml"

    ' IMPORT TRANSFORMED XML '
    Set newwkb = Workbooks.OpenXML("C:\Path\To\Output.xml", , xlXmlLoadImportToList)

    Exit Sub

ErrHandle:
    MsgBox Err.Number & " - " & Err.Description, vbCritical
    Err.Raise xslDoc.parseError.ErrorCode, , xslDoc.parseError.reason
    Exit Sub

End Sub

XML 输出

<?xml version="1.0" encoding="UTF-8"?>
<Title>
    <LocalTitle>12 Years a Slave</LocalTitle>
    <ProductionYear>2013</ProductionYear>
    <IMDBrating>8,1</IMDBrating>
    <IMDbId>tt2024544</IMDbId>
</Title>