How to parse a complex, recursive 1 GB XML file and store it as CSV using XSLT

Time: 2017-06-19 14:03:36

Tags: java xml xslt

I have sample XML data as shown below:

<?xml version="1.0" encoding="ISO-8859-1"?>
 <FIXML s="2012-04-23" v="FIX.5.0SP2">
  <Batch ID="...">
   <MktDef MktID="XEUR" MktSegID="14" EfctvBizDt="2017-05-11" NxtEfctvBizDt="2017-05-15" MktSeg="CONF" MarketSegmentDesc="FUT 8-13 Y. SWISS GOV.BONDS 6%" Sym="CH0002741988" ParentMktSegmID="FBND" Ccy="CHF" MktSegStat="1" USFirmFlag="Y" PartID="2">
    <BaseTrdgRules QtSideInd="1" FastMktPctg="0">
    .
    .
    </BaseTrdgRules>
  </MktDef>

  <SecDef TxnTm="2016-12-09T07:29:08.483638853">
      <MktSegGrp MktSegID="14">
        <SecTrdgRules>
          <BaseTrdgRules ImpldMktInd="3" MlegModel="0"/>
        </SecTrdgRules>
     </MktSegGrp>
 </SecDef>
 <SecDef>
   <MktSegGrp MktSegID="14">
    <SecTrdgRules>
      <BaseTrdgRules ImpldMktInd="3" MlegModel="0"/>
    </SecTrdgRules>
  </MktSegGrp>
 </SecDef>
 <SecDef>
  <MktSegGrp MktSegID="14">
   <SecTrdgRules>
     <BaseTrdgRules ImpldMktInd="3" MlegModel="0"/>
   </SecTrdgRules>
 </MktSegGrp>
</SecDef>

<MktDef MktID="XEUR" MktSegID="19629" EfctvBizDt="2017-05-11" NxtEfctvBizDt="2017-05-15" MktSeg="FBON" MarketSegmentDesc="EURO BONO FUTURE 8,5-10,5 YEAR" Sym="DE000A163W29" ParentMktSegmID="FBND" Ccy="EUR" MktSegStat="1" USFirmFlag="Y" PartID="2">
     <BaseTrdgRules QtSideInd="1" FastMktPctg="0">

     </BaseTrdgRules>
</MktDef>
 .
 .
 .
 .
 .
</Batch>
</FIXML>

This is my sample XSLT...

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" >

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:text>MktID,MktSegID,TxnTm,PriSetPx,QtSideInd,FastMktPctg,ImpldMktInd,MlegModel</xsl:text>
    <xsl:text>&#xA;</xsl:text>
    <xsl:for-each select="FIXML/Batch">
      <xsl:variable name="mktDef" select="concat(/Batch/MktDef/@MktID,',',/Batch/MktDef/@MktSegID,',',/Batch/SecDef/@TxnTm,',',/Batch/SecDef/@PriSetPx)" />
      <xsl:choose>
        <xsl:when test="Batch">
          <xsl:for-each select="Batch">
            <xsl:value-of select="concat($mktDef, ',',/Batch/MktDef/BaseTrdgRules/@QtSideInd,',',/Batch/MktDef/BaseTrdgRules/@FastMktPctg,',',/Batch/SecDef/MktSegGrp/SecTrdgRules/BaseTrdgRules/@ImpldMktInd,',',/Batch/SecDef/MktSegGrp/SecTrdgRules/BaseTrdgRules/@MlegModel,'&#xA;')"/>    
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="concat($mktDef, ',,,,,&#xA;')"/>    
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

I want to get the attribute data of "BaseTrdgRules" in both MktDef and SecDef, as below:

MktID   MktSegID    TxnTm   PriSetPx    QtSideInd   FastMktPctg ImpldMktInd MlegModel
XEUR    14  158.39  2016-12-09T07:29:08.483638853               
XEUR    14  158.39  2016-12-09T07:29:08.483638853   3   0       
XEUR    14  158.39  2016-12-09T07:29:08.483638853   3   0   

I have written the code using DOM and I am able to parse the XML, but it runs into memory issues, so I need to switch to a parser that can handle large XML files.

Could you please help me with this? Thanks in advance!

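1 answer:

Answer 0 (score: 1)

I can't quite follow your logic, but I think you can use a key to look up the SecDef elements by the MktSegID attribute value of their MktSegGrp child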

<xsl:key name="MktSeg" match="SecDef" use="MktSegGrp/@MktSegID" />

So, for a given MktDef, you would get the matching SecDef elements like so

<xsl:variable name="secDef" select="key('MktSeg', @MktSegID)" />
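For readers coming from the Java side, the key lookup above can be pictured as a map that is built once and then queried per MktDef. The following is only a rough analogy under that assumption, not how any particular XSLT processor implements keys; the class `KeyLookupSketch`, its nested `SecDef` type, and the methods `buildIndex` and `lookup` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyLookupSketch {

    // Hypothetical stand-in for a SecDef element; only the fields used here.
    static final class SecDef {
        final String mktSegId; // value of MktSegGrp/@MktSegID
        final String txnTm;    // value of @TxnTm, may be empty

        SecDef(String mktSegId, String txnTm) {
            this.mktSegId = mktSegId;
            this.txnTm = txnTm;
        }
    }

    // Rough analogue of
    // <xsl:key name="MktSeg" match="SecDef" use="MktSegGrp/@MktSegID"/>:
    // group every SecDef under its MktSegID.
    static Map<String, List<SecDef>> buildIndex(List<SecDef> secDefs) {
        Map<String, List<SecDef>> index = new HashMap<>();
        for (SecDef secDef : secDefs) {
            index.computeIfAbsent(secDef.mktSegId, k -> new ArrayList<>()).add(secDef);
        }
        return index;
    }

    // Rough analogue of key('MktSeg', @MktSegID): an empty list plays
    // the role of an empty node-set when nothing matches.
    static List<SecDef> lookup(Map<String, List<SecDef>> index, String mktSegId) {
        return index.getOrDefault(mktSegId, new ArrayList<>());
    }

    public static void main(String[] args) {
        List<SecDef> secDefs = Arrays.asList(
                new SecDef("14", "2016-12-09T07:29:08.483638853"),
                new SecDef("14", ""),
                new SecDef("14", ""));
        Map<String, List<SecDef>> index = buildIndex(secDefs);
        System.out.println("MktSegID 14 matches: " + lookup(index, "14").size());
        System.out.println("MktSegID 19629 matches: " + lookup(index, "19629").size());
    }
}
```

The point of the analogy is that each MktDef does a single lookup instead of rescanning every SecDef in the document.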

Try this XSLT

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" >

  <xsl:output method="text"/>

  <xsl:key name="MktSeg" match="SecDef" use="MktSegGrp/@MktSegID" />

  <xsl:template match="/">
    <xsl:text>MktID,MktSegID,TxnTm,PriSetPx,QtSideInd,FastMktPctg,ImpldMktInd,MlegModel</xsl:text>
    <xsl:text>&#xA;</xsl:text>
    <xsl:for-each select="FIXML/Batch/MktDef">
      <xsl:variable name="secDef" select="key('MktSeg', @MktSegID)" />

      <xsl:for-each select="BaseTrdgRules">
        <!-- Build the row in two parts so the column order matches the header line -->
        <xsl:variable name="ids" select="concat(../@MktID, ',', ../@MktSegID)" />
        <xsl:variable name="mktRules" select="concat(@QtSideInd, ',', @FastMktPctg)" />
        <xsl:choose>
          <xsl:when test="$secDef">
            <!-- One row per SecDef that shares this MktDef's MktSegID -->
            <xsl:for-each select="$secDef">
              <xsl:variable name="baseTrg" select="MktSegGrp/SecTrdgRules/BaseTrdgRules" />
              <xsl:value-of select="concat($ids, ',', @TxnTm, ',', @PriSetPx, ',', $mktRules, ',', $baseTrg/@ImpldMktInd, ',', $baseTrg/@MlegModel, '&#xA;')"/>
            </xsl:for-each>
          </xsl:when>
          <xsl:otherwise>
            <!-- No matching SecDef: leave the four SecDef columns empty -->
            <xsl:value-of select="concat($ids, ',,,', $mktRules, ',,&#xA;')"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
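
Since the question is tagged java, here is a minimal sketch of how a key-based stylesheet like the one above could be applied with the JDK's built-in `javax.xml.transform` API. The class name `XsltToCsv`, the helper `transform`, and the two shortened sample strings are assumptions made for illustration; the embedded stylesheet is a cut-down variant that only counts the SecDef elements found through the key, not the full stylesheet from the answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltToCsv {

    // Shortened sample input (assumption: same element names as in the question).
    static final String SAMPLE_XML = String.join("\n",
            "<FIXML><Batch>",
            "  <MktDef MktID='XEUR' MktSegID='14'/>",
            "  <SecDef><MktSegGrp MktSegID='14'/></SecDef>",
            "  <SecDef><MktSegGrp MktSegID='14'/></SecDef>",
            "  <MktDef MktID='XEUR' MktSegID='19629'/>",
            "</Batch></FIXML>");

    // Cut-down stylesheet: one line per MktDef with the number of SecDef
    // elements returned by the key lookup.
    static final String SAMPLE_XSLT = String.join("\n",
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
            "  <xsl:output method='text'/>",
            "  <xsl:key name='MktSeg' match='SecDef' use='MktSegGrp/@MktSegID'/>",
            "  <xsl:template match='/'>",
            "    <xsl:for-each select='FIXML/Batch/MktDef'>",
            "      <xsl:value-of select=\"concat(@MktID, ',', @MktSegID, ',', count(key('MktSeg', @MktSegID)), '&#xA;')\"/>",
            "    </xsl:for-each>",
            "  </xsl:template>",
            "</xsl:stylesheet>");

    // Hypothetical helper: applies the stylesheet to the XML and returns the text output.
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Rethrow unchecked to keep the sketch short.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

For files on disk, `new StreamSource(new File(...))` and `new StreamResult(new File(...))` can replace the string readers and writers. Keep in mind that an XSLT 1.0 processor typically builds the whole input tree in memory before applying templates, so a 1 GB input may still need a large heap with this approach.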