How can I speed up this XML query?

Asked: 2014-10-03 13:04:33

Tags: c# xml xpath

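I have spent 3 days reading this XML file and putting the details into a database. It works as it should, but I know the way I am reading the XML file is not right.

If the XML file is larger than 2 MB (about 1,000 records), it takes more than a minute to load.

Could you tell me how to reduce the query time?

Here is the XML: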

<?xml version="1.0" encoding="UTF-8"?>
<outputTree xmlns="http://www.ibm.com/software/analytics/spss/xml/oms" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/software/analytics/spss/xml/oms http://www.ibm.com/software/analytics/spss/xml/oms/spss-output-1.8.xsd">
    <command command="Summarize" displayOutlineValues="label" displayOutlineVariables="label" displayTableValues="label" displayTableVariables="label" lang="en" text="Summarize">
        <pivotTable subType="Report" text="Batch">
            <dimension axis="row" text="Cases">
                <group label="Test Site" text="Test Site" varName="PLANT_DESC" variable="true">
                    <group hide="true" text="A">
                        <group string="A" text="A" varName="PLANT_DESC">
                            <group label="Product" text="Product" varName="PROD_DESC" variable="true">
                                <group hide="true" text="A">
                                    <group string="S" text="S" varName="PROD_DESC">
                                        <group label="Batch Number" text="Batch Number" varName="BATCH_NO" variable="true">
                                            <group hide="true" text="A">
                                                <group number="3704542" text="3704542" varName="BATCH_NO">
                                                    <category number="1" text="1">
                                                        <dimension axis="column" text="Variables">
                                                            <category label="Batch Run" text="Batch Run" varName="BATCH_RUN_ID" variable="true">
                                                                <cell number="4202" text="4202" varName="BATCH_RUN_ID"/>
                                                            </category>
                                                            <category label="Application" text="Application" varName="APP_ID" variable="true">
                                                                <cell label="Calibration" number="101" text="Calibration" varName="APP_ID"/>
                                                            </category>
                                                            <category label="Date Tested" text="Date Tested" varName="TEST_DATE" variable="true">
                                                                <cell date="2014-09-23T10:53:19" format="date" text="23-SEP-2014" varName="TEST_DATE"/>
                                                            </category>
                                                        </dimension>
                                                    </category>
                                                </group>            
                                            </group>
                                        </group>
                                    </group>                                            
                                </group>
                            </group>
                        </group>
                    </group>
                </group>
            </dimension>
        </pivotTable>
    </command>
</outputTree>

Here is the C#:

XElement root = XElement.Load(Page.Server.MapPath(@"oril.xml"));
XNamespace ad = "http://www.ibm.com/software/analytics/spss/xml/oms";

var cats = from cat in root.Descendants(ad + "dimension").Where
               (cat => (string)cat.Attribute("axis") == "column" && (string)cat.Attribute("text") == "Variables")

           select new
           {
               BATCH_NO = cat.Parent.Parent.Attribute("number").Value,
               RUN_NO = cat.Parent.Attribute("number").Value,

               //// 1
               BATCH_RUN_ID = cat.Descendants(ad + "category").Elements(ad + "cell")
                    .Where(a => (string)a.Attribute("varName") == "BATCH_RUN_ID")
                    .Select(c => c.Attribute("number").Value),

               //// 2
               APP_ID = cat.Descendants(ad + "category").Elements(ad + "cell")
                    .Where(a => (string)a.Attribute("varName") == "APP_ID")
                    .Select(c => c.Attribute("label").Value),

               //// 3
               TEST_DATE = cat.Descendants(ad + "category").Elements(ad + "cell")
                       .Where(a => (string)a.Attribute("varName") == "TEST_DATE")
                       .Select(c => c.Attribute("date").Value),
               ////
               //// Another 12
               ////
           };

foreach (var cat in cats)
{
    foreach (string s in cat.BATCH_RUN_ID)
    {
        xmlTitle.Text += "BATCH_NO: " + cat.BATCH_NO + " </br>";
        xmlTitle.Text += "RUN_NO: " + cat.RUN_NO + " </br>";
        xmlTitle.Text += "BATCH_RUN_ID: " + s + " </br>";
    }

    foreach (string s in cat.APP_ID)
    {
        xmlTitle.Text += "APP_ID: " + s + " </br>";
        i_APP_ID = s;
    }
    foreach (string s in cat.TEST_DATE)
    {
        xmlTitle.Text += "TEST_DATE: " + s + " </br>";
        i_TEST_DATE = s;
    }
    foreach (string s in cat.CB_USED)
    {
        xmlTitle.Text += "CB_USED: " + s + " </br>";
        i_CB_USED = s;
    }
    ////
    //// Another 12
    ////
}

2 answers:

Answer 0 (score: 2)

You could use objects, since this is an object-oriented language, to ease some of the pain of all the .Descendants().Elements() calls.

public class Category
{
    public readonly XElement self;
    public readonly XNamespace ns;
    public Category(XNamespace xn, XElement cat) { self = cat; ns = xn; }
    public string Name { get { return (string)self.Attribute("varName"); } }
    public Cell Cell { get { return _Cell ?? (_Cell = new Cell(self.Elements(ns+"cell").First())); } }
    Cell _Cell;
}

public class Cell
{
    public readonly XElement self;
    public Cell(XElement cell) { self = cell; }
    public string Name { get { return (string)self.Attribute("varName"); } }
    public string Number { get { return (string)self.Attribute("number"); } }
    public string Date { get { return (string)self.Attribute("date"); } }
    public string Label { get { return (string)self.Attribute("label"); } }
}

public class Dimension
{
    public readonly XElement self;
    public readonly XNamespace ns;
    public Dimension(XNamespace xn, XElement dim) { self = dim; ns = xn; }
    public string Axis { get { return (string)self.Attribute("axis"); } }
    public string Text { get { return (string)self.Attribute("text"); } }
    public string BatchNo { get { return self.Parent.Parent.Attribute("number").Value; } }
    public string RunNo { get { return self.Parent.Attribute("number").Value; } }
    public Category[] Categories
    { get { return _Categories ?? (_Categories = self.Elements(ns + "category")
                             .Select(cat => new Category(ns, cat))
                             .ToArray()); }
    }
    Category[] _Categories;
}

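Then use it with root and ad as defined in the post. If nothing else, it is more readable, but it should also be faster, because once the Cell has been created in a Category it does not need to be found again on every call. The same goes for each Category in a Dimension.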

var dims = root.Descendants(ad + "dimension")
               .Select(dim => new Dimension(ad, dim))
               .Where(Dim => Dim.Axis == "column" && Dim.Text == "Variables");
var cats = dims.Select(dim => new
{
    BATCH_NO = dim.BatchNo,
    RUN_NO = dim.RunNo,

    //// 1
    BATCH_RUN_ID = dim.Categories
                      .Where(cat => cat.Name == "BATCH_RUN_ID")
                      .Select(cat => cat.Cell.Number),
    //// 2
    APP_ID = dim.Categories
                      .Where(cat => cat.Name == "APP_ID")
                      .Select(cat => cat.Cell.Label),

    //// etc
});

P.S. I typed this by hand, so it may not compile straight away, but any fix should be as simple as adding a missing ;
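Along the same lines, here is a minimal sketch of a different technique from the one in this answer: building one varName-to-cell dictionary per "Variables" dimension, so that each of the 15 fields becomes a dictionary lookup instead of a separate scan. The class name CellLookupSketch, the helpers ReadRows and Attr, and the trimmed SampleXml are all hypothetical names made up for this illustration. Whether repeated traversal is the main cost in the question has not been measured here, so treat this as something to profile rather than a guaranteed fix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CellLookupSketch
{
    static readonly XNamespace Ns = "http://www.ibm.com/software/analytics/spss/xml/oms";

    // Trimmed-down sample in the same shape as the XML from the question (assumption:
    // only the elements needed for the demo are kept).
    public const string SampleXml =
        @"<outputTree xmlns='http://www.ibm.com/software/analytics/spss/xml/oms'>
            <group number='3704542'>
              <category number='1'>
                <dimension axis='column' text='Variables'>
                  <category varName='BATCH_RUN_ID'><cell number='4202' varName='BATCH_RUN_ID'/></category>
                  <category varName='APP_ID'><cell label='Calibration' number='101' varName='APP_ID'/></category>
                  <category varName='TEST_DATE'><cell date='2014-09-23T10:53:19' varName='TEST_DATE'/></category>
                </dimension>
              </category>
            </group>
          </outputTree>";

    // Returns one dictionary per "Variables" dimension, keyed by the cell's varName.
    // Each dimension's cells are enumerated once; later field reads are lookups.
    public static IEnumerable<Dictionary<string, XElement>> ReadRows(XElement root)
    {
        return root.Descendants(Ns + "dimension")
            .Where(d => (string)d.Attribute("axis") == "column"
                     && (string)d.Attribute("text") == "Variables")
            .Select(d => d.Elements(Ns + "category")
                          .Elements(Ns + "cell")
                          // Group first so a duplicated or missing varName cannot throw.
                          .GroupBy(c => (string)c.Attribute("varName") ?? "")
                          .ToDictionary(g => g.Key, g => g.First()));
    }

    // Reads one attribute of the cell for varName, or null if either is absent.
    public static string Attr(Dictionary<string, XElement> row, string varName, string attribute)
    {
        XElement cell;
        return row.TryGetValue(varName, out cell) ? (string)cell.Attribute(attribute) : null;
    }

    public static void Main()
    {
        XElement root = XElement.Parse(SampleXml);
        foreach (var row in ReadRows(root))
        {
            Console.WriteLine("BATCH_RUN_ID: " + Attr(row, "BATCH_RUN_ID", "number"));
            Console.WriteLine("APP_ID: " + Attr(row, "APP_ID", "label"));
            Console.WriteLine("TEST_DATE: " + Attr(row, "TEST_DATE", "date"));
        }
    }
}
```

Because Attr returns null for a missing cell or attribute, a record that lacks one of the fields does not throw the way .Attribute("label").Value would.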

Answer 1 (score: 1)

First,

when you need to concatenate a lot of strings inside loops, use a StringBuilder instead of repeated string concatenation.

Example:
StringBuilder sb = new StringBuilder();

foreach (var cat in cats)
{
    foreach (string s in cat.BATCH_RUN_ID)
    {
        //xmlTitle.Text += "BATCH_NO: " + cat.BATCH_NO + " </br>";
        sb.Append("BATCH_NO: ");
        sb.Append(cat.BATCH_NO);
        sb.Append(" </br>");
        // and so on for the other fields, without using String + String
    }
}

// after the loop, assign the result to xmlTitle.Text once
xmlTitle.Text = sb.ToString();
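For something that can be run outside the page, the same idea is sketched below as a console program. The Row class and the Render method are hypothetical stand-ins for the anonymous type and the xmlTitle control in the question, and the " </br>" separator is kept exactly as the question writes it.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StringBuilderSketch
{
    // Hypothetical stand-in for one projected row of the question's query.
    public sealed class Row
    {
        public string BatchNo;
        public string RunNo;
        public List<string> BatchRunIds = new List<string>();
    }

    // Builds the whole output in one StringBuilder instead of growing a string
    // (or a control's Text property) with += on every iteration.
    public static string Render(IEnumerable<Row> rows)
    {
        var sb = new StringBuilder();
        foreach (Row row in rows)
        {
            foreach (string id in row.BatchRunIds)
            {
                sb.Append("BATCH_NO: ").Append(row.BatchNo).Append(" </br>");
                sb.Append("RUN_NO: ").Append(row.RunNo).Append(" </br>");
                sb.Append("BATCH_RUN_ID: ").Append(id).Append(" </br>");
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var rows = new List<Row>
        {
            new Row { BatchNo = "3704542", RunNo = "1", BatchRunIds = { "4202" } }
        };

        // In the page from the question this would be: xmlTitle.Text = Render(rows);
        Console.WriteLine(Render(rows));
    }
}
```

Each += on a string copies everything accumulated so far, so the total work grows roughly quadratically with the output size, while StringBuilder appends into a growable buffer and produces the final string once.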