Reading multiple XML tables (under the same root node) into DataTables / a DataSet

Asked: 2012-11-16 07:46:56

Tags: c# xml linq linq-to-xml

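I have an XML source document with multiple "report" nodes under the Root node. I need to read each "report" node into its own DataTable. It looks like I either need an XSL stylesheet to transform my source XML into a shape that loads cleanly, or I have to iterate over the XML elements myself, like this: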

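using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Xml.Linq;

namespace XmlParse2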
{
    class Program
    {
        static IEnumerable<string> expectedFields = new List<string>() { "Field1", "Field2", "Field3", "Field4" };

        static void Main(string[] args)
        {
            string xml = @"<Root>
                             <Report1>
                               <Row>
                                 <Field1>data1-1</Field1>
                                 <Field2>data1-2</Field2>
                                 <Field4>data1-4</Field4>
                               </Row>
                               <Row>
                                 <Field1>data2-1</Field1>
                                 <Field2>data2-2</Field2>
                               </Row>
                             </Report1>
                             <Report2>
                               <Row>
                                 <Field1>data1-1</Field1>
                                 <Field4>data1-4</Field4>
                               </Row>
                               <Row>
                                 <Field1>data2-1</Field1>
                                 <Field3>data2-3</Field3>
                               </Row>
                             </Report2>
                           </Root>";

            DataTable report1 = new DataTable("Report1");
            report1.Columns.Add("Field1");
            report1.Columns.Add("Field2");
            report1.Columns.Add("Field3");
            report1.Columns.Add("Field4");

            DataTable report2 = new DataTable("Report2");
            report2.Columns.Add("Field1");
            report2.Columns.Add("Field2");
            report2.Columns.Add("Field3");
            report2.Columns.Add("Field4");

            var doc = XDocument.Parse(xml);
            var report1Data = doc.Root.Elements("Report1").Elements("Row").Select(record => MapRecord(record));
            var report2Data = doc.Root.Elements("Report2").Elements("Row").Select(record => MapRecord(record));

            report1 = addRows(report1, report1Data);
            report2 = addRows(report2, report2Data);

            Console.ReadLine();
        }

        public static Dictionary<string, string> MapRecord(XElement element)
        {
            var output = new Dictionary<string, string>();
            foreach (var field in expectedFields)
            {
                bool hasField = element.Elements(field).Any();
                if (hasField)
                {
                    output.Add(field, element.Elements(field).First().Value);
                }
            }
            return output;
        }

        public static DataTable addRows(DataTable table, IEnumerable<Dictionary<string, string>> data)
        {
            foreach (Dictionary<string, string> dict in data)
            {
                DataRow row = table.NewRow();

                foreach(var item in dict) 
                {
                    row[item.Key] = item.Value;
                }

                table.Rows.Add(row);
            }

            return table;
        }
    }
}

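The problem with my source data seems to be that Report1 and Report2 both have child nodes named "Row". What I tried with DataSet.ReadXml did not work: my code just lumped all the nodes named Row together into a single DataTable instead of separate DataTables. :/

What am I missing?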

1 Answer:

Answer 0: (score: 1)

XDocument xdoc = XDocument.Load(path_to_xml);
var tables = xdoc.Root.Elements()
                 .Select(report => {
                     DataTable table = new DataTable(report.Name.LocalName);
                     var fields = report
                            .Descendants("Row")
                            .SelectMany(row => row.Elements()
                                                  .Select(e => e.Name.LocalName))
                            .Distinct();

                     foreach(string field in fields)
                         table.Columns.Add(field);

                     foreach(var row in report.Descendants("Row"))
                     {
                         DataRow dr = table.NewRow();
                         foreach(var field in row.Elements())
                             dr[field.Name.LocalName] = (string)field;
                         table.Rows.Add(dr);
                     }                                   

                     return table;
                });

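This query returns an IEnumerable<DataTable>. Each DataTable contains only the columns that actually have values in the XML. The column names are taken from the XML, so they can differ from table to table. For your sample structure the result looks like this: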

DataTable: Report1
  Columns: Field1, Field2, Field4

DataTable: Report2
  Columns: Field1, Field3, Field4

All row data is added to the corresponding table.
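
Note that the query is evaluated lazily, so every enumeration of `tables` builds fresh DataTable instances. If you want a DataSet, as the question title suggests, one option is to build the tables once and add them to it. The following is only a minimal sketch: `ReportLoader`, `BuildDataSet` and the DataSet name "Reports" are illustrative names I made up, and it assumes the Row elements are direct children of each report element (so it uses Elements("Row") rather than Descendants("Row")).

```csharp
using System;
using System.Data;
using System.Linq;
using System.Xml.Linq;

static class ReportLoader
{
    // Hypothetical helper: builds one DataTable per child element of the root
    // and collects all of them in a single DataSet.
    public static DataSet BuildDataSet(XDocument xdoc)
    {
        var dataSet = new DataSet("Reports");

        foreach (XElement report in xdoc.Root.Elements())
        {
            var table = new DataTable(report.Name.LocalName);

            // Columns: the distinct field names that occur in this report's rows.
            var fields = report.Elements("Row")
                               .SelectMany(row => row.Elements())
                               .Select(e => e.Name.LocalName)
                               .Distinct();
            foreach (string field in fields)
                table.Columns.Add(field);

            // Rows: fields missing from a Row are simply left unset (DBNull).
            foreach (XElement row in report.Elements("Row"))
            {
                DataRow dr = table.NewRow();
                foreach (XElement field in row.Elements())
                    dr[field.Name.LocalName] = (string)field;
                table.Rows.Add(dr);
            }

            // Throws DuplicateNameException if two reports share an element name.
            dataSet.Tables.Add(table);
        }

        return dataSet;
    }
}
```

Usage would be along the lines of `DataSet ds = ReportLoader.BuildDataSet(XDocument.Load(path_to_xml));`, after which `ds.Tables["Report1"]` and `ds.Tables["Report2"]` are separate tables even though both reports contain elements named Row.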


You can extract some of the code into methods. That makes the code easier to understand:

XDocument xdoc = XDocument.Load(path_to_xml);
var tables = xdoc.Root.Elements()
                 .Select(report => CreateTableFrom(report));

Methods:

private static DataTable CreateTableFrom(XElement report)
{
    DataTable table = new DataTable(report.Name.LocalName);
    table.Columns.AddRange(GetColumnsOf(report));

    foreach (var row in report.Descendants("Row"))
    {
        DataRow dr = table.NewRow();
        foreach (var field in row.Elements())
            dr[field.Name.LocalName] = (string)field;
        table.Rows.Add(dr);
    }

    return table;
}

private static DataColumn[] GetColumnsOf(XElement report)
{
    return report.Descendants("Row")
                 .SelectMany(row => row.Elements().Select(e => e.Name.LocalName))
                 .Distinct()
                 .Select(field => new DataColumn(field))
                 .ToArray();
}
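
Because each table gets the union of the field names found in its rows, a row that lacks one of those fields ends up with DBNull in that column. When reading the tables back you may want to check for that explicitly. A small sketch, assuming string columns as created by Columns.Add(name) above; `TableDump` and `Describe` are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class TableDump
{
    // Hypothetical helper: renders a row as "Column=value" pairs and shows
    // "(null)" for columns the source Row element did not contain.
    public static string Describe(DataRow row)
    {
        var parts = new List<string>();
        foreach (DataColumn column in row.Table.Columns)
        {
            string value = row.IsNull(column)
                ? "(null)"
                : Convert.ToString(row[column]);
            parts.Add(column.ColumnName + "=" + value);
        }
        return string.Join(", ", parts);
    }
}
```

For example, calling `TableDump.Describe(row)` on a row of a table with columns Field1, Field2 and Field4, where only Field1 and Field2 were set, yields a string in which Field4 is shown as "(null)".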