How do I remove duplicate XML element values in an XDocument?

Asked: 2011-05-19 15:06:00

Tags: c# xml linq-to-xml

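I have the following code. I would like to be able to check for and remove duplicate element values contained in the StateRequestRecordGUID element. Here is a sample XML file that needs to be corrected.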

<?xml version="1.0"?>
<StateSeparationRequestCollection xsi:schemaLocation="https://uidataexchange.org/schemas SeparationRequest.xsd" xmlns="https://uidataexchange.org/schemas" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <StateSeparationRequest>
        <StateRequestRecordGUID>30000000000000000000000000004000</StateRequestRecordGUID>
        <SSN>999999999</SSN>
        <ClaimEffectiveDate>2007-06-04</ClaimEffectiveDate>
        <ClaimNumber>012345678901234567</ClaimNumber>
        <StateEmployerAccountNbr>01234567890123456789</StateEmployerAccountNbr>
        <EmployerName>JC PENNEY COMPANY INC 234567890123456789012345678901234567890123456789012345678901234567890123456789</EmployerName>
        <FEIN>794741844</FEIN>
        <TypeofEmployerCode>1</TypeofEmployerCode>
        <TypeofClaimCode>1</TypeofClaimCode>
        <BenefitYearBeginDate>2007-06-04</BenefitYearBeginDate>
        <RequestingStateAbbreviation>ST</RequestingStateAbbreviation>
        <UIOfficeName>Park Oaks 012345678901234</UIOfficeName>
        <UIOfficePhone>6085264400</UIOfficePhone>
        <UIOfficeFax>6085269394</UIOfficeFax>
        <ClaimantLastName>SMITH-678901234567890123456789</ClaimantLastName>
        <OtherLastName>WILLIAMS-901234567890123456789</OtherLastName>
        <ClaimantFirstName>JOHN-56789012345678901234</ClaimantFirstName>
        <ClaimantMiddleInitial>T</ClaimantMiddleInitial>
        <ClaimantSuffix>Jr.-4567</ClaimantSuffix>
        <ClaimantJobTitle>Manager-8901234567890123456789</ClaimantJobTitle>
        <ClaimantReportedFirstDayofWork>2006-01-04</ClaimantReportedFirstDayofWork>
        <ClaimantReportedLastDayofWork>2007-05-31</ClaimantReportedLastDayofWork>
        <WagesWeeksNeededCode>WO</WagesWeeksNeededCode>
        <WagesNeededBeginDate>2005-05-01</WagesNeededBeginDate>
        <WagesNeededEndDate>2005-05-30</WagesNeededEndDate>
        <ClaimantSepReasonCode>1</ClaimantSepReasonCode>
        <ClaimantSepReasonComments>AAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAA</ClaimantSepReasonComments>
        <ReturntoWorkDate>2010-01-01</ReturntoWorkDate>
        <RequestDate>2006-06-07</RequestDate>
        <ResponseDueDate>2006-06-17</ResponseDueDate>
        <FormNumber>606C</FormNumber>
    </StateSeparationRequest>
    <StateSeparationRequest>
        <StateRequestRecordGUID>30000000000000000000000000004000</StateRequestRecordGUID>
        <SSN>999999999</SSN>
        <ClaimEffectiveDate>2007-06-04</ClaimEffectiveDate>
        <ClaimNumber>012345678901234567</ClaimNumber>
        <StateEmployerAccountNbr>01234567890123456789</StateEmployerAccountNbr>
        <EmployerName>JC PENNEY COMPANY INC 234567890123456789012345678901234567890123456789012345678901234567890123456789</EmployerName>
        <FEIN>794741844</FEIN>
        <TypeofEmployerCode>1</TypeofEmployerCode>
        <TypeofClaimCode>1</TypeofClaimCode>
        <BenefitYearBeginDate>2007-06-04</BenefitYearBeginDate>
        <RequestingStateAbbreviation>ST</RequestingStateAbbreviation>
        <UIOfficeName>Park Oaks 012345678901234</UIOfficeName>
        <UIOfficePhone>6085264400</UIOfficePhone>
        <UIOfficeFax>6085269394</UIOfficeFax>
        <ClaimantLastName>SMITH-678901234567890123456789</ClaimantLastName>
        <OtherLastName>WILLIAMS-901234567890123456789</OtherLastName>
        <ClaimantFirstName>JOHN-56789012345678901234</ClaimantFirstName>
        <ClaimantMiddleInitial>T</ClaimantMiddleInitial>
        <ClaimantSuffix>Jr.-4567</ClaimantSuffix>
        <ClaimantJobTitle>Manager-8901234567890123456789</ClaimantJobTitle>
        <ClaimantReportedFirstDayofWork>2006-01-04</ClaimantReportedFirstDayofWork>
        <ClaimantReportedLastDayofWork>2007-05-31</ClaimantReportedLastDayofWork>
        <WagesWeeksNeededCode>WO</WagesWeeksNeededCode>
        <WagesNeededBeginDate>2005-05-01</WagesNeededBeginDate>
        <WagesNeededEndDate>2005-05-30</WagesNeededEndDate>
        <ClaimantSepReasonCode>1</ClaimantSepReasonCode>
        <ClaimantSepReasonComments>AAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAA</ClaimantSepReasonComments>
        <ReturntoWorkDate>2010-01-01</ReturntoWorkDate>
        <RequestDate>2006-06-07</RequestDate>
        <ResponseDueDate>2006-06-17</ResponseDueDate>
        <FormNumber>606C</FormNumber>
    </StateSeparationRequest>
</StateSeparationRequestCollection>

This would be the corrected XML:

<?xml version="1.0"?>
<StateSeparationRequestCollection xsi:schemaLocation="https://uidataexchange.org/schemas SeparationRequest.xsd" xmlns="https://uidataexchange.org/schemas" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <StateSeparationRequest>
        <StateRequestRecordGUID>30000000000000000000000000004000</StateRequestRecordGUID>
        <SSN>999999999</SSN>
        <ClaimEffectiveDate>2007-06-04</ClaimEffectiveDate>
        <ClaimNumber>012345678901234567</ClaimNumber>
        <StateEmployerAccountNbr>01234567890123456789</StateEmployerAccountNbr>
        <EmployerName>JC PENNEY COMPANY INC 234567890123456789012345678901234567890123456789012345678901234567890123456789</EmployerName>
        <FEIN>794741844</FEIN>
        <TypeofEmployerCode>1</TypeofEmployerCode>
        <TypeofClaimCode>1</TypeofClaimCode>
        <BenefitYearBeginDate>2007-06-04</BenefitYearBeginDate>
        <RequestingStateAbbreviation>ST</RequestingStateAbbreviation>
        <UIOfficeName>Park Oaks 012345678901234</UIOfficeName>
        <UIOfficePhone>6085264400</UIOfficePhone>
        <UIOfficeFax>6085269394</UIOfficeFax>
        <ClaimantLastName>SMITH-678901234567890123456789</ClaimantLastName>
        <OtherLastName>WILLIAMS-901234567890123456789</OtherLastName>
        <ClaimantFirstName>JOHN-56789012345678901234</ClaimantFirstName>
        <ClaimantMiddleInitial>T</ClaimantMiddleInitial>
        <ClaimantSuffix>Jr.-4567</ClaimantSuffix>
        <ClaimantJobTitle>Manager-8901234567890123456789</ClaimantJobTitle>
        <ClaimantReportedFirstDayofWork>2006-01-04</ClaimantReportedFirstDayofWork>
        <ClaimantReportedLastDayofWork>2007-05-31</ClaimantReportedLastDayofWork>
        <WagesWeeksNeededCode>WO</WagesWeeksNeededCode>
        <WagesNeededBeginDate>2005-05-01</WagesNeededBeginDate>
        <WagesNeededEndDate>2005-05-30</WagesNeededEndDate>
        <ClaimantSepReasonCode>1</ClaimantSepReasonCode>
        <ClaimantSepReasonComments>AAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAA</ClaimantSepReasonComments>
        <ReturntoWorkDate>2010-01-01</ReturntoWorkDate>
        <RequestDate>2006-06-07</RequestDate>
        <ResponseDueDate>2006-06-17</ResponseDueDate>
        <FormNumber>606C</FormNumber>
    </StateSeparationRequest>
</StateSeparationRequestCollection>

                int TotalCount = ssrcWrapper.EmployerTPASeparationRequestCollection.EmployerTPASeparationRequest.Count();
                string ssrcWrapperString = XmlSerializerUtils.SerializeToXMLstring(ssrcWrapper.EmployerTPASeparationRequestCollection);
                System.IO.StringReader myStringReader = new System.IO.StringReader(ssrcWrapperString);
                XmlReader xmlreader = XmlReader.Create(myStringReader);
                xmlreader.MoveToContent();

                for (int i = 0; i < TotalCount; i++)
                {
                    SqlConnection conn4 = new SqlConnection("Data Source=.\\sqlexpress;Initial Catalog=test_BdbCSSQL01;Persist Security Info=False;Integrated Security=SSPI;");
                    conn4.Open();
                    string sql = "SELECT * FROM SIDESStagingIN";
                    SqlDataAdapter da = new SqlDataAdapter(sql, conn4);
                    DataTable dt = new DataTable();
                    da.Fill(dt);
                    DataRow dr;
                    dr = dt.NewRow();
                    dt.Rows.Add(dr);

                    XDocument doc = XDocument.Load(xmlreader);
                    XNamespace ns = "https://uidataexchange.org/schemas";
                    var node = doc.Descendants(ns + "EmployerTPASeparationRequest");
                    var node2 = node.ElementAt(i);
                    string _StateRequestRecordGUID = "";

                    foreach (var element in node2.Elements())
                    {
                        if (element.Name.LocalName == "StateRequestRecordGUID")
                        {
                            _StateRequestRecordGUID = element.Value;

                        }
                        if (element.Name.LocalName == "AttachmentOccurrence")
                        {
                            //ZJR: TODO: Create new XDoc and write values to dbo.SIDESAttachmentIN
                            SqlConnection conn5 = new SqlConnection("Data Source=.\\sqlexpress;Initial Catalog=test_BdbCSSQL01;Persist Security Info=False;Integrated Security=SSPI;");
                            conn5.Open();
                            string sql2 = "SELECT * FROM SIDESAttachmentIN";
                            SqlDataAdapter da2 = new SqlDataAdapter(sql2, conn5);
                            DataTable dt2 = new DataTable();
                            da2.Fill(dt2);
                            DataRow dr2;
                            dr2 = dt2.NewRow();
                            dt2.Rows.Add(dr2);
                            dr2["AttachmentID"] = _StateRequestRecordGUID;
                            var attachmentNode = doc.Descendants(ns + "AttachmentOccurrence");

                            foreach (var attachmentElement in attachmentNode.Elements())
                            {
                                dr2[attachmentElement.Name.LocalName] = attachmentElement.Value;
                            }
                            dr["AttachmentID"] = _StateRequestRecordGUID;
                            SqlCommandBuilder sb2 = new SqlCommandBuilder(da2);
                            da2.Update(dt2);
                            if (conn5 != null) { conn5.Close(); }
                        }
                        if (dr.Table.Columns.Contains(element.Name.LocalName))
                        {
                            dr[element.Name.LocalName] = element.Value;
                        }
                    }
                    SqlCommandBuilder sb = new SqlCommandBuilder(da);
                    da.Update(dt);
                    if (conn4 != null) { conn4.Close(); }
                }

1 Answer:

Answer 0 (score: 4)

Try this:

    var duplicates = (from req in doc.Descendants(ns + "StateSeparationRequest")
                      group req by req.Descendants(ns + "StateRequestRecordGUID").First().Value
                      into g
                      where g.Count() > 1
                      select g.Skip(1)).SelectMany( elements => elements );
    foreach (var duplicate in duplicates)
    {
        duplicate.Remove();
    }

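This query essentially does the following:

It groups the StateSeparationRequest elements by their StateRequestRecordGUID value. For each group that contains more than one StateSeparationRequest, it selects every element except the first. What remains is the list of duplicates, which you can iterate over and remove.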
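For reference, the same grouping idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the class name `DuplicateRemover` and the method name `RemoveDuplicateRequests` are made up for illustration, the GUID is assumed to be a direct child of each StateSeparationRequest (as in the sample XML), and requests without a GUID are simply left alone. The `ToList()` call is a defensive addition so that the tree is not modified while the query is still being enumerated.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DuplicateRemover
{
    // Namespace taken from the sample XML in the question.
    private static readonly XNamespace Ns = "https://uidataexchange.org/schemas";

    // Hypothetical helper: removes every StateSeparationRequest whose
    // StateRequestRecordGUID value already appeared earlier in the document.
    // Returns the number of elements that were removed.
    public static int RemoveDuplicateRequests(XDocument doc)
    {
        XName guidName = Ns + "StateRequestRecordGUID";

        var duplicates = doc.Descendants(Ns + "StateSeparationRequest")
            // Assumption: requests that have no GUID element are not touched.
            .Where(req => req.Element(guidName) != null)
            // Group the requests by GUID value.
            .GroupBy(req => req.Element(guidName).Value)
            // Keep the first request of each group, select the rest.
            .SelectMany(g => g.Skip(1))
            // Snapshot the result before mutating the tree.
            .ToList();

        foreach (var duplicate in duplicates)
        {
            duplicate.Remove();
        }

        return duplicates.Count;
    }
}
```

In the question's code, a call such as `DuplicateRemover.RemoveDuplicateRequests(doc)` would presumably go right after the document is loaded and before the rows are written to the database, with the element name adjusted if the real documents use EmployerTPASeparationRequest rather than StateSeparationRequest.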