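Merging two XElements

Time: 2009-12-12 04:51:35

Tags: vb.net lambda linq-to-xml

I'm not quite sure how to ask this, or whether it even exists, but I need to merge two XElements into a single element, with one taking precedence over the other.

The preference here is VB.NET and LINQ, but any language would help if it shows how to do this without hand-coding the selection and resolution of every single element and attribute.

For example, say I have two elements. Humor me on how different they are.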

1

<HockeyPlayer height="6.0" hand="left">
<Position>Center</Position>
<Idol>Gordie Howe</Idol>
</HockeyPlayer>

2

<HockeyPlayer height="5.9" startinglineup="yes">
<Idol confirmed="yes">Wayne Gretzky</Idol>
</HockeyPlayer>

The merged result would be:

<HockeyPlayer height="6.0" hand="left" startinglineup="yes">
<Position>Center</Position>
<Idol confirmed="yes">Gordie Howe</Idol>
</HockeyPlayer>

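A few things to note: the height attribute value from #1 overrides the one from #2. The hand attribute and its value are simply copied from #1 (it does not exist in #2). The startinglineup attribute and its value are copied from #2 (it does not exist in #1). The Position element from #1 is copied (it does not exist in #2). The Idol element value from #1 overrides the one from #2, but the confirmed attribute from #2 (which does not exist in #1) is copied over.

In short, #1 takes precedence over #2 wherever there is a conflict (meaning both have the same element and/or attribute), and where there is no conflict, the content of both is copied to the final result.

I've tried searching for this but can't seem to find anything, possibly because my search terms are too generic. Any thoughts or solutions (especially with LINQ)?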

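2 answers:

Answer 0 (score: 6)

For the sake of others looking for the same thing (I assume both people who contributed here have long since lost interest): I needed to do something similar but a little more complete. It is still not totally complete, though. As the XML doc comment says, it does not handle non-element content well, but I don't need it to, because my non-element content is either text or unimportant. Feel free to enhance and re-post. Oh, and it's C# 4.0, since that's what I use.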

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

/// <summary>
/// Provides facilities to merge two XElements or XML files. 
/// <para>
/// Where the LHS holds an element with non-element content and the RHS holds 
/// a tree, the LHS non-element content will be applied as text and the RHS 
/// tree ignored. 
/// </para>
/// <para>
/// This does not handle anything other than element and text nodes (in fact, 
/// anything other than an element is treated as text). Thus comments in the 
/// source XML are likely to be lost.
/// </para>
/// </summary>
/// <remarks>You can pass <see cref="XDocument.Root"/> if you have XDocuments 
/// to work with:
/// <code>
/// XDocument mergedDoc = new XDocument(XmlMerging.MergeElements(lhsDoc.Root, rhsDoc.Root));
/// </code></remarks>
public static class XmlMerging
{
    /// <summary>
    /// Produce an XML file that is made up of the unique data from both
    /// the LHS file and the RHS file. Where there are duplicates the LHS will 
    /// be treated as master
    /// </summary>
    /// <param name="lhsPath">XML file to base the merge off. This will override 
    /// the RHS where there are clashes</param>
    /// <param name="rhsPath">XML file to enrich the merge with</param>
    /// <param name="resultPath">The fully qualified file name in which to 
    /// write the resulting merged XML</param>
    /// <param name="options"> Specifies the options to apply when saving. 
    /// Default is <see cref="SaveOptions.OmitDuplicateNamespaces"/></param>
    public static bool TryMergeXmlFiles(string lhsPath, string rhsPath, 
        string resultPath, SaveOptions options = SaveOptions.OmitDuplicateNamespaces)
    {
        try
        {
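            // pass the caller's save options through instead of silently 
            // falling back to the default
            MergeXmlFiles(lhsPath, rhsPath, resultPath, options);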
        }
        catch (Exception)
        {
            // could integrate your logging here
            return false;
        }
        return true;
    }

    /// <summary>
    /// Produce an XML file that is made up of the unique data from both the LHS
    /// file and the RHS file. Where there are duplicates the LHS will be treated 
    /// as master
    /// </summary>
    /// <param name="lhsPath">XML file to base the merge off. This will override 
    /// the RHS where there are clashes</param>
    /// <param name="rhsPath">XML file to enrich the merge with</param>
    /// <param name="resultPath">The fully qualified file name in which to write 
    /// the resulting merged XML</param>
    /// <param name="options"> Specifies the options to apply when saving. 
    /// Default is <see cref="SaveOptions.OmitDuplicateNamespaces"/></param>
    public static void MergeXmlFiles(string lhsPath, string rhsPath, 
        string resultPath, SaveOptions options = SaveOptions.OmitDuplicateNamespaces)
    {
        XElement result = 
            MergeElements(XElement.Load(lhsPath), XElement.Load(rhsPath));
        result.Save(resultPath, options);
    }

    /// <summary>
    /// Produce a resulting <see cref="XElement"/> that is made up of the unique 
    /// data from both the LHS element and the RHS element. Where there are 
    /// duplicates the LHS will be treated as master
    /// </summary>
    /// <param name="lhs">XML Element tree to base the merge off. This will 
    /// override the RHS where there are clashes</param>
    /// <param name="rhs">XML element tree to enrich the merge with</param>
    /// <returns>A merge of the left hand side and right hand side element 
    /// trees treating the LHS as master in conflicts</returns>
    public static XElement MergeElements(XElement lhs, XElement rhs)
    {
        // if either of the sides of the merge are empty then return the other... 
        // if they both are then we return null
        if (rhs == null) return lhs;
        if (lhs == null) return rhs;

        // Otherwise build a new result based on the root of the lhs (again lhs 
        // is taken as master)
        XElement result = new XElement(lhs.Name);

        MergeAttributes(result, lhs.Attributes(), rhs.Attributes());

        // now add the lhs child elements merged to the RHS elements if there are any
        MergeSubElements(result, lhs, rhs);
        return result;
    }

    /// <summary>
    /// Enrich the passed in <see cref="XElement"/> with the contents of both 
    /// attribute collections.
    /// Again where the RHS conflicts with the LHS, the LHS is deemed the master
    /// </summary>
    /// <param name="elementToUpdate">The element to take the merged attribute 
    /// collection</param>
    /// <param name="lhs">The master set of attributes</param>
    /// <param name="rhs">The attributes to enrich the merge</param>
    private static void MergeAttributes(XElement elementToUpdate, 
        IEnumerable<XAttribute> lhs, IEnumerable<XAttribute> rhs)
    {
        // Add in the attribs of the lhs... we will only add new attribs from 
        // the rhs; duplicates will be ignored as lhs is master
        elementToUpdate.Add(lhs);

        // collapse the attribute names to a list to save multiple evaluations... 
        // also why we aren't putting this in as a sub-query
        List<XName> lhsAttributeNames = 
            lhs.Select(attribute => attribute.Name).ToList();
        // so add in any missing attributes
        elementToUpdate.Add(rhs.Where(attribute => 
            !lhsAttributeNames.Contains(attribute.Name)));
    }

    /// <summary>
    /// Enrich the passed in <see cref="XElement"/> with the contents of both 
    /// <see cref="XElement.Elements()"/> subtrees.
    /// Again where the RHS conflicts with the LHS, the LHS is deemed the master.
    /// Where the passed elements do not have element subtrees, but do have text 
    /// content that will be used. Again the LHS will dominate
    /// </summary>
    /// <remarks>Where the LHS has text content and no subtree, but the RHS has 
    /// a subtree; the LHS text content will be used and the RHS tree ignored. 
    /// This may be unexpected but is consistent with other .NET XML 
    /// operations</remarks>
    /// <param name="elementToUpdate">The element to take the merged element 
    /// collection</param>
    /// <param name="lhs">The element from which to extract the master 
    /// subtree</param>
    /// <param name="rhs">The element from which to extract the subtree to 
    /// enrich the merge</param>
    private static void MergeSubElements(XElement elementToUpdate, 
        XElement lhs, XElement rhs)
    {
        // see below for the special case where there are no children on the LHS
        if (lhs.HasElements)
        {
            // collapse the element names to a list to save multiple evaluations...
            // also why we aren't putting this in as a sub-query later
            List<XName> lhsElementNames = 
                lhs.Elements().Select(element => element.Name).ToList();

            // Add in the elements of the lhs and merge in any elements of the 
            // same name on the RHS
            elementToUpdate.Add(
                lhs.Elements().Select(
                    lhsElement => 
                        MergeElements(lhsElement, rhs.Element(lhsElement.Name))));

            // so add in any missing elements from the rhs
            elementToUpdate.Add(rhs.Elements().Where(element => 
                !lhsElementNames.Contains(element.Name)));
        }
        else
        {
            // special case for elements where they have no element children 
            // but still have content:
            // use the lhs text value if it is there
            if (!string.IsNullOrEmpty(lhs.Value))
            {
                elementToUpdate.Value = lhs.Value;
            }
            // if it isn't then see if we have any children on the right
            else if (rhs.HasElements)
            {
                // we do, so shove them in the result unaltered
                elementToUpdate.Add(rhs.Elements());
            }
            else
            {
                // nope, then use the text value (doesn't matter if it is empty 
                // as we have nothing better elsewhere)
                elementToUpdate.Value = rhs.Value;
            }
        }
    }
}

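Answer 1 (score: 4)

Here is a console application that produces the result listed in the question. It uses recursion to process each child element. The one thing it does not check for is child elements that appear in Elem2 but not in Elem1; hopefully this will still get you started toward a solution.

I'm not sure I'd call this the best possible solution, but it does work.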

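Imports System.Linq
Imports System.Xml.Linq

Module Module1

    ' Merges Elem2 into Elem1; Elem1 wins wherever both define the same
    ' attribute or child element.
    Function MergeElements(ByVal Elem1 As XElement, ByVal Elem2 As XElement) As XElement

        ' Nothing to merge in: keep Elem1 unchanged.
        If Elem2 Is Nothing Then
            Return Elem1
        End If

        Dim result = New XElement(Elem1.Name)

        ' Copy all attributes from Elem1 first, since they take precedence.
        For Each attr In Elem1.Attributes
            result.Add(attr)
        Next

        ' Materialize the names once so the query is not re-run per lookup.
        Dim Elem1AttributeNames = (From attr In Elem1.Attributes _
                                   Select attr.Name).ToList()

        ' Add only those Elem2 attributes that Elem1 does not define.
        For Each attr In Elem2.Attributes
            If Not Elem1AttributeNames.Contains(attr.Name) Then
                result.Add(attr)
            End If
        Next

        If Elem1.HasElements Then
            ' Recurse into each child of Elem1, pairing it with the first
            ' same-named child of Elem2 (Nothing if there is none).
            For Each elem In Elem1.Elements
                result.Add(MergeElements(elem, Elem2.Element(elem.Name)))
            Next
        Else
            ' Leaf element: the text of Elem1 wins.
            result.Value = Elem1.Value
        End If

        Return result
    End Function

    Sub Main()
        Dim Elem1 = <HockeyPlayer height="6.0" hand="left">
                        <Position>Center</Position>
                        <Idol>Gordie Howe</Idol>
                    </HockeyPlayer>

        Dim Elem2 = <HockeyPlayer height="5.9" startinglineup="yes">
                        <Idol confirmed="yes">Wayne Gretzky</Idol>
                    </HockeyPlayer>

        Console.WriteLine(MergeElements(Elem1, Elem2))
        Console.ReadLine()
    End Sub

End Module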

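Edit: I just noticed that the function was missing As XElement. I'm really surprised it worked without that! I use VB.NET every day, but it has some quirks I still don't fully understand.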
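
The gap mentioned above, child elements that appear only in Elem2, could be closed along the following lines. This is a minimal sketch, not a tested drop-in: the names MergeSketch and MergeWithExtras and the Number element in the sample data are made up for illustration, and it assumes Option Infer is on. Like the code above, it pairs each child with the first same-named child on the other side, so repeated sibling names are not handled.

```vbnet
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module MergeSketch

    ' Hypothetical variant of MergeElements: same precedence rules, but
    ' children that exist only in Elem2 are copied into the result as well.
    Function MergeWithExtras(ByVal Elem1 As XElement, ByVal Elem2 As XElement) As XElement
        If Elem2 Is Nothing Then
            Return Elem1
        End If

        ' Attributes: everything from Elem1, then whatever only Elem2 has.
        Dim result = New XElement(Elem1.Name, Elem1.Attributes())
        result.Add(Elem2.Attributes().Where(Function(a) Elem1.Attribute(a.Name) Is Nothing))

        If Elem1.HasElements Then
            ' Children of Elem1, each merged with its same-named counterpart.
            For Each child In Elem1.Elements()
                result.Add(MergeWithExtras(child, Elem2.Element(child.Name)))
            Next

            ' The extra step: children that exist only in Elem2.
            result.Add(Elem2.Elements().Where(Function(e) Elem1.Element(e.Name) Is Nothing))
        Else
            ' Leaf on the Elem1 side: its text wins and any Elem2 subtree is ignored.
            result.Value = Elem1.Value
        End If

        Return result
    End Function

    Sub Main()
        Dim player1 = <HockeyPlayer height="6.0">
                          <Idol>Gordie Howe</Idol>
                      </HockeyPlayer>

        ' Number is invented sample data that exists only on the second element.
        Dim player2 = <HockeyPlayer height="5.9">
                          <Idol confirmed="yes">Wayne Gretzky</Idol>
                          <Number>99</Number>
                      </HockeyPlayer>

        Console.WriteLine(MergeWithExtras(player1, player2))
    End Sub

End Module
```

With this input the result should keep height="6.0" and the Idol text from the first element, pick up confirmed="yes" from the second, and append the Number element after Idol. Elements that already have a parent are cloned by XElement.Add, so neither input tree is modified.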
