Is there an object model for extracting information from XPS documents?

Asked: 2013-08-19 14:47:53

Tags: .net vb.net xps

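I have managed to retrieve the text from an XPS document and use it as needed (thanks to this answer), but I would like to know whether there is also an object model (rather than XmlReader) that automatically puts all of the elements into a collection of objects I can loop through in code.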

This is a contrived example, but something like this pseudocode:

    'open the xps document
    Dim xpsDoc As New XpsDocument(pathToTestXps, System.IO.FileAccess.Read)
    'load the fixed document sequences
    Dim fixedDocSeqReader As IXpsFixedDocumentSequenceReader = xpsDoc.FixedDocumentSequenceReader
    'the content will go here
    Dim sbContent As New System.Text.StringBuilder()
    'loop the fixed documents
    For Each docReader As IXpsFixedDocumentReader In fixedDocSeqReader.FixedDocuments

        'loop the fixed pages
        For Each fixedPageReader As IXpsFixedPageReader In docReader.FixedPages

            'BEGIN PSEUDO CODE (IXpsContentCollection, IXpsContentItem and Glyph are hypothetical types)
            Dim contents As IXpsContentCollection = fixedPageReader.Contents

            For Each contentItem As IXpsContentItem In contents
                Select Case contentItem.Type
                    Case IXpsContentItem.ContentType.Canvas 'Group
                        'loop content items, check their type, do stuff
                    Case IXpsContentItem.ContentType.Glyph 'Text
                        Dim str As String = DirectCast(contentItem, Glyph).UnicodeString
                        'do something with the string
                    Case IXpsContentItem.ContentType.Path 'Shape
                        'get the shape properties etc
                    Case Else
                        Throw New ApplicationException("XPS Content Type Not Expected:" & contentItem.Type.ToString)
                End Select
            Next
            'END PSEUDO CODE

        Next

    Next

If there is no such model, what is the simplest way to do this with XmlReader, and is there a good reference for the XML elements and attributes?

For context, this is what I currently have in place of the pseudocode above:

            'get the xml for the fixed pages
            Dim pageContentReader As System.Xml.XmlReader = fixedPageReader.XmlReader
            While pageContentReader.Read()

                'if it is a canvas, it's a new line or some other stuff
                If pageContentReader.Name = XmlElementCanvas Then

                    'other stuff won't have attributes
                    If pageContentReader.HasAttributes Then

                        'remove the last char as it will be an excess comma
                        If sbContent.Length > 0 Then
                            sbContent.Length = sbContent.Length - 1
                            sbContent.AppendLine()
                        End If
                    End If
                End If

                'if it is a glyph, it's the text we want
                If pageContentReader.Name = XmlElementGlyphs Then

                    'unsure, but it was in the example code, so we'll keep it
                    If pageContentReader.HasAttributes Then

                        'unicode string attribute has the text we want
                        If pageContentReader.GetAttribute(XmlAttribUnicodeString) IsNot Nothing Then

                            'add the text and a comma
                            sbContent.Append(pageContentReader.GetAttribute(XmlAttribUnicodeString))
                            sbContent.Append(",")
                        End If
                    End If
                End If

            End While
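
To illustrate the kind of loop the pseudocode describes, here is a minimal sketch that loads one page's markup into LINQ to XML (`XDocument`) and walks the resulting element tree. It is not a dedicated XPS object model, only a generic XML one. The names `XpsPageWalker`, `ExtractText` and `Walk` are made up for this example, the sample markup in `Main` is a hand-written stand-in for a real page, and matching on the local element name (ignoring the namespace) is a simplifying assumption.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml
Imports System.Xml.Linq

Module XpsPageWalker

    ' Recursively visits the children of an element, collecting the
    ' UnicodeString attribute of every Glyphs element it finds.
    Private Sub Walk(parent As XElement, texts As List(Of String))
        For Each item As XElement In parent.Elements()
            Select Case item.Name.LocalName
                Case "Canvas" ' group: recurse into its children
                    Walk(item, texts)
                Case "Glyphs" ' text run
                    Dim attr As XAttribute = item.Attribute("UnicodeString")
                    If attr IsNot Nothing Then texts.Add(attr.Value)
                Case "Path" ' shape
                    ' inspect item.Attributes() or child elements here as needed
            End Select
        Next
    End Sub

    ' Hypothetical helper: loads the page markup and returns its text runs in document order.
    Public Function ExtractText(pageXml As XmlReader) As List(Of String)
        Dim page As XDocument = XDocument.Load(pageXml)
        Dim texts As New List(Of String)()
        Walk(page.Root, texts)
        Return texts
    End Function

    Sub Main()
        ' Stand-in markup for a single page; inside the question's loop,
        ' fixedPageReader.XmlReader would be passed to ExtractText instead.
        Dim sample As String =
            "<FixedPage xmlns=""http://schemas.microsoft.com/xps/2005/06"">" &
            "<Canvas><Glyphs UnicodeString=""Hello"" /><Path Data=""M 0,0 L 10,10"" /></Canvas>" &
            "<Glyphs UnicodeString=""World"" />" &
            "</FixedPage>"

        Using reader As XmlReader = XmlReader.Create(New StringReader(sample))
            For Each text As String In ExtractText(reader)
                Console.WriteLine(text)
            Next
        End Using
    End Sub

End Module
```

With the strings collected in a list, `String.Join(",", texts)` could build the comma-separated output without trimming a trailing comma afterwards. Unlike the `XmlReader` loop above, this sketch does not start a new line for each Canvas, so that behaviour would need to be added in the `Canvas` branch.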

0 answers:

No answers