WebBrowser,处理pdf加载完成

时间:2013-09-10 21:23:00

标签: vb.net webbrowser-control

我想知道是否有人知道一个简单的方法让.pdf文件在加载时触发readystate。我正在构建一个程序来打开url并截取屏幕截图,然后将它们放入excel中。

Web浏览器将正确加载html文档,但在加载While Not pageready文件时卡在.pdf中。浏览器控件正确呈现.pdf

Private Sub btngo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btngo.Click
    Dim file As String
    Dim Obj As New Object
    Dim result As String
    Dim sheet As String = "sheet1"
    Dim xlApp As New Excel.Application

    If lblpath.Text <> "" Then
        file = lblpath.Text
        Dim xlWorkBook = xlApp.Workbooks.Open(file)
        Dim xlWorkSheet = xlWorkBook.Worksheets(sheet)
        Dim range = xlWorkSheet.UsedRange

        ProgressBar1.Value = 0

        For rCnt = 4 To range.Rows.Count
            'url cell
            Obj = CType(range.Cells(rCnt, 2), Excel.Range)
            ' Obj.value now contains the value in the cell.. 
            Try
                ' Creates an HttpWebRequest with the specified URL. 
                Dim myHttpWebRequest As HttpWebRequest = CType(WebRequest.Create(Obj.value), HttpWebRequest)
                ' Sends the request and waits for a response. 
                Dim myHttpWebResponse As HttpWebResponse = CType(myHttpWebRequest.GetResponse(), HttpWebResponse)
                If myHttpWebResponse.StatusCode = HttpStatusCode.OK Then
                    result = myHttpWebResponse.StatusCode
                    WebBrowser1.ScrollBarsEnabled = False
                    WebBrowser1.Navigate(myHttpWebRequest.RequestUri)

                    WaitForPageLoad()

                    CaptureWebBrowser(WebBrowser1)
                End If
                ' Release the resources of the response.
                myHttpWebResponse.Close()

            Catch ex As WebException
                result = (ex.Message)
            Catch ex As Exception
                result = (ex.Message)
            End Try


            RichTextBox1.AppendText(result & "    " & Obj.value & vbNewLine)

            If radpre.Checked = True Then
                range.Cells(rCnt, 3).value = result
            ElseIf radcob.Checked = True Then
                range.Cells(rCnt, 4).value = result
            ElseIf radpost.Checked = True Then
                range.Cells(rCnt, 5).value = result

            End If


            ProgressBar1.Value = rCnt / range.Rows.Count * 100
        Next

        With xlApp
            .DisplayAlerts = False
            xlWorkBook.SaveAs(lblpath.Text.ToString)
            .DisplayAlerts = True
        End With

        xlWorkBook.Close()
        xlApp.Quit()

        'reclaim memory
        Marshal.ReleaseComObject(xlApp)
        xlApp = Nothing
    End If
End Sub

Private Function CaptureWebBrowser(ByVal wb As WebBrowser) As Image
    Try
        Dim hBitmap As Bitmap = New Bitmap(wb.Width, wb.Height)
        wb.DrawToBitmap(hBitmap, wb.Bounds)
        Dim img As Image = hBitmap
        Return img
    Catch ex As Exception
        MessageBox.Show(ex.Message)
    End Try
    Return Nothing
End Function


Private Sub WaitForPageLoad()
    AddHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
    While Not pageready
        Application.DoEvents()
    End While
    pageready = False
End Sub

Private Sub PageWaiter(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
    If WebBrowser1.ReadyState = WebBrowserReadyState.Complete Then
        pageready = True
        RemoveHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
    End If
End Sub

更新至已解决的


我对反馈非常满意。我非常喜欢提供的答案Noseratio。我不知道使用代码模式不是最佳实践。打开.pdf或任何其他文档时,基于网络的readyState永远不会从0更改。看到这个程序对我来说只是一种不工作的方式,我只对捕获.html.htm感到满意。

我的要求是

  1. 打开Excel文档
  2. 解析位于Excel文档中的链接
  3. 确定响应代码
  4. 编写响应代码,如果可能,请将截图设为excel
  5. 程序解析和检索反馈的速度远远快于我能够手动执行的速度。 .html.htm的屏幕截图为excel文件的非技术查看者提供了从生产到COB以及返回生产环境的成功迁移的证据。

    Noseratio所述的此代码不遵循最佳做法,也不符合高质量要求。这是一个快速而肮脏的实现。

    Option Infer On
    Imports Microsoft.Office.Interop
    Imports System.Net
    Imports System.Runtime.InteropServices
    
    Public Class Form1
    
    
    Public Property pageready As Boolean
    
    Private Sub OpenToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles OpenToolStripMenuItem.Click
        OpenFileDialog1.ShowDialog()
    End Sub
    
    Private Sub OpenFileDialog1_FileOk(ByVal sender As System.Object, ByVal e As System.ComponentModel.CancelEventArgs) Handles OpenFileDialog1.FileOk
        lblpath.Text = OpenFileDialog1.FileName.ToString
    End Sub
    
    Private Sub btngo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btngo.Click
        Dim file As String
        Dim Obj As New Object
        Dim result As String
        Dim sheet As String = "sheet1"
        Dim xlApp As New Excel.Application
        Dim img As Bitmap
        Dim path As String = "C:\Documents and Settings\user\My Documents\Visual Studio 2010\Projects\COB-HTML-Tool\COB-HTML-Tool\bin\Debug\tmp.bmp"
        If lblpath.Text <> "" Then
            file = lblpath.Text
            Dim xlWorkBook = xlApp.Workbooks.Open(file)
            Dim xlWorkSheet = xlWorkBook.Worksheets(sheet)
            Dim range = xlWorkSheet.UsedRange
    
            ProgressBar1.Value = 0
    
            For rCnt = 4 To range.Rows.Count
                'url cell
                Obj = CType(range.Cells(rCnt, 2), Excel.Range)
                ' Obj.value now contains the value in the cell.. 
                Try
                    ' Creates an HttpWebRequest with the specified URL. 
                    Dim myHttpWebRequest As HttpWebRequest = CType(WebRequest.Create(Obj.value), HttpWebRequest)
                    ' Sends the request and waits for a response. 
                    Dim myHttpWebResponse As HttpWebResponse = CType(myHttpWebRequest.GetResponse(), HttpWebResponse)
                    If myHttpWebResponse.StatusCode = HttpStatusCode.OK Then
                        result = myHttpWebResponse.StatusCode
    
    
                        Dim len As Integer = myHttpWebRequest.RequestUri.ToString.Length - 4
                        If myHttpWebRequest.RequestUri.ToString.Substring(len) = ".htm" Or
                            myHttpWebRequest.RequestUri.ToString.Substring(len - 1) = ".html" Or
                            myHttpWebRequest.RequestUri.ToString.Substring(len) = ".asp" Then
                            WebBrowser1.Navigate(myHttpWebRequest.RequestUri)
                            WaitForPageLoad()
    
                            img = CaptureWebBrowser(WebBrowser1)
                            img.Save(path)
                        End If
    
                    End If
        ' Release the resources of the response.
        myHttpWebResponse.Close()
    
                Catch ex As WebException
            result = (ex.Message)
        Catch ex As Exception
            result = (ex.Message)
        End Try
    
    
                RichTextBox1.AppendText(result & "    " & Obj.value & vbNewLine)
    
                If radpre.Checked = True Then
                    range.Cells(rCnt, 3).value = result
    
                    If img Is Nothing Then
                    Else
                        If Dir(path) <> "" Then
                            range.Cells(rCnt, 4).Select()
                            Dim opicture As Object
                            opicture = xlApp.ActiveSheet.Pictures.Insert(path)
                            opicture.ShapeRange.LockAspectRatio = True
                            opicture.ShapeRange.width = 170
                            opicture.ShapeRange.height = 170
                            My.Computer.FileSystem.DeleteFile(path)
    
                        End If
                    End If
                ElseIf radcob.Checked = True Then
                    range.Cells(rCnt, 5).value = result
                    If img Is Nothing Then
                    Else
                        If Dir(path) <> "" Then
                            range.Cells(rCnt, 6).Select()
                            Dim opicture As Object
                            opicture = xlApp.ActiveSheet.Pictures.Insert(path)
                            opicture.ShapeRange.LockAspectRatio = True
                            opicture.ShapeRange.width = 170
                            opicture.ShapeRange.height = 170
                            My.Computer.FileSystem.DeleteFile(path)
                        End If
                    End If
                ElseIf radpost.Checked = True Then
                    range.Cells(rCnt, 7).value = result
                    If img Is Nothing Then
                    Else
                        If Dir(path) <> "" Then
                            range.Cells(rCnt, 8).Select()
                            Dim opicture As Object
                            opicture = xlApp.ActiveSheet.Pictures.Insert(path)
                            opicture.ShapeRange.LockAspectRatio = True
                            opicture.ShapeRange.width = 170
                            opicture.ShapeRange.height = 170
                            My.Computer.FileSystem.DeleteFile(path)
                        End If
                    End If
                End If
    
    
                ProgressBar1.Value = rCnt / range.Rows.Count * 100
            Next
    
            With xlApp
                .DisplayAlerts = False
                xlWorkBook.SaveAs(lblpath.Text.ToString)
                .DisplayAlerts = True
            End With
    
            xlWorkBook.Close()
            xlApp.Quit()
    
            'reclaim memory
            Marshal.ReleaseComObject(xlApp)
            xlApp = Nothing
        End If
    End Sub
    Private Function CaptureWebBrowser(ByVal wb As WebBrowser) As Image
    
        Try
            wb.ScrollBarsEnabled = False
            Dim hBitmap As Bitmap = New Bitmap(wb.Width, wb.Height)
            wb.DrawToBitmap(hBitmap, wb.Bounds)
            Dim img As Image = hBitmap
            Return img
        Catch ex As Exception
            MessageBox.Show(ex.Message)
        End Try
        Return Nothing
    End Function
    
    
    Private Sub WaitForPageLoad()
        AddHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
        While Not pageready
            Application.DoEvents()
            System.Threading.Thread.Sleep(200)
        End While
        pageready = False
    End Sub
    
    Private Sub PageWaiter(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs)
        If WebBrowser1.ReadyState = WebBrowserReadyState.Complete Then
            pageready = True
            RemoveHandler WebBrowser1.DocumentCompleted, New WebBrowserDocumentCompletedEventHandler(AddressOf PageWaiter)
        End If
    End Sub
    
    
    End Class
    

1 个答案:

答案 0 :(得分:0)

不幸的是,您将无法使用webBrowser.DrawToBitmap来获取PDF视图的快照。在撰写本文时, Adob​​e Acrobat Reader ActiveX控件不支持在自定义设备上下文中呈现,因此此方法不起作用,以及发送WM_PRINT或调用{ {1}},直接在通过WebBrowser上的Reader ActiveX对象上(我试过了,I'm not alone)。正确的解决方案是使用第三方PDF呈现组件。

在旁注中,您应该避免使用这样的代码模式:

IViewObject::Draw

这是一个busy waiting紧密的循环,徒劳地消耗CPU周期。至少,在循环中添加一些While Not pageready Application.DoEvents() End While ,但总体上你应该避免使用using Application.DoEvents