我正在编写VBA,以使用XML从股票网站中提取表格,但无法提取标记内的内部文本。我可以知道我的代码有什么问题吗?
我试图通过使用xxxx.innerText获取内部文本,其中xxxx是MSHTML.IHTMLElement。
Sub ProcessHTMLPage(HTMLPage As MSHTML.HTMLDocument)
Dim htmlTable As MSHTML.IHTMLElement
Dim htmlTables As MSHTML.IHTMLElementCollection
Dim htmlrow As MSHTML.IHTMLElement
Dim htmlcell As MSHTML.IHTMLElement
Dim RowNum As Integer
Dim ColNum As Integer
Set htmlTables = HTMLPage.getElementsByTagName("table")
For Each htmlTable In htmlTables
If htmlTable.className = "table_list" Then
RowNum = 2
For Each htmlrow In htmlTable.getElementsByTagName("tr")
ColNum = 1
For Each htmlcell In htmlrow.Children
Debug.Print htmlcell.innerText
ColNum = ColNum + 1
Next htmlcell
RowNum = RowNum + 1
Next htmlrow
End If
Next htmlTable
End Sub
Sub GetPrice()
Dim XMLPage As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim URL As String
URL = "https://www.hkex.com.hk/Market-Data/Securities-Prices/Equities/Equities-Quote?sym=2&sc_lang=zh-hk"
XMLPage.Open "GET", URL, False
XMLPage.send
HTMLDoc.body.innerHTML = XMLPage.responseText
ProcessHTMLPage HTMLDoc
End Sub
我希望输出应为:
即日
52周
最高价
HK $ 90.700
HK $ 97.400
...
答案 0 :(得分:1)
页面动态加载该内容,这就是为什么对初始URL的xhr请求没有给您预期的结果的原因。
但是,您可以在浏览器的网络选项卡中查找,并找到页面用于通过单独的xhr更新内容的端点。
在下面,我不确定令牌是否基于时间,但是您可以进行探索。您需要剥离外部jquery字符串,但是您可以使用json解析器解析内部json。我使用jsonconverter.bas。您将jsonconverter.bas代码下载到名为JsonConverter的标准模块中,然后转到VBE>工具>引用>添加对Microsoft脚本运行时的引用。如果您不想使用json解析器(解析器应该是您的首选),则可以使用split
函数来提取您想要的信息。
我将在工作表中设置表格格式,并根据需要设置列标题和行标题(可能如网页上所示),并将单元格设置为HKD货币-然后将以下变量中的值分配给其中的单元格那张桌子。您可以绑定一个按钮,以便随意刷新工作表中的表格。
布局(和结果):
代码:
Option Explicit
Public Sub GetInfo()
Dim s As String, json As Object
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www1.hkex.com.hk/hkexwidget/data/getequityquote?sym=2&token=evLtsLsBNAUVTPxtGqVeG6jZbQlrF5FojHmveNua5GgTcjPuBcUs7GTV0hIBAgAI&lang=chi&qid=1560281438643&callback=jQuery311003616462678192556_1560281436567&_=1560281436568", False
.send
s = Split(Split(.responseText, "(")(1), ")")(0)
End With
Set json = JsonConverter.ParseJson(s)("data")("quote")
Dim sameDayHigh As Double, fiftyTwoWeekHigh As Double, sameDayLow As Double, fiftyTwoWeekLow As Double, timeInfo As String
sameDayHigh = json("hi")
sameDayLow = json("lo")
fiftyTwoWeekHigh = json("hi52")
fiftyTwoWeekLow = json("lo52")
timeInfo = json("updatetime")
With ThisWorkbook.Worksheets("Sheet1")
.Cells(1, 1) = timeInfo
.Cells(3, 2).Resize(1, 2) = Array(sameDayHigh, fiftyTwoWeekHigh)
.Cells(4, 2).Resize(1, 2) = Array(sameDayLow, fiftyTwoWeekLow)
End With
End Sub
以上的压缩版本:
Option Explicit
Public Sub GetInfo()
Dim s As String, json As Object
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www1.hkex.com.hk/hkexwidget/data/getequityquote?sym=2&token=evLtsLsBNAUVTPxtGqVeG6jZbQlrF5FojHmveNua5GgTcjPuBcUs7GTV0hIBAgAI&lang=chi&qid=1560281438643&callback=jQuery311003616462678192556_1560281436567&_=1560281436568", False
.send
s = Split(Split(.responseText, "(")(1), ")")(0)
End With
Set json = JsonConverter.ParseJson(s)("data")("quote")
With ThisWorkbook.Worksheets("Sheet1")
.Cells(1, 1) = json("updatetime")
.Cells(3, 2).Resize(1, 2) = Array(json("hi"), json("hi52"))
.Cells(4, 2).Resize(1, 2) = Array(json("lo"), json("lo52"))
End With
End Sub
答案 1 :(得分:0)
这与您的方法稍有不同,但是可以得到预期的结果,而不会循环浏览各行:
不过,以后可以将打印出的字符串写到列表中,因此,如果用新行将其隔开并删除空字符串,则可以使用它:
Sub TestMe()
Dim appIE As Object
Set appIE = CreateObject("InternetExplorer.Application")
With appIE
.navigate "https://www.hkex.com.hk/Market-Data/Securities-Prices/Equities/Equities-Quote?sym=2&sc_lang=zh-hk"
.Visible = False
End With
Do While appIE.Busy
DoEvents
Loop
Dim allData As Object
Set allData = appIE.document.getElementsByClassName("table_list")
Debug.Print allData.item.outerText
appIE.Close
End Sub