QTextDocument: Get the contents of a specific page

Date: 2018-03-14 16:24:24

Tags: c++ qt qt5 qtextdocument

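Given a QTextDocument with a specific width and height, is there a way to get the contents of a given page (plain text, plus the image URL if there is an image on the page) for a given page number?

Here is an example of what I would like to achieve: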

QString getTextForPage(int pageNumber); // this is the function I'd like to have
QString getURLForPage(int pageNumber); // this is the function I'd like to have

QString html = R"(
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <style>
    .summary {page-break-after: always}
  </style>
  <title>title</title>
</head>
<body>
  <p class="summary">This is a short summary which would fit into a page</p>
  <p><img src="www.example.com/test.png" height="100" width="200"></p>
  <p><img src="www.example.com/test2.png" height="50" width="10"></p>
  <p>This is a short text which should fit into a page</p>
</body>
</html>
)";

const auto width = 100;
const auto height = 200;
auto textDoc = new QTextDocument();
textDoc->setHtml(html);
textDoc->setPageSize(QSizeF {width, height});
textDoc->setDocumentMargin(0);

for (auto curPageNum = 1; curPageNum <= textDoc->pageCount(); ++curPageNum) {
    qDebug() << "current page: " << curPageNum;
    qDebug() << getTextForPage(curPageNum);
    qDebug() << getURLForPage(curPageNum);
}

This should print:

1
This is a short summary which would fit into a page
(empty string as there is no URL)
2
(empty string as there is no text on the page)
www.example.com/test.png
3
This is a short text which should fit into a page
www.example.com/test2.png

In general, the text in a p tag may span multiple pages, while an image is guaranteed to span at most one page, in case that helps.

1 answer:

Answer 0: (score: 0)

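It is not possible to get the contents of a specific page as "plain text + image URLs". Qt does not support this. In particular, when you set the content as HTML, Qt will not extract a sub-HTML document for you...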

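What you can get, by using QPrinter and asking it to print only a specific page, is an image (or a PDF). I can post code showing how to do that, if this is an acceptable solution for you.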