Extracting text from a <p> tag inside an iframe with Puppeteer

Asked: 2018-11-01 12:41:32

Tags: javascript puppeteer

I'm having some trouble extracting text from an iframe.

Given the markup:

<iframe id="iframe" src="https://myamazingapp/2637374848489595" 
style="width:100%;border:none" height="858px" css="1"></iframe>
<div id="document">
   <html lang="en">
      <head>...</head>
      <body onLoad="resize();">
         <div class="app">
             <div class="my-response">
                 <div class="my-response__iconcontainer">..
                 </div>
                 <div class="my-response__copy">
                     <p>Thank you for filling out my form</p>
                 </div>
             </div>
         </div>//....other members

My Puppeteer snippet is as follows:

  const frame = page.frames().find(frame => frame.name() === 'iframe');
  await frame.$('body > div > div > div > div.my-response__copy > p');
  const text = page.evaluate(el => el.innerHTML, await page.$('body > div > div > div > div.my-response__copy > p'));
  expect(text).to.equal('Thank you for filling out my form');

The stack trace starts with:

 { AssertionError: expected {} to equal 'Thank you for filling out my form.'
at changeOptions (/Users/firstname.lastname/Documents/projects/qa-tests/tests/widget/fillform.js:49:25)
at process._tickCallback (internal/process/next_tick.js:68:7)
message:
 'expected {} to equal \'Thank you for filling out my form\'',
 showDiff: true,
 actual: Promise { <pending> },
 expected: 'Thank you for filling out my form' }

Any ideas on how to extract the text so that I can assert on it?

Thanks

2 answers:

Answer 0: (score: 1)

page.evaluate() returns a Promise:

  

returns: <Promise<Serializable>> Promise which resolves to the return value of pageFunction

In fact, the error message itself contains the hint:

actual: Promise { <pending> },

So use await on the result somewhere, either when calling page.evaluate() or inside the assertion:

const text = await page.evaluate(el => el.innerHTML, await page.$('body > div > div > div > div.my-response__copy > p'));
expect(text).to.equal('Thank you for filling out my form');

const text = page.evaluate(el => el.innerHTML, await page.$('body > div > div > div > div.my-response__copy > p'));
expect(await text).to.equal('Thank you for filling out my form');

I would prefer the first one. Also, I have the impression that the await inside the page.evaluate call may not be necessary.
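One more thing worth checking: both variants above still call page.$, which is a shortcut for querying the main frame, so the <p> inside the iframe may not be found even once the Promise is awaited. Below is a minimal sketch of querying inside the frame instead. The helper name getFrameText and its parameters are hypothetical; page is assumed to be a Puppeteer Page. It relies on frame.name() (which returns the id attribute when the iframe has no name) and frame.$eval, and it sidesteps the question of the inner await.

```javascript
// Hypothetical helper, not part of Puppeteer: `page` is assumed to be a Puppeteer Page.
async function getFrameText(page, frameName, selector) {
  // frame.name() returns the iframe's name attribute, or its id when the name is empty.
  const frame = page.frames().find(f => f.name() === frameName);
  if (!frame) {
    throw new Error(`No frame named "${frameName}" was found`);
  }
  // frame.$eval runs the callback on the first match inside that frame.
  // It returns a Promise, so await it before handing the value to an assertion.
  const text = await frame.$eval(selector, el => el.textContent.trim());
  return text;
}
```

In the test from the question this could be used as: const text = await getFrameText(page, 'iframe', 'div.my-response__copy > p'); followed by expect(text).to.equal('Thank you for filling out my form');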

Answer 1: (score: 0)

You can grab the content of the iframe and use cheerio to traverse the elements and get the text / html or whatever else you need.

Example:

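const cheerio = require('cheerio');

// Locate the iframe by its name (or id) and read its serialized HTML
const frame = page.frames().find(frame => frame.name() === 'iframe');
const content = await frame.content();
// Parse the HTML with cheerio and read the text of the <p> element(s)
const $ = cheerio.load(content);
const p = $('p').text();
// the text of p
console.log(p);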
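One caveat with this approach: frame.content() returns whatever HTML the frame holds at that moment, so if the response message is rendered after a form submission, the <p> may not be there yet. A minimal sketch that waits for the element first is shown below. The helper name frameContentWhenReady and the 5000 ms default timeout are assumptions for illustration; page is assumed to be a Puppeteer Page, and frame.waitForSelector and frame.content are the Puppeteer calls used.

```javascript
// Hypothetical helper, not part of Puppeteer: `page` is assumed to be a Puppeteer Page.
async function frameContentWhenReady(page, frameName, selector, timeout = 5000) {
  const frame = page.frames().find(f => f.name() === frameName);
  if (!frame) {
    throw new Error(`No frame named "${frameName}" was found`);
  }
  // Wait until the element exists inside the frame; rejects if the timeout elapses.
  await frame.waitForSelector(selector, { timeout });
  // Only then serialize the frame's HTML for parsing elsewhere (e.g. with cheerio).
  return frame.content();
}
```

The returned string can then be passed to cheerio.load() exactly as in the example above.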