
Is there a way to make Puppeteer's waitUntil "networkidle" consider only XHR (ajax) requests?

  •  6
  • Nicholas DiPiazza  · Tech Community  · 8 years ago

    I am using Puppeteer to evaluate the HTML of JavaScript-based web pages in my test application.

    This is the code I use to make sure all the data has loaded:

    await page.setRequestInterception(true);
    page.on("request", (request) => {
      if (request.resourceType() === "image" || request.resourceType() === "font" || request.resourceType() === "media") {
        console.log("Request intercepted! ", request.url(), request.resourceType());
        request.abort();
      } else {
        request.continue();
      }
    });
    try {
      await page.goto(url, { waitUntil: ['networkidle0', 'load'], timeout: requestCounterMaxWaitMs });
    } catch (e) {
      // navigation failed or timed out; carry on with whatever has loaded
    }
    

    Is this the best way to wait for ajax requests to complete?

    It feels right, but I'm not sure whether I should use networkidle0, networkidle1, etc.?

    3 replies  |  up to 8 years ago
        1
  •  9
  •   Julien TASSIN    7 years ago

    You can use pending-xhr-puppeteer, a library that exposes a promise which resolves once all pending XHR requests have finished.

    Use it like this:

    const puppeteer = require('puppeteer');
    const { PendingXHR } = require('pending-xhr-puppeteer');
    
    (async () => {
      const browser = await puppeteer.launch({ headless: true });
      const page = await browser.newPage();
      // Start tracking XHR requests before navigating
      const pendingXHR = new PendingXHR(page);
      await page.goto(`http://page-with-xhr`);
      // Here, XHR requests may still be in flight
      await pendingXHR.waitForAllXhrFinished();
      // Here, all XHR requests have finished
      await browser.close();
    })();
    

    Disclaimer: I am the maintainer of pending-xhr-puppeteer.
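
    For readers curious how this kind of tracking can work, here is a minimal sketch of the general idea. It is not the library's actual implementation. It counts XHR/fetch requests as they start and removes them when they finish or fail. The class name `XhrTracker` and its methods are made up for illustration; the `request`, `requestfinished` and `requestfailed` page events and `request.resourceType()` are standard Puppeteer.

```javascript
// Hypothetical helper, not part of Puppeteer or pending-xhr-puppeteer.
class XhrTracker {
  constructor(page) {
    this.pending = new Set(); // requests currently in flight
    this.waiters = [];        // resolvers waiting for the set to drain

    page.on('request', request => {
      const type = request.resourceType();
      // XMLHttpRequest is reported as "xhr", fetch() as "fetch"
      if (type === 'xhr' || type === 'fetch') {
        this.pending.add(request);
      }
    });

    const settle = request => {
      this.pending.delete(request);
      if (this.pending.size === 0) {
        // wake everyone waiting for the in-flight set to become empty
        this.waiters.splice(0).forEach(resolve => resolve());
      }
    };
    page.on('requestfinished', settle);
    page.on('requestfailed', settle);
  }

  pendingCount() {
    return this.pending.size;
  }

  // Resolves immediately if nothing is in flight, otherwise when the
  // last tracked request finishes or fails.
  waitForAllFinished() {
    if (this.pending.size === 0) return Promise.resolve();
    return new Promise(resolve => this.waiters.push(resolve));
  }
}
```

    This sketch only sees requests that start after the tracker is created, so it should be constructed before `page.goto`. An XHR that fires after the wait has resolved is not covered.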

        2
  •  6
  •   Everettss    8 years ago

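    By their nature, XHRs can appear later in the life of the application. No networkidle0 will help you if the application sends an XHR after, say, 1 second and you want to wait for it. I think that if you want to do this "properly", you should know which requests you are waiting for and await them.

    Here is an example with XHRs that are sent later in the application, and which waits for all of them: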

    const puppeteer = require('puppeteer');
    
    const html = `
    <html>
      <body>
        <script>
          setTimeout(() => {
            fetch('https://swapi.co/api/people/1/');
          }, 1000);
    
          setTimeout(() => {
            fetch('https://www.metaweather.com/api/location/search/?query=san');
          }, 2000);
    
          setTimeout(() => {
            fetch('https://api.fda.gov/drug/event.json?limit=1');
          }, 3000);
        </script>
      </body>
    </html>`;
    
    // you can listen for just a subset of the requests;
    // in this example I'm waiting for all of them
    const requests = [
        'https://swapi.co/api/people/1/',
        'https://www.metaweather.com/api/location/search/?query=san',
        'https://api.fda.gov/drug/event.json?limit=1'
    ];
    
    const waitForRequests = (page, names) => {
      const requestsList = [...names];
      return new Promise(resolve =>
         page.on('request', request => {
           // fetch() calls are reported as "fetch", XMLHttpRequest as "xhr"
           const type = request.resourceType();
           if (type === "xhr" || type === "fetch") {
             // check if request is in observed list
             const index = requestsList.indexOf(request.url());
             if (index > -1) {
               requestsList.splice(index, 1);
             }
    
             // resolve once every observed request has been sent
             if (!requestsList.length) {
               resolve();
             }
           }
           // interception is enabled, so every request must be continued
           request.continue();
         })
      );
    };
    
    
    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.setRequestInterception(true);
    
      // register page.on('request') observables
      const observedRequests = waitForRequests(page, requests);
    
      // goto is deliberately not awaited: only the XHR (ajax) requests
      // matter here, so there is no need to wait for navigation to finish
      page.goto(`data:text/html,${html}`);
    
      console.log('before xhr');
      // await for all observed requests
      await observedRequests;
      console.log('after all xhr');
      await browser.close();
    })();
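
    One caveat with a promise like `observedRequests` is that it never settles if one of the expected requests is never sent. A small sketch of a guard follows, assuming a hypothetical helper named `withTimeout` (plain JavaScript, not a Puppeteer API):

```javascript
// Hypothetical helper: rejects if `promise` has not settled within `ms`.
const withTimeout = (promise, ms, label = 'operation') => {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // whichever settles first wins; the timer is cleared either way
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
};
```

    With this in place, something like `await withTimeout(observedRequests, 10000, 'waiting for XHRs')` would fail loudly instead of hanging; the 10 second limit is an arbitrary choice.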
    
        3
  •  1
  •   ggorlen Hoàng Huy Khánh    5 years ago

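    I agree with this answer that waiting for all network activity to stop ("all data is loaded") is a rather vague concept that depends entirely on the behavior of the site you are scraping.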

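    Options for detecting that a response has arrived include waiting for a fixed duration, waiting for a fixed duration after network traffic goes idle, waiting for a specific response (or set of responses), waiting for an element to appear on the page, waiting for a predicate to return true, and so on, all of which Puppeteer supports.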
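
    As one illustration of the "fixed duration after traffic goes idle" option, restricted to XHR/fetch traffic as the question asks, here is a rough sketch. `waitForXhrIdle` and its `idleMs`/`timeoutMs` parameters are hypothetical names; the helper relies only on the page's `request`, `requestfinished` and `requestfailed` events.

```javascript
// Hypothetical helper: resolves once no XHR/fetch request has been in
// flight for `idleMs` milliseconds, rejects after `timeoutMs`.
function waitForXhrIdle(page, { idleMs = 500, timeoutMs = 30000 } = {}) {
  return new Promise((resolve, reject) => {
    const inFlight = new Set();
    let idleTimer = null;

    const isAjax = request =>
      ['xhr', 'fetch'].includes(request.resourceType());

    // stop all timers and detach the listeners this helper added
    const cleanup = () => {
      clearTimeout(idleTimer);
      clearTimeout(deadline);
      page.off('request', onRequest);
      page.off('requestfinished', onDone);
      page.off('requestfailed', onDone);
    };

    // (re)start the quiet-period countdown
    const armIdleTimer = () => {
      clearTimeout(idleTimer);
      idleTimer = setTimeout(() => { cleanup(); resolve(); }, idleMs);
    };

    const onRequest = request => {
      if (!isAjax(request)) return;
      inFlight.add(request);
      clearTimeout(idleTimer); // traffic resumed, cancel the countdown
    };

    const onDone = request => {
      if (!inFlight.delete(request)) return; // not a tracked request
      if (inFlight.size === 0) armIdleTimer();
    };

    const deadline = setTimeout(() => {
      cleanup();
      reject(new Error(`XHR traffic not idle after ${timeoutMs} ms`));
    }, timeoutMs);

    page.on('request', onRequest);
    page.on('requestfinished', onDone);
    page.on('requestfailed', onDone);
    armIdleTimer(); // nothing is in flight yet
  });
}
```

    It could be called right after navigation, for example `await page.goto(url); await waitForXhrIdle(page, { idleMs: 1000 });`. Requests that started before the helper was attached are not tracked, and an XHR that fires after the quiet period has elapsed will still be missed, so `idleMs` has to be chosen to suit the site.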

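    With that in mind, the most typical scenario is that you are waiting for a particular response, or set of responses, from known (or partially known, via some pattern or prefix) resource URLs that deliver a payload you want to read and/or trigger a DOM change you need to detect. Puppeteer offers page.waitForResponse for doing just this.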
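
    To make the "pattern or prefix" case concrete, here is a small sketch of a predicate factory that can be passed to `page.waitForResponse`. The name `matchResponse`, the `onlyOk` option and the example URL are placeholders for illustration; the helper assumes only that the response object exposes `url()` and `ok()`, as Puppeteer's response objects do.

```javascript
// Hypothetical helper: builds a predicate for page.waitForResponse.
// `pattern` may be a string prefix, a RegExp (without the g flag),
// or a function that takes the URL and returns a boolean.
function matchResponse(pattern, { onlyOk = true } = {}) {
  const test =
    typeof pattern === 'string' ? url => url.startsWith(pattern)
    : pattern instanceof RegExp ? url => pattern.test(url)
    : pattern;
  // optionally ignore responses with a non-2xx status
  return response =>
    test(response.url()) && (!onlyOk || response.ok());
}

// Usage inside an async function with a Puppeteer `page`
// (the URL is a placeholder):
//   const response = await page.waitForResponse(
//     matchResponse('https://example.com/api/users/'),
//     { timeout: 5000 }
//   );
```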

    Here is an example, building on the existing answer (and showing how to retrieve data from the responses while we're at it):

    const puppeteer = require("puppeteer");
    
    const html = `
    <html>
      <body>
        <script>
          setTimeout(() => {
            fetch("http://jsonplaceholder.typicode.com/users/1");
          }, 1000);
          setTimeout(() => {
            fetch("http://jsonplaceholder.typicode.com/users/2");
          }, 2000);
          setTimeout(() => {
            fetch("http://jsonplaceholder.typicode.com/users/3");
          }, 3000);
          setTimeout(() => {
            // fetch something irrelevant to us
            fetch("http://jsonplaceholder.typicode.com/users/4");
          }, 0);
        </script>
      </body>
    </html>`;
    
    (async () => {
      const browser = await puppeteer.launch();
      const [page] = await browser.pages();
      await page.setContent(html);
      const expectedUrls = [
        "http://jsonplaceholder.typicode.com/users/1",
        "http://jsonplaceholder.typicode.com/users/2",
        "http://jsonplaceholder.typicode.com/users/3",
      ];
    
      try {
        const responses = await Promise.all(expectedUrls.map(url =>
          page.waitForResponse(
            response => response.url() === url, 
            {timeout: 5000}
          )
        ));
        const data = await Promise.all(
          responses.map(response => response.json())
        );
        console.log(data);
      }
      catch (err) {
        console.error(err);
      }
    
      await browser.close();
    })()