Puppeteer Scraping Tutorial


Puppeteer is a Node.js library that lets you control a headless browser through scripting.
It drives a real (Chromium-based) browser, just like a normal user would.


This makes it useful for both automated testing and scraping.
Scraping is an automated way to extract data from a website.


Always make sure you have permission to scrape!

Example

To get started, see the example below, which scrapes some text from the TestingBot website.
It starts a new browser in the TestingBot browser grid (Microsoft Edge here, as set by browserName in the connection URL),
navigates to the TestingBot website and scrapes the text from a specific DOM element.


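const puppeteer = require('puppeteer')

// CommonJS modules do not support top-level await, so the script runs inside an async function.
async function main () {
  // Connect to a remote browser on the TestingBot grid.
  // Replace api_key and api_secret with your own credentials.
  const browser = await puppeteer.connect({
    browserWSEndpoint: 'wss://cloud.testingbot.com?key=api_key&secret=api_secret&browserName=edge&browserVersion=latest'
  })

  try {
    const page = await browser.newPage()
    await page.goto('https://testingbot.com')
    // This callback runs inside the page and returns the trimmed text of the selected element.
    const title = await page.evaluate(() => {
      return document.querySelector('body > div.main > div.hero.home > div > div > p').textContent.trim()
    })
    console.log(title)
  } finally {
    // Always close the remote session, even if scraping fails.
    await browser.close()
  }
}

main().catch(console.error)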
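
The WebSocket endpoint in the example above packs the credentials and browser options into query parameters. Writing that string by hand is error-prone once a key or secret contains special characters, so here is a minimal sketch of building it with the standard URLSearchParams class. The helper name buildEndpoint is hypothetical and not part of Puppeteer or TestingBot; the parameter names (key, secret, browserName, browserVersion) are taken from the example URL, and the 'latest' default is an assumption based on that same URL.

```javascript
// Hypothetical helper (not part of Puppeteer or TestingBot):
// builds the WebSocket endpoint from its parts and URL-encodes each value.
function buildEndpoint ({ key, secret, browserName, browserVersion = 'latest' }) {
  // URLSearchParams keeps insertion order and escapes special characters.
  const params = new URLSearchParams({ key, secret, browserName, browserVersion })
  return `wss://cloud.testingbot.com?${params.toString()}`
}

const endpoint = buildEndpoint({
  key: 'api_key',
  secret: 'api_secret',
  browserName: 'edge'
})
console.log(endpoint)
```

The resulting string can be passed as browserWSEndpoint to puppeteer.connect, which also makes it easy to read the credentials from environment variables instead of hard-coding them in the script.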