PrantaDas / puppetter-web-scrapping

This is a Puppeteer script written in TypeScript for web scraping purposes. The script automates browser actions to interact with a website, solve reCAPTCHA challenges, and download a PDF file. It uses additional Puppeteer plugins for stealth and reCAPTCHA solving.

Puppeteer Web Scraping Script

This is a Puppeteer script written in TypeScript for web scraping purposes. The script automates browser actions to interact with a website, solve reCAPTCHA challenges, and download a PDF file. It uses additional Puppeteer plugins for stealth and reCAPTCHA solving.

Prerequisites

Before running the script, ensure you have Git, Node.js, and pnpm installed, and a 2Captcha token available to use as CAPTCHA_TOKEN (see Configuration).

Installation

  1. Clone the repository:

    git clone https://github.com/PrantaDas/puppetter-web-scrapping.git

  2. Navigate to the project directory:

    cd puppetter-web-scrapping

  3. Install dependencies:

    pnpm install

Configuration

  1. Create a .env file in the root of the project.

  2. Add the following environment variables to the .env file:

    # Target URL
    URL=https://www.gob.mx/curp
    # Sample CURP or other identifier to look up
    IDENTIFIER=your_curp_or_identifier
    # Your 2Captcha token
    CAPTCHA_TOKEN=your_captcha_token
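
The three variables above could be read and checked at startup so that a missing value fails fast instead of surfacing midway through a browser session. The following is a minimal sketch, not the repository's actual code: `ScraperConfig` and `loadConfig` are hypothetical names, and it assumes the .env file has already been loaded into the environment (for example by a loader such as dotenv).

```typescript
// Hypothetical helper: ScraperConfig and loadConfig are illustrative names,
// not taken from the repository.
interface ScraperConfig {
  url: string;
  identifier: string;
  captchaToken: string;
}

// Accepts any env-like object so it can be called with process.env
// or with a plain object in tests.
function loadConfig(env: Record<string, string | undefined>): ScraperConfig {
  const read = (name: string): string => {
    // Trim so that "URL= https://..." (space after "=") still works.
    const value = env[name]?.trim();
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  const url = read("URL");
  // new URL() throws a TypeError on malformed input, surfacing typos early.
  new URL(url);

  return {
    url,
    identifier: read("IDENTIFIER"),
    captchaToken: read("CAPTCHA_TOKEN"),
  };
}
```

In a Node.js entry point this would typically be called once as `loadConfig(process.env)` before the browser is launched.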

Usage

Run the script using the following command:

    pnpm dev
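
The README does not list the individual steps a run performs, so the following is only a rough sketch of the flow described at the top (open the site, enter the identifier, solve the reCAPTCHA, download the PDF). Everything named here is an assumption: `PageLike`, `Selectors`, and `runScrape` are hypothetical, the selectors must be supplied by the caller, and the step order may differ from the real script. The interface mirrors methods that a Puppeteer page is expected to offer, with `solveRecaptchas` assumed to come from a reCAPTCHA plugin such as puppeteer-extra-plugin-recaptcha.

```typescript
// Minimal structural view of the page methods this sketch relies on.
// Assumption: the real page object (Puppeteer plus a reCAPTCHA plugin)
// provides methods compatible with these shapes.
interface PageLike {
  goto(url: string): Promise<unknown>;
  type(selector: string, text: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForSelector(selector: string): Promise<unknown>;
  solveRecaptchas(): Promise<{ solved: unknown[] }>;
}

// Hypothetical selectors: the real ones depend on the target page's markup.
interface Selectors {
  identifierInput: string;
  submitButton: string;
  downloadButton: string;
}

// Runs the assumed flow and returns how many captchas were reported solved.
async function runScrape(
  page: PageLike,
  url: string,
  identifier: string,
  selectors: Selectors
): Promise<number> {
  await page.goto(url);
  await page.type(selectors.identifierInput, identifier);

  // Solve any reCAPTCHA found on the page before submitting the form.
  const { solved } = await page.solveRecaptchas();

  await page.click(selectors.submitButton);

  // Wait for the result view before triggering the PDF download.
  await page.waitForSelector(selectors.downloadButton);
  await page.click(selectors.downloadButton);

  return solved.length;
}
```

Writing the flow against a small interface rather than a concrete browser page keeps the step order easy to exercise with a stub, without launching a browser or spending 2Captcha credit.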

License: MIT License

Languages

Language: TypeScript 100.0%