Puppeteer Web Scraping Script

This Puppeteer script, written in TypeScript, automates browser actions to interact with a website, solve reCAPTCHA challenges, and download a PDF file. It relies on additional Puppeteer plugins for stealth and reCAPTCHA solving.

Prerequisites

Before running the script, ensure you have the following installed and configured:

Node.js and npm: Download and install Node.js
pnpm: The commands below use pnpm, which can be installed with npm (npm install -g pnpm)
Git: Download and install Git

Installation

Clone the repository:

git clone https://github.com/PrantaDas/puppetter-web-scrapping.git

Navigate to the project directory:
```
cd puppetter-web-scrapping
```
Install dependencies:
```
pnpm install
```

Configuration

Create a .env file in the root of the project.

Add the following environment variables to the .env file:

URL=https://www.gob.mx/curp  # Replace with the target URL
IDENTIFIER=your_curp_or_identifier  # Replace with the sample CURP or identifier
CAPTCHA_TOKEN=your_captcha_token  # Replace with your 2Captcha token
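
A missing or misspelled variable is easiest to catch before the browser launches. The following is a minimal sketch of such a check, not the repository's actual code: the `ScraperConfig` interface and the `loadConfig` function are hypothetical names introduced here for illustration, and only the three variable names come from the list above.

```typescript
// Hypothetical shape for the three settings listed in the .env file.
interface ScraperConfig {
  url: string;
  identifier: string;
  captchaToken: string;
}

// Reads the settings from an environment-like object and fails fast
// when a value is missing, empty, or (for URL) not an http(s) address.
function loadConfig(env: Record<string, string | undefined>): ScraperConfig {
  const read = (name: string): string => {
    const value = (env[name] ?? "").trim();
    if (value === "") {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  const url = read("URL");
  if (!/^https?:\/\//.test(url)) {
    throw new Error(`URL must start with http:// or https://, got: ${url}`);
  }

  return {
    url,
    identifier: read("IDENTIFIER"),
    captchaToken: read("CAPTCHA_TOKEN"),
  };
}
```

In a script like this one, the function would typically be called as `loadConfig(process.env)` once the .env file has been loaded into the process environment. How the repository loads the file is not stated here, so that step is left out of the sketch.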

Usage

Run the script using the following command:

pnpm dev

MIT License