luc4t / optimusprime

A web crawler/transformer (data aggregator) and an importer that handles inserting this data into various systems.

Optimus Prime

A web crawler that lets a user easily scrape a site's content, transform the data into dynamic objects, and then import it into a system via an API endpoint.


I wanted to demonstrate how easy it is to build a dynamic crawler tool using modern libraries such as Crawlee and Playwright. Crawlee handles all of the complex crawling logic, so you only need to worry about the aggregator, which defines the data the user wants to retrieve and store.
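As a rough illustration of that split, here is a minimal sketch of a Crawlee `PlaywrightCrawler` paired with a small aggregator function. It is not the code from this repo: `toRecord`, `crawl`, the `h1, h2` selector and the request cap are all hypothetical choices made for the example, and it assumes `crawlee` and `playwright` are installed.

```javascript
// Hypothetical aggregator step: shape raw scraped values into a plain object.
function toRecord({ url, title, headings }) {
  return {
    url,
    title: (title || '').trim(),
    // Drop empty headings and surrounding whitespace.
    headings: headings.map((h) => h.trim()).filter((h) => h.length > 0),
  };
}

// Hypothetical crawl entry point. Crawlee owns the queue, retries and browser
// lifecycle; the handler only extracts values and hands them to toRecord.
async function crawl(startUrls) {
  // Loaded lazily so toRecord can be reused without Crawlee installed.
  const { PlaywrightCrawler, Dataset } = await import('crawlee');

  const crawler = new PlaywrightCrawler({
    maxRequestsPerCrawl: 20, // arbitrary cap for the example
    async requestHandler({ request, page, enqueueLinks }) {
      const title = await page.title();
      const headings = await page.$$eval('h1, h2', (els) =>
        els.map((el) => el.textContent || ''),
      );
      // Store the transformed object in Crawlee's default dataset.
      await Dataset.pushData(
        toRecord({ url: request.loadedUrl || request.url, title, headings }),
      );
      // Queue links found on the current page for later requests.
      await enqueueLinks();
    },
  });

  await crawler.run(startUrls);
}
```

Calling `crawl(['https://example.com'])` would start from that URL. Keeping `toRecord` separate from the handler means the shape of the aggregated objects can change without touching any crawling logic.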

You can also modify the importer to handle different auth cases in case the user doesn't authenticate through headers. It is a contextual representation of what you'd need to do to post all of the data objects you've created in your aggregator to whatever endpoint you want.
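To make the auth point concrete, here is a minimal sketch of an importer that supports either header-based auth or a key passed in the query string. The names `buildRequest` and `importRecords`, the `auth` object shape and the endpoint are assumptions for illustration, not the repo's actual importer. It relies on the global `fetch` available in Node 18 and later.

```javascript
// Hypothetical helper: build the URL and fetch options for one record.
// auth is either { type: 'header', name, value } or { type: 'query', name, value }.
function buildRequest(endpoint, record, auth) {
  const url = new URL(endpoint);
  const headers = { 'Content-Type': 'application/json' };

  if (auth && auth.type === 'header') {
    headers[auth.name] = auth.value; // e.g. Authorization: Bearer <token>
  } else if (auth && auth.type === 'query') {
    url.searchParams.set(auth.name, auth.value); // e.g. ?api_key=<key>
  }

  return {
    url: url.toString(),
    options: { method: 'POST', headers, body: JSON.stringify(record) },
  };
}

// Hypothetical importer: POST each aggregated object to the endpoint in turn.
// fetchImpl is injectable so the function can be exercised without a network.
async function importRecords(records, endpoint, auth, fetchImpl = fetch) {
  const failures = [];
  for (const record of records) {
    const { url, options } = buildRequest(endpoint, record, auth);
    const response = await fetchImpl(url, options);
    if (!response.ok) {
      failures.push({ record, status: response.status });
    }
  }
  return failures; // an empty array means every record was accepted
}
```

Supporting another auth case (for example a session cookie obtained from a login request) would mean adding one more branch to `buildRequest`, while `importRecords` stays unchanged.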

Without a specific use case to refine this script further, I will park it at the current commit to act as a showcase.

Cheers, Dan


Languages

Language:JavaScript 100.0%