ulixee / platform

Home of the Ulixee Open Data Platform

Home Page: https://ulixee.org/

Ulixee

Ulixee is a scraping engine with a built-in deployment unit that enables out-of-the-box querying across a horizontal deployment.

This repository is the development home for several of the tools that make it easy to build and manage scraper scripts, including Ulixee Desktop, Cloud and Datastore.

Projects

  • Hero /hero. The Automated Browser Engine built for scraping. (repository home - https://github.com/ulixee/hero).
  • Datastore /datastore. Packaged "database" containing API access to crawler functions and extractor functions.
  • Cloud /cloud. Run Ulixee tooling on a remote machine.
  • Stream /stream. Query, transform and compose Datastores running on any machine.
  • Desktop /desktop. Supercharge scraper script development using a Hero Replay toolset, remote Datastore viewer and Error troubleshooter.

Tooling

Try out Ulixee Desktop! The Alpha release is available for download under Assets.

Docker

We publish a Docker image of the latest Ulixee Cloud to:

  • GitHub Container Registry: docker pull ghcr.io/ulixee/ulixee-cloud && docker tag ghcr.io/ulixee/ulixee-cloud ulixee/ulixee-cloud
  • DockerHub: docker pull ulixee/ulixee-cloud

To use the image, we provide a run.sh script that runs the container as a non-root user on the port of your choice. All environment configuration options are listed here.
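As a rough illustration of the pull, tag and run flow, here is a minimal sketch. It is not the project's run.sh. The port 1818 and the idea that the server inside the container listens on that same port are assumptions, and the helper names (run, pull_and_tag, start_cloud) are hypothetical. Unlike run.sh, this sketch does not set up a non-root user.

```shell
#!/usr/bin/env sh
# Sketch only: pull Ulixee Cloud from GHCR, re-tag it, and start a container.

GHCR_IMAGE="ghcr.io/ulixee/ulixee-cloud"
LOCAL_IMAGE="ulixee/ulixee-cloud"
# ASSUMPTION: 1818 is used as both host and container port; check the
# project's run.sh and configuration docs for the real values.
PORT=1818

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Pull from GitHub Container Registry and tag with the DockerHub-style name.
pull_and_tag() {
  run docker pull "$GHCR_IMAGE" &&
    run docker tag "$GHCR_IMAGE" "$LOCAL_IMAGE"
}

# Start the image; --rm removes the container on exit,
# -p publishes the host port to the (assumed) container port.
start_cloud() {
  run docker run --rm -p "$PORT:$PORT" "$LOCAL_IMAGE"
}
```

Running pull_and_tag followed by start_cloud executes the commands. Setting DRY_RUN=1 first only prints them, which is handy for checking what would run.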

Developer Environment

This project serves as a Monorepo for developing the Ulixee Datastore, Desktop, Hero and Cloud. To install this project, you'll need to:

  1. Clone with --recursive so that submodules are initialized.
  2. Run yarn build:all from the main repository.
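The two steps above can be sketched as a small script. The clone URL is an assumption inferred from the ulixee / platform repository name, and the target directory and helper names (run, run_in, setup_ulixee) are hypothetical.

```shell
#!/usr/bin/env sh
# Sketch only: clone the monorepo with submodules, then build everything.

# ASSUMPTION: clone URL inferred from the "ulixee / platform" repository name.
REPO_URL="https://github.com/ulixee/platform.git"
TARGET_DIR="platform"   # hypothetical local directory name

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Same as run, but executes the command inside the given directory.
run_in() {
  dir=$1
  shift
  echo "+ (cd $dir && $*)"
  [ "${DRY_RUN:-0}" = "1" ] || (cd "$dir" && "$@")
}

setup_ulixee() {
  # Step 1: --recursive also initializes the submodules.
  run git clone --recursive "$REPO_URL" "$TARGET_DIR" || return 1
  # Step 2: build all projects from the main repository root.
  run_in "$TARGET_DIR" yarn build:all
}
```

Calling setup_ulixee runs both steps, and setting DRY_RUN=1 first prints them without executing anything. If the repository was already cloned without --recursive, running git submodule update --init --recursive inside it initializes the submodules afterwards.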

Learn more about Ulixee at ulixee.org.

Contributing

See How to Contribute for ways to get started.

This project has a Code of Conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

We'd love your help in making Ulixee a better set of tools. Please don't hesitate to send a pull request.

License

MIT

Languages

  • TypeScript 70.2%
  • Vue 23.7%
  • Nearley 2.1%
  • JavaScript 1.9%
  • SCSS 1.2%
  • HTML 0.6%
  • Shell 0.2%
  • Dockerfile 0.1%
  • CSS 0.1%