A simple metrics client that fetches metrics via HTTP requests.
Limitless utility through Python user scripts.
This role is currently in early development and is highly (possibly completely) unstable.
Use at your own risk. Or preferably, wait until it's "done".
- About
- Getting Started
- Deployment
- Usage
- Configuration Options
- Script Modules
- SSL
- Built Using
- TODO
- Contributing
- Authors
Generic Scraper is a basic, low-code web scraper driven by YAML/JSON config files.
Clone the repo
cd /opt/
git clone https://github.com/camratchford/generic_scraper.git
Set up your venv
cd ./generic_scraper
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install .
Create a YAML config file, replacing the values with your own
# Will scrape job postings from a lever.co job board
scraper_configs:
  level:
    container_el: div
    container_attr: class
    container_val: postings-wrapper
    list_item_el: a
    list_item_attr: "class"
    list_item_val:
      - posting-title
    href: True
    extract_items:
      - name: title
        tag: h5
        attr: data-qa
        val: posting-name
      - name: link
        tag: a
        attr: href
        val:
      - name: location
        tag: span
        attr: class
        val: sort-by-location
      - name: team
        tag: span
        attr: class
        val: sort-by-team
scraper_urls:
  - url: https://jobs.lever.co/imperfectfoods
    pagination: False
    config: level
  - url: https://jobs.lever.co/cfsenergy
    pagination: False
    config: level
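In the config above, each entry under scraper_urls names a config (here, level) that is defined under scraper_configs. A mistyped name is easy to miss, so a small pre-flight check can help. The following is only a sketch: the helper name find_unknown_configs is hypothetical, the JSON text is assumed to be a trimmed equivalent of the YAML example, and whether generic_scraper performs a similar check itself is not stated here.

```python
import json

# Trimmed JSON rendering of the YAML example (assumed to use the same keys).
CONFIG_JSON = """
{
  "scraper_configs": {
    "level": {
      "container_el": "div",
      "container_attr": "class",
      "container_val": "postings-wrapper"
    }
  },
  "scraper_urls": [
    {"url": "https://jobs.lever.co/imperfectfoods", "pagination": false, "config": "level"},
    {"url": "https://jobs.lever.co/cfsenergy", "pagination": false, "config": "level"}
  ]
}
"""


def find_unknown_configs(config: dict) -> list:
    """Return the URLs whose `config` value has no entry in `scraper_configs`.

    Hypothetical helper, not part of generic_scraper.
    """
    known = set(config.get("scraper_configs", {}))
    return [
        entry["url"]
        for entry in config.get("scraper_urls", [])
        if entry.get("config") not in known
    ]


config = json.loads(CONFIG_JSON)
unknown = find_unknown_configs(config)
print("URLs with unknown config:", unknown)
```

An empty list means every URL points at a defined config; any URL that appears in the list refers to a name missing from scraper_configs.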
Using the generic_scraper module
from generic_scraper.scraper import Scraper
from generic_scraper.extractor import Extractor
from generic_scraper.config import scraper_config


def main():
    # Load the config file; replace this path with the location of your own file
    scraper_config.config_from_file(r"C:\Users\cameron\PyCharm Projects\level_scraper\tests\test.yml")

    # Run the scraper against the configured URLs
    scraper = Scraper(scraper_config)
    scraper.scrape()

    # Extract the configured items and serialize the result
    extractor = Extractor(scraper_config)
    extractor.extract()
    data = extractor.serialize()
    print(data)


if __name__ == "__main__":
    main()
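The example above hard-codes a path on the author's machine. One way to avoid editing the script per machine is to take the config path from the command line. This is a minimal sketch, not part of generic_scraper: resolve_config_path and the default name config.yml are hypothetical, and the only generic_scraper call referenced is config_from_file, as used above.

```python
import sys
from pathlib import Path


def resolve_config_path(argv: list, default: str = "config.yml") -> Path:
    """Use the first CLI argument as the config path, else fall back to `default`.

    Hypothetical helper; the default file name is an assumption.
    """
    raw = argv[1] if len(argv) > 1 else default
    # Expand "~" and make the path absolute so errors show the full location
    return Path(raw).expanduser().resolve()


if __name__ == "__main__":
    config_path = resolve_config_path(sys.argv)
    print(config_path)
    # The resolved path could then be passed to the loader from the example:
    # scraper_config.config_from_file(str(config_path))
```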
There are no tests yet.
PEP-8
All items are subject to change at any moment. Don't use it.
As yet untested in production. Use at your own risk.
- @camratchford - Putting the pieces together
See also the list of contributors who participated in this project