dspray95 / open-recipe

Gathering an Open Recipe Dataset

Open Recipe

Generating an open recipe dataset with Python, Scrapy, and XPath.

Usage

For default usage, run __main__.py. This creates a JSON file containing recipe details in the directory /open-recipe/data/output.
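Once a run has finished, the output can be inspected from Python. The following is a minimal sketch, not part of the project: the helper name load_latest_output is hypothetical, the relative path "data/output" assumes the script is run from the repository root, and nothing is assumed about the name of the JSON file or the fields inside it.

```python
import json
from pathlib import Path
from typing import Any, Optional


def load_latest_output(output_dir: str) -> Optional[Any]:
    """Load the most recently modified .json file in output_dir.

    Hypothetical helper: returns None if the directory is missing
    or contains no .json files.
    """
    directory = Path(output_dir)
    if not directory.is_dir():
        return None
    # Sort by modification time so the newest file comes last.
    candidates = sorted(directory.glob("*.json"), key=lambda p: p.stat().st_mtime)
    if not candidates:
        return None
    with candidates[-1].open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Assumed location relative to the repository root.
    data = load_latest_output("data/output")
    if data is None:
        print("No output found; run __main__.py first.")
    else:
        print(f"Loaded output of type {type(data).__name__}")
```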

For more advanced usage, create a Controller object. The controller takes two arguments: verbose and sample. If verbose is true, more detailed Scrapy output is printed to the console. If sample is greater than 0, the crawler only runs for n = sample URLs. This is mostly used for testing.
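The effect of the two arguments can be illustrated with a small standalone sketch. This is not the project's Controller code: the function names select_urls and build_log_settings are hypothetical, the example.com URLs are placeholders, and the sketch assumes that sample keeps the first n URLs and that verbose corresponds to Scrapy's LOG_LEVEL setting. The README does not confirm either detail.

```python
from typing import Dict, List


def select_urls(urls: List[str], sample: int = 0) -> List[str]:
    """Return only `sample` URLs when sample > 0, otherwise all of them.

    Assumption: the first `sample` URLs are kept.
    """
    return urls[:sample] if sample > 0 else list(urls)


def build_log_settings(verbose: bool = False) -> Dict[str, str]:
    """Map the verbose flag onto a Scrapy LOG_LEVEL value (assumed mapping)."""
    return {"LOG_LEVEL": "DEBUG" if verbose else "WARNING"}


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    urls = [f"https://example.com/recipes/{i}" for i in range(5)]
    print(select_urls(urls, sample=2))
    print(build_log_settings(verbose=True))
```

With sample=0 the full list is returned unchanged, which matches the described default of crawling every URL.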

Support

Currently only supports recipes from BBC Good Food.

Acknowledgements

The initial CSV containing the list of URLs was taken from /u/draeg82 (Twitter @givemearecipe): https://www.reddit.com/r/datasets/comments/an6n26/are_there_any_freetouse_or_opensource_recipe/

About

License: The Unlicense


Languages

Python 95.6%, C 2.6%, XSLT 0.6%, C++ 0.6%, HTML 0.3%, Objective-C 0.1%, GAP 0.1%, Roff 0.0%, ASP 0.0%, JavaScript 0.0%, PowerShell 0.0%, Shell 0.0%, Batchfile 0.0%, Visual Basic 0.0%