Web scraper for hello fresh
Hello Fresh's API requires specific authorization headers in order to use it. Therefore, we must find a less technical way of fetching recipe data.
I wrote a copy-pastable script (get_urls.js) that could be put into the developer console. It continues to press the "Load More" Button, which loads 8 recipe cards. After no more cards show up on a button press, we collect all of the URLs and download it as a JSON file.
With each Recipe Page URL, I found an excerpt of JSON inside the HTML that provided all of the details for recipes. Using the command: npm start
, I executed my script (src/index.js) to hit all of the URLs I just got and JSDOM all of the JSON data.
The full JSON of URLs is inside urls.json
. The full JSON of Recipes is inside recipes.json
.