mpolom/auntiesrecipes

auntiesrecipes

A searchable archive of BBC Food Recipes.

Homepage: https://www.auntiesrecipes.co.uk

Features I would like help with:

  • Write a scraper for other recipe sites ;)
  • Paginated / infinite-scrollable results
  • Download all the images and add them to the repo so I don't have to hotlink them
  • Better isVegetarian detection in the scraper
  • Search option based on prep/cooking/total time
  • More dietary requirements (vegan, gluten-free, etc)

Download tool

The download can be done either with a shell script that calls wget or with a Node.js download app. Both follow the same steps:

  • The sitemap.xml is pulled first (this file is huge and lists every recipe)
  • Each recipe's HTML file is then pulled, 3-10 at a time depending on config
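
The two steps above can be sketched as follows. This is a minimal illustration, not the code in this repository: the function names (extractUrls, runWithLimit, downloadAll), the output file naming, and the default limit of 3 are all assumptions, and it relies on the global fetch available in Node.js 18 and later.

```javascript
// Hypothetical sketch of the download flow; names are illustrative only.
const fs = require('node:fs/promises');
const path = require('node:path');

// Pull every <loc> URL out of a sitemap XML string.
function extractUrls(sitemapXml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Run `worker` over `items`, keeping at most `limit` calls in flight.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  // Each lane takes the next unclaimed item until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }
  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}

// Fetch the sitemap, then save each listed page into outDir.
// The file naming scheme here is an assumption for illustration.
async function downloadAll(sitemapUrl, outDir, limit = 3) {
  const sitemapXml = await (await fetch(sitemapUrl)).text();
  const urls = extractUrls(sitemapXml);
  await fs.mkdir(outDir, { recursive: true });
  await runWithLimit(urls, limit, async (url) => {
    const html = await (await fetch(url)).text();
    const name = encodeURIComponent(new URL(url).pathname) + '.html';
    await fs.writeFile(path.join(outDir, name), html);
  });
}
```

Capping the number of requests in flight keeps the load on the remote server modest; raising the limit towards 10 trades politeness for speed.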

Download Installation

npm install

Running the tool

Note: this takes a long time and stores over 1 GB of text.

node index.js

Scraping tool

First run the download tool to gather all the HTML files from the BBC website.

Scraping Installation

Use npm to install dependencies such as Cheerio.

cd scraper
npm install

Running the scrape

cd scraper
node scrape.js

Optional: the environment variable RECIPE_DIR can be set.
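
As a rough sketch of how a script might honour this variable: the snippet below assumes RECIPE_DIR names the directory holding the downloaded HTML files, and the fallback directory name (recipes) and the helper resolveRecipeDir are hypothetical, not taken from scrape.js.

```javascript
const path = require('node:path');

// Assumed fallback directory; the repository's real default may differ.
const DEFAULT_RECIPE_DIR = 'recipes';

// Return an absolute path for the recipe directory, preferring RECIPE_DIR
// when it is set to a non-blank value.
function resolveRecipeDir(env = process.env) {
  const configured = (env.RECIPE_DIR || '').trim();
  return path.resolve(configured || DEFAULT_RECIPE_DIR);
}
```

On a POSIX shell the variable can be set for a single run by prefixing the command, for example RECIPE_DIR=/path/to/html node scrape.js.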
