~loges/mychef-scraper

web scraper for fetching recipes


#mychef-scraper


License: AGPL v3

Web scraper for fetching recipes and loading them into Mychef.

Supported crawlers:

To add a crawler, implement the crawl and extract_recipe methods.
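
As a rough illustration of that two-method contract, here is a minimal sketch. Only the method names crawl and extract_recipe come from this README. The Recipe dataclass, the Crawler protocol, the method signatures, the InMemoryCrawler class and the run helper are all hypothetical, and the project's real interface may look different.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Protocol


@dataclass
class Recipe:
    """Hypothetical recipe record; the real project may use another shape."""
    title: str
    ingredients: List[str] = field(default_factory=list)


class Crawler(Protocol):
    """Assumed contract: only the two method names come from the README."""

    def crawl(self) -> Iterator[str]:
        """Yield one raw page (as text) per recipe found."""
        ...

    def extract_recipe(self, page: str) -> Recipe:
        """Turn one raw page into a Recipe."""
        ...


class InMemoryCrawler:
    """Toy crawler that serves canned pages instead of fetching a site."""

    def __init__(self, pages: List[str]) -> None:
        self.pages = pages

    def crawl(self) -> Iterator[str]:
        yield from self.pages

    def extract_recipe(self, page: str) -> Recipe:
        # First non-empty line is the title, the rest are ingredients.
        lines = [line.strip() for line in page.splitlines() if line.strip()]
        return Recipe(title=lines[0], ingredients=lines[1:])


def run(crawler: Crawler) -> List[Recipe]:
    """Drive any object that provides crawl and extract_recipe."""
    return [crawler.extract_recipe(page) for page in crawler.crawl()]
```

A real crawler would fetch pages over HTTP in crawl and parse HTML in extract_recipe. The in-memory version only shows how the two methods fit together.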

#Requirements

The following system dependencies are needed to run the scraper:

#Configuration

The configuration values point the scraper at a running instance of Mychef:

MYCHEF_USERNAME=dummy
MYCHEF_PASSWORD=password
MYCHEF_API_URL=http://localhost:8000
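
The README does not say how the scraper reads these values. Assuming they are supplied as environment variables, a minimal sketch of loading and validating them could look like the following. The MychefConfig class and the load_config function are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variable names taken from the README.
REQUIRED = ("MYCHEF_USERNAME", "MYCHEF_PASSWORD", "MYCHEF_API_URL")


@dataclass(frozen=True)
class MychefConfig:
    """Hypothetical container for the three settings."""
    username: str
    password: str
    api_url: str


def load_config(env: Optional[Mapping[str, str]] = None) -> MychefConfig:
    """Read the settings from env (defaults to os.environ); fail early if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing configuration: " + ", ".join(missing))
    return MychefConfig(
        username=env["MYCHEF_USERNAME"],
        password=env["MYCHEF_PASSWORD"],
        # Drop a trailing slash so paths can be appended consistently.
        api_url=env["MYCHEF_API_URL"].rstrip("/"),
    )
```

Failing at startup with the list of missing names tends to be easier to debug than an authentication error partway through a crawl.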

#Setup

Install scraper dependencies:

./scripts/install

This script downloads the ingredient model (if it is not already installed) and installs the required Python dependencies. The model is needed to extract individual ingredients from recipe text.

#Scraping

Get help for running the scraper:

python -m scraper --help

For example, to run the full_helping crawler:

python -m scraper crawl --source full_helping
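
For readers curious how a command line of this shape can be wired up, here is a minimal argparse sketch. It is not the project's actual entry point. The crawl subcommand, the --source option and the full_helping name come from the example above, while CRAWLERS, build_parser and the help strings are assumptions.

```python
import argparse

# Only "full_helping" is confirmed by the README; the real set may be larger.
CRAWLERS = {"full_helping"}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts: scraper crawl --source NAME."""
    parser = argparse.ArgumentParser(prog="scraper")
    subcommands = parser.add_subparsers(dest="command", required=True)

    crawl = subcommands.add_parser("crawl", help="run one crawler")
    crawl.add_argument(
        "--source",
        required=True,
        choices=sorted(CRAWLERS),
        help="name of the crawler to run",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"would run the {args.source} crawler")
```

Restricting --source with choices makes an unknown crawler name fail with a usage message before any network request is made.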

#License

mychef-scraper is distributed under the terms of the AGPLv3 license.