laraminones/recipes

recipes

This project is a search engine for cooking recipes, built for academic and testing purposes.

Using Elasticsearch, Scrapy, Django and Python, it provides search functionality over a small collection of cooking recipes.

Once the prerequisites are met, the workflow has three stages. First, we crawl a cooking recipes site (recipetineats.com) using Scrapy. Then we load the crawled data into Elasticsearch using a Python script. Finally, we provide search and visualization through a web app written in Python. The app retrieves data from Elasticsearch using the query and aggregation methods provided by the elasticsearch_dsl library.
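As a rough illustration of the query and aggregation step, the sketch below builds the kind of request body Elasticsearch expects for a full-text search combined with a terms aggregation. It uses only the standard library. The function name `build_recipe_search` and the field names `title`, `ingredients` and `category` are assumptions made for this example, not taken from the repo. With elasticsearch_dsl, an equivalent body can be assembled on a `Search` object through its `query()` method and `aggs.bucket()`.

```python
import json


def build_recipe_search(text, facet_field="category", size=10):
    """Build an Elasticsearch request body: a full-text query plus a terms aggregation.

    The field names below are hypothetical; adapt them to the real index mapping.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # Hypothetical text fields to search in.
                "fields": ["title", "ingredients"],
            }
        },
        "aggs": {
            # One bucket per distinct value of facet_field
            # (assumed to be mapped as a keyword field).
            "by_" + facet_field: {"terms": {"field": facet_field}}
        },
    }


if __name__ == "__main__":
    # Print the body that would be sent to the index's _search endpoint.
    print(json.dumps(build_recipe_search("chicken curry"), indent=2))
```

The aggregation part is what makes faceted navigation possible: the response carries both the matching recipes and a count per bucket.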

The following prerequisites and execution instructions were written for an Ubuntu 20.04 LTS system.

Prerequisites

  • Install Python 3.
$ sudo apt-get install python3
  • Install pip.
$ sudo apt-get install python3-pip
  • Install Scrapy and its system dependencies.
$ sudo apt-get install python3 python3-dev python3-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
$ pip3 install Scrapy
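  • Install Elasticsearch. Import the Elastic signing key first so that apt accepts the repository.
$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ sudo apt-get install apt-transport-https
$ echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-7.x.list
$ sudo apt-get update && sudo apt-get install elasticsearch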
  • Install Django and the elasticsearch_dsl library.
$ pip3 install Django==3.1.3
$ pip3 install elasticsearch-dsl
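To confirm that the Python prerequisites above are importable before moving on, a small check like the following may help. It is a minimal sketch using only the standard library; the helper name `check_modules` is made up for this example, and the module names are the import names of Scrapy, Django and elasticsearch-dsl.

```python
import importlib.util

# Import names of the Python packages installed in the steps above.
REQUIRED_MODULES = ("scrapy", "django", "elasticsearch_dsl")


def check_modules(names=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

This only checks that each package can be located by the interpreter; it does not verify versions or that the Elasticsearch service itself is installed.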

Execution

  • Run the Scrapy spider. This step is optional, since the recipes.jsonlines file is already included in the repo. If you still want to re-run the crawl, go to the recipes/ folder, delete recipes.jsonlines and execute the following command.
$ scrapy crawl recipes -o recipes.jsonlines:jsonlines
  • Launch Elasticsearch as a service.
$ sudo systemctl start elasticsearch.service
  • Load the crawled data into Elasticsearch using the index_loader.py Python script, which is located in recipes/.
$ python3 index_loader.py

Note that the recipes.jsonlines file must be in the same folder as the Python script.

  • Launch the Django app.
$ python3 manage.py runserver
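The loading step in the list above can be pictured roughly as follows. This is not the repo's index_loader.py, only a minimal sketch of the general idea: read a JSON Lines file one record at a time and wrap each record in an action dict of the shape accepted by the `bulk` helper in `elasticsearch.helpers`. The function name `iter_bulk_actions`, the index name `recipes` and the sample fields are assumptions for this example.

```python
import io
import json


def iter_bulk_actions(lines, index_name="recipes"):
    """Yield one bulk-index action per non-empty JSON line.

    index_name is a hypothetical default; the real script may use another index.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        yield {"_index": index_name, "_source": json.loads(line)}


if __name__ == "__main__":
    # Stand-in for open("recipes.jsonlines"); the records are invented samples.
    sample = io.StringIO('{"title": "Example recipe"}\n\n{"title": "Another example"}\n')
    for action in iter_bulk_actions(sample):
        print(action)
```

Because the function is a generator, the file is never held in memory as a whole, and the resulting iterable could be handed directly to a bulk indexing call against a running Elasticsearch instance.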

Try the app

Navigate to localhost:8000/ in your preferred browser to see the app's main page.
