ManhTin / plant-recommender-datasource

Data Analysis and Dataset generation for the plant recommender backend


Plant Recommender App

Data sourcing for the Homeplant Recommender System project in Data Integration 22/23 at WWU Münster.

Data Sources

Cleaned Data

The cleaned dataset contains 1682 indoor and outdoor plants with the following attributes:

Attribute Indoor Outdoor
active_growth_period - x
bloom_period - x
climate x -
common_name x x
difficulty x -
drought_tolerance x x
duration - x
family - x
family_common_name - x
flower_color - x
foliage_color x x
foliage_porosity_summer - x
foliage_porosity_winter - x
frost_free_days - x
fruit_color - x
growth_habit x x
growth_rate - x
height x x
humidity x -
image x ~
leaf_shape x x
lifespan - x
light x -
origin x x
ph_minimum - x
ph_maximum - x
scientific_name x x
temperature x -
toxicity x x
type x x
width x -
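
When combining indoor and outdoor records, it helps to know which attributes the table marks for both groups. The sketch below is only an illustration: it copies the table into a string and reads "x" as "attribute listed for this group", which is an assumption. The meaning of "~" is not defined here, so it is kept as-is and not counted as "x". The function names are hypothetical and not part of the repository.

```python
# Attribute availability copied from the table above: name, indoor, outdoor.
# Assumption: "x" means the attribute is listed for that group, "-" that it
# is not. "~" is not defined in the README, so it is left untouched.
TABLE = """\
active_growth_period - x
bloom_period - x
climate x -
common_name x x
difficulty x -
drought_tolerance x x
duration - x
family - x
family_common_name - x
flower_color - x
foliage_color x x
foliage_porosity_summer - x
foliage_porosity_winter - x
frost_free_days - x
fruit_color - x
growth_habit x x
growth_rate - x
height x x
humidity x -
image x ~
leaf_shape x x
lifespan - x
light x -
origin x x
ph_minimum - x
ph_maximum - x
scientific_name x x
temperature x -
toxicity x x
type x x
width x -
"""


def parse_availability(table: str) -> dict[str, tuple[str, str]]:
    """Map each attribute name to its (indoor, outdoor) markers."""
    availability = {}
    for line in table.strip().splitlines():
        name, indoor, outdoor = line.split()
        availability[name] = (indoor, outdoor)
    return availability


def attributes_in_both(availability: dict[str, tuple[str, str]]) -> list[str]:
    """Return the attributes marked "x" for indoor and outdoor plants."""
    return sorted(
        name for name, markers in availability.items() if markers == ("x", "x")
    )


availability = parse_availability(TABLE)
shared = attributes_in_both(availability)
print(f"{len(availability)} attributes, {len(shared)} marked x for both groups")
print(shared)
```

Attributes marked for only one group, such as climate or bloom_period, would be empty for the other group in a combined table.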

Installation and Setup

Getting started

Prerequisites

  • Python 3.10
    • We recommend using miniforge (or miniconda), since it lets you manage both Python versions and virtual environments. If you use miniforge (or miniconda), follow the setup steps below.
    • After miniforge (or miniconda) is installed successfully, the command conda should be available.
    • If you manage virtual environments another way, start the setup guide at step 4.
  • Set up project
    1. cd into this directory
    2. Create virtual environment for project: conda create --name plant-recommender-datasource python=3.10
    3. Activate virtual environment of project: conda activate plant-recommender-datasource
    4. Install dependencies into virtual env: pip install -r requirements.txt
    5. Start jupyter server: jupyter notebook
  • Run scraper
    1. howmanyplants.com scraper: python scrapers/how_many_plants_scraper.py -> exports to data/...
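
Before starting the notebooks or the scraper, it can be useful to confirm that the active environment really provides the Python version named under Prerequisites. The following is a minimal sketch, not part of the repository. The function name is hypothetical, and the strict match on 3.10 is an assumption, since the README does not say whether newer versions also work.

```python
import sys

# Python version named under Prerequisites.
REQUIRED = (3, 10)


def is_required_python(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version equals REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_required_python():
        print(f"Python {found}: matches the required 3.10")
    else:
        # Assumption: a mismatch usually means the conda environment
        # created in step 2 has not been activated.
        print(f"Python {found}: expected 3.10, is the virtual environment active?")
```

Running this inside the activated plant-recommender-datasource environment should report a 3.10.x interpreter.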


