souravcipher / EnvisEdge

Deploy recommendation engines with Edge Computing

Home Page: https://www.nimbleedge.ai


EnvisEdge
Bringing Recommendations to the Edge

A one-stop solution to build your recommendation models, train them, and deploy them in a privacy-preserving manner, right on the users' devices.

EnvisEdge allows you to easily explore new federated learning algorithms and deploy them into production.

The steps to building an awesome recommendation system are:

  1. 🔩 Standard ML training: Pick any ML model and benchmark it using standard settings.
  2. 🎮 Federated Learning Simulation: Once you are satisfied with your model, explore a host of FL algorithms with the simulator.
  3. 🏭 Industrial Deployment: After all the testing and simulation, deploy easily using the NimbleEdge suite.
  4. 🚀 Edge Computing: Leverage all the benefits of edge computing.
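
To make step 2 more concrete, here is a minimal sketch of federated averaging (FedAvg), one widely used FL algorithm. It is plain Python for illustration only and does not use EnvisEdge's API; the function name `federated_average` and the toy client data are assumptions, and the simulator may implement its strategies differently.

```python
from typing import List, Sequence, Tuple


def federated_average(
    client_updates: Sequence[Tuple[List[float], int]]
) -> List[float]:
    """Average client weight vectors, weighted by each client's sample count.

    Each update is a (weights, num_samples) pair.
    Hypothetical helper for illustration, not part of EnvisEdge.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, num_samples in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of equal length")
        # Clients with more local data contribute proportionally more.
        for i, w in enumerate(weights):
            averaged[i] += w * num_samples / total_samples
    return averaged


# Toy data: three simulated devices with 10, 30 and 60 local samples.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30), ([5.0, 6.0], 60)]
print(federated_average(updates))
```

In a real simulation the weight vectors would be model parameters trained locally on each device; only these parameters, not the raw user data, reach the aggregator.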

Repo Structure 🏢

NimbleEdge/EnvisEdge
├── CONTRIBUTING.md           <-- Please go through the contributing guidelines before starting 🤓
├── README.md                 <-- You are here 📌
├── docs                      <-- Tutorials and walkthroughs 🧐
├── experiments               <-- Recommendation models used by our services
├── fedrec                    <-- Whole magic takes place here 😜
│    ├── communications          <-- Modules for communication interfaces, e.g. Kafka
│    ├── multiprocessing         <-- Modules to run parallel worker jobs
│    ├── python_executors        <-- Contains worker modules, e.g. trainer and aggregator
│    ├── serialization           <-- Message serializers
│    └── utilities               <-- Helper modules
├── fl_strategies             <-- Federated learning algorithms for our services
└── notebooks                 <-- Jupyter Notebook examples

QuickStart

Let's train Facebook AI's DLRM on the edge. DLRM is a standard baseline for neural-network-based recommendation models.

Clone this repo and set the datafile argument in configs/dlrm_fl.yml to the path of your Criteo dataset file.

git clone https://github.com/NimbleEdge/EnvisEdge

model :
  name : 'dlrm'
  ...
  preproc :
    datafile : "<Path to Criteo>/criteo/train.txt"
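
Before starting a long run, it can help to confirm that the datafile entry really points at a file. The sketch below assumes the config has already been loaded into a dictionary with the nesting shown above (model, then preproc, then datafile); loading the YAML itself needs a YAML parser and is not shown. The helper names `get_datafile` and `datafile_problem` are hypothetical and not part of EnvisEdge.

```python
from pathlib import Path
from typing import Any, Mapping, Optional


def get_datafile(config: Mapping[str, Any]) -> Path:
    """Return the dataset path stored under model -> preproc -> datafile."""
    try:
        raw = config["model"]["preproc"]["datafile"]
    except (KeyError, TypeError) as exc:
        raise KeyError("expected model.preproc.datafile in the config") from exc
    return Path(raw).expanduser()


def datafile_problem(config: Mapping[str, Any]) -> Optional[str]:
    """Describe what is wrong with the datafile entry, or return None if it looks fine."""
    path = get_datafile(config)
    if "<" in str(path):
        # The angle-bracket placeholder from the template was left in place.
        return f"placeholder not replaced: {path}"
    if not path.is_file():
        return f"file not found: {path}"
    return None


# Hand-written dictionary mirroring the YAML fragment above.
config = {
    "model": {
        "name": "dlrm",
        "preproc": {"datafile": "<Path to Criteo>/criteo/train.txt"},
    }
}
print(datafile_problem(config))
```

With the untouched template value, the check reports that the placeholder was not replaced.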
 

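Install the dependencies with conda or pip. For example, using virtualenv and pip:

mkdir env
cd env
virtualenv envisedge
source envisedge/bin/activate
# Assumption: requirements.txt sits in the repository root, one level above env/
pip3 install -r ../requirements.txt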

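Download Kafka from the Apache Kafka downloads page and start the Kafka server using the following commands. Start ZooKeeper first, then the broker, each in its own terminal from the Kafka installation directory: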

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
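
The topic-creation commands that follow need a broker listening on localhost:9092. A small check like the one below can confirm that something accepts TCP connections on that port before you continue. This is a sketch using only the Python standard library; `port_is_open` and `wait_for_port` are hypothetical helper names, and an open port only shows that a process is listening, not that Kafka is fully healthy.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll host:port until it accepts connections or attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False


# Single check against the bootstrap server address used in this guide.
if port_is_open("localhost", 9092):
    print("something is listening on localhost:9092")
else:
    print("nothing is listening on localhost:9092 yet")
```

Call `wait_for_port("localhost", 9092)` instead of the single check if you script the startup and want to block until the broker comes up.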

Create the Kafka topics for the job executor:

bin/kafka-topics.sh --create --topic job-request-aggregator --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
bin/kafka-topics.sh --create --topic job-request-trainer --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
bin/kafka-topics.sh --create --topic job-response-aggregator --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
bin/kafka-topics.sh --create --topic job-response-trainer --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
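
The four topics follow one pattern: a request and a response topic for each worker type. The sketch below generates the same commands from that pattern, which avoids copy-paste mistakes if more worker types are added later. The helper names are hypothetical; the flags are exactly those used in the commands above, and the default dry run only prints the commands.

```python
import subprocess
from typing import List

WORKERS = ["aggregator", "trainer"]
DIRECTIONS = ["request", "response"]


def topic_names() -> List[str]:
    """Build job-<direction>-<worker> names for every combination."""
    return [f"job-{d}-{w}" for d in DIRECTIONS for w in WORKERS]


def create_topic_command(topic: str, bootstrap_server: str = "localhost:9092") -> List[str]:
    """Return the kafka-topics.sh argument list for creating one topic."""
    return [
        "bin/kafka-topics.sh", "--create",
        "--topic", topic,
        "--bootstrap-server", bootstrap_server,
        "--partitions", "1",
        "--replication-factor", "1",
    ]


def create_all_topics(dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False."""
    for topic in topic_names():
        cmd = create_topic_command(topic)
        if dry_run:
            print(" ".join(cmd))
        else:
            # Must be run from the Kafka installation directory.
            subprocess.run(cmd, check=True)


create_all_topics(dry_run=True)
```

Passing `dry_run=False` executes the commands and raises an error if any of them fails.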

To start the multiprocessing executor, run the following command:

python executor.py --config configs/dlrm_fl.yml

Change the path in configs/dlrm_fl.yml to your data path.

preproc :
    datafile : "<Your path to data>/criteo_dataset/train.txt"

Run data preprocessing with preprocess_data.py and supply the config file. This should generate a per-day split of the entire dataset as well as a processed data file.

python preprocess_data.py --config configs/dlrm_fl.yml --logdir $HOME/logs/kaggle_criteo/exp_1
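
The internals of the preprocessing step are not described here. As a rough illustration of what a per-day split means, the sketch below cuts an ordered list of records into contiguous, near-equal chunks, one per day. The function name `split_by_day` and the equal-size rule are assumptions made for illustration; the actual preprocess_data.py may split the data differently.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_by_day(records: Sequence[T], num_days: int) -> List[List[T]]:
    """Split ordered records into num_days contiguous, near-equal chunks.

    Hypothetical helper: earlier days receive one extra record when the
    total does not divide evenly.
    """
    if num_days <= 0:
        raise ValueError("num_days must be positive")
    base, extra = divmod(len(records), num_days)
    chunks: List[List[T]] = []
    start = 0
    for day in range(num_days):
        size = base + (1 if day < extra else 0)
        chunks.append(list(records[start:start + size]))
        start += size
    return chunks


# Toy data: ten rows split into three "days".
rows = [f"row-{i}" for i in range(10)]
for day, chunk in enumerate(split_by_day(rows, 3)):
    print(day, len(chunk))
```

Keeping the chunks contiguous preserves the time order of the records, which matters when later days are held out for evaluation.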

Begin Training

python train.py --config configs/dlrm_fl.yml --logdir $HOME/logs/kaggle_criteo/exp_3 --num_eval_batches 1000 --devices 0

Run TensorBoard to view the training loss and validation metrics at localhost:8888:

tensorboard --logdir $HOME/logs/kaggle_criteo --port 8888

Contribute

  1. Please go through our CONTRIBUTING guidelines before starting.
  2. Star, fork, and clone the repo.
  3. Do your work.
  4. Push to your fork.
  5. Submit a PR to NimbleEdge/EnvisEdge.

Join us on Discord for questions about the library or about contributing in general.

About

License: Apache License 2.0


Languages

Python 78.9%, Scala 19.8%, Jupyter Notebook 1.3%, Shell 0.1%