YongGuCheng / PyGrid

A Peer-to-peer Platform for Secure, Privacy-preserving, Decentralized Data Science

PyGrid is a peer-to-peer network of data owners and data scientists who can collectively train AI models using PySyft.

Overview

Architecture

The PyGrid platform is composed of three components:

PyGrid App - A Flask-based application used to manage, monitor, control, and route grid Nodes/Workers remotely.
Grid Nodes - Server-based apps used to store data and manage access to it in a secure and private way.
Grid Workers - Client-based apps that use different Syft-based libraries to perform federated learning (e.g. syft.js, KotlinSyft, SwiftSyft).

Getting started

To boot the entire PyGrid platform locally, we will use Docker containers. To install Docker and its dependencies, follow the Docker documentation.

Start Grid platform locally

1 - Using Docker

The latest PyGrid Gateway and Node images are available on the Docker Hub.

  • PyGrid - openmined/grid-gateway
  • Grid Node - openmined/grid-node

1.1 - Setting the Domain Names

Before starting the grid platform locally with Docker, we need to set up the domain names used by the bridge network. To reach these nodes from outside the containers, add the following domain names to your /etc/hosts file:

127.0.0.1 gateway
127.0.0.1 bob
127.0.0.1 alice
127.0.0.1 bill
127.0.0.1 james
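As a quick sanity check before starting the containers, the entries above can be verified with a small script. This is only a sketch and not part of PyGrid: the helper name missing_hosts is hypothetical, and it assumes the five names listed above are the ones your docker-compose.yml uses.

```python
# Hypothetical helper (not shipped with PyGrid): report which of the
# domain names listed above are not yet mapped to 127.0.0.1.
REQUIRED_HOSTS = ("gateway", "bob", "alice", "bill", "james")


def missing_hosts(hosts_text, required=REQUIRED_HOSTS, address="127.0.0.1"):
    """Return the required names that `hosts_text` does not map to `address`."""
    mapped = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if fields[0] == address:
            mapped.update(fields[1:])  # one line may carry several names
    return [name for name in required if name not in mapped]


if __name__ == "__main__":
    try:
        with open("/etc/hosts") as fh:
            print("Missing entries:", missing_hosts(fh.read()))
    except OSError as exc:
        print("Could not read /etc/hosts:", exc)
```

An empty list means every name is already mapped. If you change the node names in docker-compose.yml, pass your own tuple as the required argument.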

1.2 - Run Docker Images

To set up and start the PyGrid platform, you just need to start the docker-compose process.

$ docker-compose up

This downloads the latest OpenMined Docker images and starts a grid platform with 1 gateway and 4 grid nodes.
PS: Feel free to increase or decrease the number of initial PyGrid nodes by editing the docker-compose.yml file.

1.3 - Build your own images (Optional)

$ docker build -t openmined/grid-node ./app/websocket/  # Build PyGrid node image
$ docker build -t openmined/grid-gateway ./gateway/  # Build gateway image

2 - Starting manually

To start the PyGrid app manually, run:

python grid.py 

You can set the gateway configuration with command-line arguments or environment variables.

Arguments

  -h, --help                show the help message and exit
  -p [PORT], --port [PORT]  port to run the server on (default: 5000)
  --host [HOST]             the grid gateway host
  --num_replicas            the number of replicas to provide fault tolerance for model hosting
  --start_local_db          if this flag is set, a SQLAlchemy DB URI is generated to use a local db

Environment Variables

  • GRID_GATEWAY_PORT - Port to run server on.
  • GRID_GATEWAY_HOST - The grid gateway host
  • NUM_REPLICAS - Number of replicas to provide fault tolerance to model hosting
  • DATABASE_URL - The gateway database URL
  • SECRET_KEY - The secret key
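The sketch below shows one way the arguments and environment variables above could be combined. It is not the code in grid.py: the function name parse_gateway_config is hypothetical, and letting a command-line argument override the matching environment variable is an assumption, as is reading DATABASE_URL and SECRET_KEY from the environment only.

```python
import argparse
import os


def parse_gateway_config(argv=None, env=None):
    """Resolve gateway settings from CLI arguments and environment variables.

    Assumption: a CLI argument wins over its environment variable.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Gateway config sketch")
    parser.add_argument("-p", "--port", type=int,
                        default=int(env.get("GRID_GATEWAY_PORT", 5000)))
    # No default host is documented above, so None is used when unset.
    parser.add_argument("--host", default=env.get("GRID_GATEWAY_HOST"))
    parser.add_argument("--num_replicas", type=int,
                        default=int(env["NUM_REPLICAS"])
                        if "NUM_REPLICAS" in env else None)
    parser.add_argument("--start_local_db", action="store_true")
    args = parser.parse_args(argv)

    config = vars(args)
    # These two settings have no CLI flag in the list above.
    config["database_url"] = env.get("DATABASE_URL")
    config["secret_key"] = env.get("SECRET_KEY")
    return config


if __name__ == "__main__":
    print(parse_gateway_config([]))
```

For example, with GRID_GATEWAY_PORT=6000 in the environment, calling parse_gateway_config(["--port", "7000"]) yields port 7000 under the override assumption stated above.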

For development purposes

You can also start the PyGrid app by running the dev_server.sh script.

$ ./dev_server.sh

This script uses dev_server.conf.py as its configuration file, which holds some gunicorn preferences and environment variables. The file is pre-populated with the default environment variables. You can change them by editing the following property:

raw_env = [
    'PORT=5000',
    'SECRET_KEY=ineedtoputasecrethere',
    'DATABASE_URL=sqlite:///databasegateway.db',
]
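Each raw_env entry is a KEY=VALUE string that gunicorn places in the server's environment. To preview the variables a given list would set, the entries can be split as in the sketch below. The helper raw_env_to_dict is hypothetical and only illustrates the format; it does not reproduce gunicorn's own parsing.

```python
# Same entries as the default raw_env shown above.
raw_env = [
    "PORT=5000",
    "SECRET_KEY=ineedtoputasecrethere",
    "DATABASE_URL=sqlite:///databasegateway.db",
]


def raw_env_to_dict(entries):
    """Split 'KEY=VALUE' strings into a dict, splitting on the first '=' only."""
    result = {}
    for entry in entries:
        key, sep, value = entry.partition("=")
        if not sep or not key:
            raise ValueError("expected KEY=VALUE, got %r" % entry)
        result[key] = value
    return result


if __name__ == "__main__":
    for key, value in raw_env_to_dict(raw_env).items():
        print(key, "->", value)
```

Splitting on the first "=" only keeps values that contain "=" themselves, such as some database URLs with query parameters, intact.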

Kubernetes deployment

You can also deploy the PyGrid app and Grid Node Docker containers on Kubernetes, either to a local cluster (minikube) or to a remote cluster (GKE, EKS, AKS, etc.). The steps to set up the cluster can be found in ./k8s/Readme.md

Try out the Tutorials

A comprehensive list of tutorials can be found here.

These tutorials cover how to create a PyGrid node and what operations you can perform.

Start Contributing

The guide for contributors can be found here. It covers all that you need to know to start contributing code to PyGrid in an easy way.

Also join the rapidly growing community of 7300+ members on Slack. The Slack community is very friendly and quick to answer questions about the use and development of PyGrid/PySyft!

We also have a Github Project page for a Federated Learning MVP here.
You can check PyGrid's official development and community roadmap here.

High-level Architecture

Disclaimer

Do NOT use this code to protect data (private or otherwise) - at present it is very insecure.

License

Apache License 2.0
