icloud-ecnu's repositories

igniter

iGniter is an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud.

lambdadnn

λDNN is a cost-efficient function resource provisioning framework that minimizes monetary cost while guaranteeing performance for distributed DNN (DDNN) training workloads on serverless platforms.

Language: Python | Stargazers: 22 | Issues: 1 | Issues: 0

Tetris

Tetris is a model predictive control (MPC)-based container scheduling strategy that judiciously makes migration decisions for long-running containerized workloads. Tetris optimizes container scheduling over the long term and avoids as many invalid migrations as possible by jointly optimizing cluster load balancing and container migration cost over a sliding time window.

Opara

Opara is a lightweight and resource-aware DNN operator parallel scheduling framework that accelerates DNN inference on GPUs.

Language: Python | Stargazers: 16 | Issues: 2 | Issues: 0

Prophet

Prophet is a predictable communication scheduling strategy that schedules gradient transfers in an appropriate order to maximize GPU and network resource utilization.

Language: Python | Stargazers: 16 | Issues: 1 | Issues: 0

spotDNN

spotDNN is a heterogeneity-aware spot instance provisioning framework to provide predictable performance for DDNN training workloads in the cloud.

Language: Python | License: MIT | Stargazers: 16 | Issues: 0 | Issues: 0

delaystage

DelayStage is a simple yet effective stage delay scheduling strategy that interleaves cluster resources across parallel stages to increase cluster resource utilization and speed up job execution.

Language: Scala | Stargazers: 15 | Issues: 0 | Issues: 0

ebrowser

ebrowser is an energy-efficient and lightweight human interaction framework for mobile Web browsers that does not degrade the user experience.

Language: C++ | Stargazers: 13 | Issues: 0 | Issues: 0

ispot

iSpot is a lightweight and cost-effective instance provisioning framework for Directed Acyclic Graph (DAG)-style big data analytics. It guarantees application performance on cloud transient servers (e.g., EC2 spot instances, GCE preemptible instances) while minimizing the budget cost.

Language: Scala | Stargazers: 12 | Issues: 0 | Issues: 0

paper-reading-list

Paper reading list for the iCloud group

Stargazers: 12 | Issues: 0 | Issues: 0

k8s_primary

How to run distributed TensorFlow in a Kubernetes cluster

Language: Python | Stargazers: 8 | Issues: 0 | Issues: 0