Deep Q Learning for Aggregate Computing Program Scheduling
This repository showcases an experiment on Deep Q Learning applied to the scheduling of collective computations, specifically aggregate computing programs.
Structure
The system consists of N nodes, each executing a local aggregate program.
Dynamics
Each node constructs a local state.
Deep Q Learning Mode
The scheduling policy is learned over one of two candidate state spaces:
- State space 1: the last $w$ states of the node, where $w$ is the window size.
- State space 2: the current state of the node and the previous state of the neighbourhood.
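State space 1 can be sketched as a fixed-size sliding window over the node's local observations. This is a minimal illustration, assuming scalar observations and zero-padding before the window fills; the class name, encoding, and padding strategy are assumptions, not the repository's actual implementation.

```python
from collections import deque


class LocalStateWindow:
    """Sliding window of the last w local observations (State space 1).

    The window size w, scalar observations, and zero-padding are
    illustrative assumptions; the repository may encode states differently.
    """

    def __init__(self, w: int):
        self.w = w
        self.buffer = deque(maxlen=w)  # oldest observation is dropped automatically

    def push(self, observation: float) -> None:
        self.buffer.append(observation)

    def state(self) -> list:
        # Pad with zeros until w observations have been collected,
        # so the Q-network always receives a fixed-size input vector.
        padding = [0.0] * (self.w - len(self.buffer))
        return padding + list(self.buffer)
```

For example, with `w = 3`, pushing `1.0` and `2.0` yields the state `[0.0, 1.0, 2.0]`, and pushing two more values slides the window forward.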
The approach is similar to the one presented in the initial contributions on QL for scheduling. Rather than manually crafting the state space with -1, 0, and 1 values to represent decreasing, stable, or increasing trends of the local output, I employed a neural network to learn the trend of the local output directly. However, this latter approach did not yield satisfactory results.
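The hand-crafted encoding mentioned above can be sketched as a simple discretisation of the change in local output. The tolerance `epsilon` is an assumption added here to make "stable" well-defined for floating-point outputs; it is not taken from the repository.

```python
def trend(previous: float, current: float, epsilon: float = 1e-6) -> int:
    """Discretise the local output trend: -1 decreasing, 0 stable, 1 increasing.

    epsilon is a hypothetical tolerance below which the output is
    considered stable.
    """
    delta = current - previous
    if delta > epsilon:
        return 1
    if delta < -epsilon:
        return -1
    return 0
```

In the alternative explored here, this manual discretisation is replaced by letting the neural network infer the trend from raw output values, which, as noted, did not perform satisfactorily.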
Scenario 1: Gradient with changing source
In this scenario, the shared global program computes the gradient of a source. The source node changes after 100 seconds (the total simulation time is 200 seconds). The initial source is selected randomly.
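One round of the classic self-stabilising gradient can be sketched as follows: the source holds distance 0, and every other node takes the minimum over its neighbours of the neighbour's distance plus the edge metric. The graph representation and function signature are illustrative assumptions, not the repository's API; repeating the round until a fixpoint lets the field re-stabilise after the source changes.

```python
import math


def gradient_round(distances: dict, neighbours: dict, metric: dict, source) -> dict:
    """One synchronous round of the self-stabilising gradient field.

    distances:  current distance estimate per node
    neighbours: adjacency list, node -> list of neighbour nodes
    metric:     edge weights, (node, neighbour) -> distance
    source:     the node currently acting as source
    """
    new = {}
    for node, nbrs in neighbours.items():
        if node == source:
            new[node] = 0.0  # the source is always at distance 0
        else:
            # Minimise over neighbour estimates plus the connecting edge metric.
            candidates = [distances[n] + metric[(node, n)] for n in nbrs]
            new[node] = min(candidates, default=math.inf)
    return new
```

On a three-node line graph A-B-C with unit edges and source A, three rounds suffice to stabilise the field at distances 0, 1, 2; re-running the rounds with a different `source` argument models the source change at 100 seconds.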