4g / therml

Blog for therml

Using data to cool data centers

Data centers consume 2–3% of the world's power¹, and 30–50% of that power goes into keeping them cool². Several mechanisms work together to carry heat out of a data center and reject it into the atmosphere. Each mechanism is governed by its own local control system. In this post, we detail how to control this system of systems more efficiently.

Problem

Why are cooling systems inefficient?

  • Local controls
  • Tacit knowledge
  • Complex interaction
  • Difficult to model

Approach

Can we design a better control system?

  • Data-based modelling
  • Fixed point optimisation
  • Reinforcement learning on data model
  • Reinforcement learning directly on system
  • Continuous control

Let us try this on a simple simulator

  • Environment
    • Red balls are hot, blue balls are cold
    • Physics engine simulates motion of balls
    • Reward is given when all servers have cooled down
    • Time penalty for taking too long
    • Pymunk engine
  • TRPO agent
  • Results
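The reward structure listed above (a reward once all servers have cooled down, plus a time penalty) can be sketched without a physics engine. The following is only a toy stand-in, not the Pymunk simulator: the class `ToyCoolingEnv`, the helper `run_episode`, and every constant are assumptions made for illustration. A TRPO agent would take the place of the hand-written policy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCoolingEnv:
    """Toy stand-in for the ball simulator (all numbers are illustrative)."""
    n_servers: int = 3
    start_temp: float = 10.0     # initial temperature, arbitrary units
    target_temp: float = 1.0     # a server counts as cooled at or below this
    heat_per_step: float = 0.5   # heat each server generates per step
    max_cooling: float = 2.0     # heat removed per step at full effort
    time_penalty: float = 0.1    # charged on every step
    cooled_reward: float = 10.0  # paid once all servers are cooled
    max_steps: int = 200
    temps: list = field(default_factory=list)
    steps: int = 0

    def reset(self):
        self.temps = [self.start_temp] * self.n_servers
        self.steps = 0
        return tuple(self.temps)

    def step(self, action):
        """action[i] in [0, 1] is the cooling effort aimed at server i."""
        self.steps += 1
        new_temps = []
        for temp, effort in zip(self.temps, action):
            effort = min(max(effort, 0.0), 1.0)
            new_temps.append(
                max(0.0, temp + self.heat_per_step - self.max_cooling * effort)
            )
        self.temps = new_temps
        cooled = all(t <= self.target_temp for t in self.temps)
        reward = -self.time_penalty + (self.cooled_reward if cooled else 0.0)
        done = cooled or self.steps >= self.max_steps
        return tuple(self.temps), reward, done


def run_episode(env, policy):
    """Roll out one episode; returns (steps taken, total reward)."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return env.steps, total
```

For example, `run_episode(ToyCoolingEnv(), lambda obs: [1.0] * len(obs))` runs a policy that always cools at full effort. A policy that never cools only accumulates the time penalty until the step limit is reached.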

Data center simulation

Solution

Modelling a real data center

  • Sensor data from a real DC: a glance at the data and simple EDA
  • Part-based models
    • Time delay in action
    • LSTMs to simulate individual parts
    • Each part connected to another
    • Part connection graph
    • State Machine composed of these parts is our simulator
    • Controls and latent variables
    • Accuracy of simulation
    • Model sanity check
  • By-product: predictive maintenance
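The part-based structure above (parts connected in a graph, each reacting to its inputs after a delay, stepped together as one state machine) can be sketched as follows. This is a minimal illustration under stated assumptions: hand-written functions stand in for the LSTMs, and the names `DelayedPart`, `PartGraphSimulator`, `chiller`, and `pahu`, along with all numbers, are hypothetical.

```python
from collections import deque


class DelayedPart:
    """One part of the plant; its output lags its inputs by `delay` ticks."""

    def __init__(self, name, inputs, delay, fn, initial):
        self.name = name
        self.inputs = inputs  # names of controls or upstream parts
        self.fn = fn          # stand-in for a learned model (e.g. an LSTM)
        self.buffer = deque([initial] * delay)  # outputs not yet visible

    def step(self, signals):
        self.buffer.append(self.fn(*[signals[name] for name in self.inputs]))
        return self.buffer.popleft()


class PartGraphSimulator:
    """State machine built from parts; `parts` must be in dependency order."""

    def __init__(self, parts):
        self.parts = parts

    def step(self, controls):
        signals = dict(controls)
        for part in self.parts:
            signals[part.name] = part.step(signals)
        return signals


# Hypothetical two-part plant: the chiller feeds the air handling unit.
sim = PartGraphSimulator([
    DelayedPart("chiller", ["chiller_setpoint"], delay=2,
                fn=lambda setpoint: setpoint, initial=12.0),
    DelayedPart("pahu", ["chiller"], delay=1,
                fn=lambda water: water + 10.0, initial=22.0),
])

# Lower the setpoint from 12 to 8 and watch when the supply air responds.
supply_air = [sim.step({"chiller_setpoint": 8.0})["pahu"] for _ in range(4)]
print(supply_air)  # → [22.0, 22.0, 22.0, 18.0]
```

The setpoint change reaches the supply air only on the fourth tick, because the two delays add up along the path through the graph. This is the kind of lag a controller has to anticipate.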

Simple optimisation on data model

  • Better setpoints according to weather
  • Reacting with a chiller instead of PAHUs
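Choosing a better setpoint for the current weather can be framed as a small search over a data model. The sketch below assumes a hypothetical power model and inlet temperature model (`toy_power_model`, `toy_inlet_temp`) and an assumed inlet limit of 27 °C. In practice the learned simulator would replace both toy functions.

```python
def toy_power_model(setpoint_c, outside_c):
    """Hypothetical stand-in for the data model: cooling power in kW."""
    # Assumption: power grows with the gap between weather and setpoint.
    return 50.0 + 4.0 * max(0.0, outside_c - setpoint_c)


def toy_inlet_temp(setpoint_c):
    """Hypothetical server inlet temperature for a chilled-water setpoint."""
    return setpoint_c + 12.0


def best_setpoint(outside_c, candidates, max_inlet_c=27.0):
    """Pick the feasible setpoint with the lowest predicted power."""
    feasible = [s for s in candidates if toy_inlet_temp(s) <= max_inlet_c]
    if not feasible:
        raise ValueError("no candidate setpoint keeps servers within limits")
    return min(feasible, key=lambda s: toy_power_model(s, outside_c))


candidates = [float(s) for s in range(6, 19)]  # 6 °C to 18 °C in 1 °C steps
hot_day = best_setpoint(30.0, candidates)
print(hot_day)  # → 15.0
```

On a hot day the search settles on the warmest setpoint that still respects the inlet limit. The constraint, not the power model, decides the answer.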

PUE optimization
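PUE (power usage effectiveness) is the ratio of total facility energy to the energy used by IT equipment, so a value of 1.0 would mean no overhead at all. A minimal sketch of the calculation follows. The example figures are illustrative and assume, for simplicity, that everything other than cooling is IT load.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Illustrative only: cooling takes 40% of a 1000 kWh total, the rest is IT.
total_kwh = 1000.0
cooling_kwh = 0.40 * total_kwh
example_pue = pue(total_kwh, total_kwh - cooling_kwh)
print(round(example_pue, 2))  # → 1.67
```

Under the same simplifying assumption, the 30–50% cooling share quoted in the introduction corresponds to a PUE between roughly 1.4 and 2.0.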

RL policies

  • Action space of controls
  • Agent
  • Rewards
  • Results
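To make the action space and reward concrete, here is one possible shape for both. Everything in it is an assumption for illustration: the control names, their ranges, the 27 °C inlet limit, and the penalty weight are hypothetical, and the reward actually used may differ.

```python
# Hypothetical continuous controls and their physical ranges.
ACTION_BOUNDS = {
    "chilled_water_setpoint_c": (6.0, 15.0),
    "fan_speed_fraction": (0.3, 1.0),
}


def scale_action(raw_action):
    """Map agent outputs in [-1, 1] onto the physical range of each control."""
    scaled = {}
    for name, (low, high) in ACTION_BOUNDS.items():
        raw = min(max(raw_action[name], -1.0), 1.0)
        scaled[name] = low + (raw + 1.0) / 2.0 * (high - low)
    return scaled


def reward(power_kw, inlet_temps_c, max_inlet_c=27.0, overheat_weight=10.0):
    """Negative power draw, minus a penalty for each degree of overheating."""
    overheat = sum(max(0.0, t - max_inlet_c) for t in inlet_temps_c)
    return -power_kw - overheat_weight * overheat
```

With this shape the agent is paid for saving power but loses more than it gains if it lets any server inlet run hot. The relative weight of the two terms is a design choice that has to be tuned.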

Taking to production

System design

  • Client side push
  • Time series database
  • Log cuts for model training
  • Model updates using dependency tree
  • What is a policy and how to deploy one?
  • Monitoring
  • Fallback and safety mechanisms
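A fallback and safety layer can sit between the policy and the plant. The sketch below shows one possible arrangement, not the deployed design: `safe_action`, the single control name, its limits, the default value, and the staleness threshold are all hypothetical.

```python
import math
import time

# Hypothetical hard limits and known-safe defaults for each control.
LIMITS = {"chilled_water_setpoint_c": (6.0, 15.0)}
SAFE_DEFAULTS = {"chilled_water_setpoint_c": 10.0}


def safe_action(policy, observation, observed_at, now=None, max_age_s=120.0):
    """Return (action, source); fall back to defaults when anything looks off."""
    now = time.time() if now is None else now
    # Stale sensor data: do not let the policy act on an outdated picture.
    if now - observed_at > max_age_s:
        return dict(SAFE_DEFAULTS), "fallback: stale data"
    try:
        proposed = policy(observation)
        values = {name: float(proposed[name]) for name in LIMITS}
    except Exception:
        return dict(SAFE_DEFAULTS), "fallback: policy error"
    if any(math.isnan(v) for v in values.values()):
        return dict(SAFE_DEFAULTS), "fallback: invalid action"
    # Clamp whatever the policy proposes to the hard limits.
    clamped = {
        name: min(max(values[name], low), high)
        for name, (low, high) in LIMITS.items()
    }
    return clamped, "policy"
```

Returning the source alongside the action lets monitoring count how often the fallback path is taken, which is a useful early signal that the policy or the data pipeline is misbehaving.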

References
