robert-s-lee / lit_elastic_trainer

run a big batch job in parallel

A Lightning App demo of hierarchical jobs with interactive control over parallelism. Lightning App makes dynamic research flows and elastic production pipelines possible.

Techniques demonstrated:

  • Flows within a Flow.
  • A Flow with a dynamic number of Works within it.
  • Works starting and shutting down dynamically.
  • A Work starts when there is something to run.
  • A Work terminates after a specified idle time to save cost.
  • A Work has an active terminal while running, for debugging.
  • The UI is dynamically generated.

The Demo Architecture

A Flow has two Flows within it.

graph TD
Flow1
Flow2
Flow1 -->|>3 iters completed| Flow3
Flow2 -->|>6 iters completed| Flow3
subgraph Start if Conditions are met
    Flow3
end
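The gating in the diagram can be sketched in plain Python. This is a minimal illustration, not the repository's code and not the Lightning API: `CountingFlow`, `RootFlow`, and `conditions_met` are hypothetical names, and the sketch assumes that both thresholds must be exceeded before Flow3 starts (the diagram does not say whether the two conditions combine with AND or OR).

```python
from dataclasses import dataclass


@dataclass
class CountingFlow:
    """Hypothetical stand-in for a child Flow that counts completed iterations."""
    name: str
    iters_completed: int = 0

    def step(self) -> None:
        # One call models one completed iteration.
        self.iters_completed += 1


class RootFlow:
    """Creates flow3 only after both upstream thresholds are exceeded (assumed AND)."""

    def __init__(self) -> None:
        self.flow1 = CountingFlow("Flow1")
        self.flow2 = CountingFlow("Flow2")
        self.flow3 = None  # created lazily once the conditions are met

    def conditions_met(self) -> bool:
        # Thresholds taken from the diagram: >3 iterations and >6 iterations.
        return self.flow1.iters_completed > 3 and self.flow2.iters_completed > 6

    def run(self) -> None:
        # Called repeatedly, like one pass of an event loop.
        self.flow1.step()
        self.flow2.step()
        if self.flow3 is None and self.conditions_met():
            self.flow3 = CountingFlow("Flow3")
        elif self.flow3 is not None:
            self.flow3.step()


root = RootFlow()
started_at = None
for tick in range(1, 11):
    root.run()
    if root.flow3 is not None and started_at is None:
        started_at = tick

print(f"Flow3 started on tick {started_at}")  # Flow3 started on tick 7
```

Creating `flow3` lazily keeps it from consuming resources until the upstream Flows have made enough progress.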

Each Flow has an elastic number of Works: the count scales out and scales in while the Flow runs.

graph TD
UI --> |Interactive Control During the Run| Run
Run --> Work0
Run --> Work1
Run --> WorkN
subgraph Number increase and decrease dynamically  
		Work0
		Work1
		WorkN
end
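One way to picture the scale-out and scale-in behaviour is a reconcile loop that compares the live workers with a target that can change mid-run. The sketch below is plain Python under that assumption; `ElasticRun`, `set_parallelism`, and `reconcile` are hypothetical names, and in the demo the target would come from the UI, not from a direct method call.

```python
class ElasticRun:
    """Keeps the set of live workers equal in size to a changeable target."""

    def __init__(self) -> None:
        self.workers = {}  # worker index -> status string
        self.target = 0

    def set_parallelism(self, n: int) -> None:
        # Stand-in for the interactive control; the value may change at any time.
        if n < 0:
            raise ValueError("parallelism must be non-negative")
        self.target = n

    def reconcile(self) -> None:
        # Scale out: start workers at the lowest free index until the target is met.
        while len(self.workers) < self.target:
            index = next(i for i in range(self.target) if i not in self.workers)
            self.workers[index] = "running"
        # Scale in: stop the highest-numbered workers first.
        while len(self.workers) > self.target:
            del self.workers[max(self.workers)]


run = ElasticRun()

run.set_parallelism(3)
run.reconcile()
scaled_out = sorted(run.workers)
print(scaled_out)  # [0, 1, 2]

run.set_parallelism(1)
run.reconcile()
scaled_in = sorted(run.workers)
print(scaled_in)  # [0]
```

Separating "set the target" from "reconcile" means the control can be changed at any moment, and the next pass of the loop converges on the new value.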

Screenshots

A Work has an active terminal while running, for debugging.

(screenshot: terminal)

A Work starts when there is something to run.

(screenshot: job control)

It terminates after a specified idle time to save cost.

(screenshot: terminate)
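The start-on-demand and idle-termination behaviour can be sketched as a small state machine. This is an illustration in plain Python, not the repository's implementation: `IdleAwareWork`, `submit`, `tick`, and the 30-second timeout are hypothetical, and the clock is passed in explicitly so the example is deterministic.

```python
from collections import deque


class IdleAwareWork:
    """Starts when a job arrives and stops after `idle_timeout` seconds without one."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self.queue = deque()
        self.running = False
        self.last_active = None  # time of the most recently processed job
        self.done = []

    def submit(self, job: str) -> None:
        self.queue.append(job)

    def tick(self, now: float) -> None:
        if self.queue:
            # Start on demand: there is something to run.
            self.running = True
            self.done.append(self.queue.popleft())
            self.last_active = now
        elif self.running and now - self.last_active >= self.idle_timeout:
            # Idle for too long: shut down to save cost.
            self.running = False


work = IdleAwareWork(idle_timeout=30.0)
states = []

work.tick(now=0.0)            # nothing queued, so the work never starts
states.append(work.running)

work.submit("job-a")
work.tick(now=10.0)           # a job arrived, so the work starts and runs it
states.append(work.running)

work.tick(now=39.0)           # idle for 29 s, still below the timeout
states.append(work.running)

work.tick(now=40.0)           # idle for 30 s, so the work terminates
states.append(work.running)

print(states)  # [False, True, True, False]
```

A later `submit` followed by `tick` starts the work again, which matches the "starts when there is something to run" behaviour described above.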

The UI is dynamically generated.

(screenshot: dynamic UI)

About

License:MIT License


Languages

Language:Python 100.0%