HSF / PyHEP.dev-workshops

PyHEP Developer workshops

Home Page: https://indico.cern.ch/e/PyHEP2023.dev


Best practices for collaborative ML R&D: How to structure frameworks and collaboration

klieret opened this issue

Examples of challenges/discussion points:

Technological aspects:

  1. How can we cut boilerplate and standardize interfaces so that people can focus on developing models without sacrificing "hackability"? PyTorch Lightning is a popular option for PyTorch, but IMO the way it is laid out by default has its own challenges (and might lead to duplicated code).
  2. How can we share results between collaborators and bring everyone "on the same page" (for example, using Weights & Biases)? A minimal sketch of both points follows this list.
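To make these two points concrete, here is a minimal sketch (my own illustration, not code from the issue): a LightningModule keeps the training-loop boilerplate out of the research code, and a shared dashboard attaches with one line. The architecture, learning rate, and project name are placeholders.

```python
# Hedged sketch only: the module layout, architecture, and project name are
# placeholders, not a recommendation from the discussion.
import torch
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger


class LitModel(pl.LightningModule):
    def __init__(self, lr: float = 1e-3):
        super().__init__()
        self.save_hyperparameters()  # hyperparameters are logged automatically
        self.net = torch.nn.Linear(28 * 28, 10)  # placeholder architecture

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.net(x.flatten(1)), y)
        self.log("train/loss", loss)  # forwarded to whatever logger is attached
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)


# Sharing results is then a one-line change on the Trainer: attach a
# Weights & Biases logger so every collaborator sees the same dashboard.
trainer = pl.Trainer(max_epochs=5, logger=WandbLogger(project="our-ml-rnd"))
# trainer.fit(LitModel(), train_dataloaders=...)  # dataloaders omitted here
```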

Social aspects:

  1. How can we make sure we move in the same direction without constraining ourselves? How do we keep everyone engaged in building a common framework and avoid people "branching off forever"?
  2. How do we balance technical software development work with model development? A lot of people want to focus on developing their model; few people want to work on framework issues. A good collaboration needs both.

I originally suggested this as a subtopic of #6 (doing open source). It also overlaps with #1 (packaging), #5 (fitting), and #19 (ML workflows for analysis). However, I think the challenges here are quite distinct, because this topic targets development and R&D rather than production use or integration with other tools (for example, backwards compatibility is less of an issue than allowing for creativity).

This has a large overlap in themes with #19, but a usefully different scope and different kinds of requirements!

Yes, I was thinking about this too, but the title of #19 led me to believe that it's mainly about MLOps and facilities (?).

User interfaces necessarily have to deal with collaboration and frameworks.

Live notes

ML R&D breakout session (Tuesday)

Present: Philip, Kilian, Richa, Raghav, Josue, Mike

Some of the questions that were discussed:

  • What frameworks do people use (Lightning & friends)?
    • PyTorch Lightning
    • MLflow might also cover some of what Lightning does
    • ONNX for plugging ML into other frameworks / model exchange (see the export sketch after this list)
  • Dashboards (wandb & friends)?
    • MLflow
    • Weights & Biases
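As a pointer for the ONNX item above, here is a minimal export sketch; the model and input shape are stand-ins of my own, not anything discussed in the session.

```python
# Hedged sketch of model exchange via ONNX: export a trained PyTorch model
# to a framework-neutral file that other tools can load.
import torch

model = torch.nn.Linear(28 * 28, 10)  # stand-in for a trained network
model.eval()

dummy_input = torch.randn(1, 28 * 28)  # example input that fixes the shape
torch.onnx.export(model, dummy_input, "model.onnx")

# The exported file can then be run outside PyTorch, e.g. with onnxruntime:
# import onnxruntime
# session = onnxruntime.InferenceSession("model.onnx")
# (out,) = session.run(None, {session.get_inputs()[0].name: dummy_input.numpy()})
```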

Projects that were mentioned:

Conclusions:

  • Dashboards (W&B / MLflow) are a good way to bring people "on the same page" and to compare, review, and debug performance.
  • Frameworks built around hooks and a plugin/callback structure are a good way to allow extensibility without growing "dinosaur classes". For example, Lightning hooks like on_validation_epoch_end let you write callbacks that do things at the end of an epoch rather than subclassing or modifying your model class; a sketch follows below.
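To illustrate that last point, here is a minimal sketch of such a callback, assuming PyTorch Lightning; the callback itself is hypothetical, but on_validation_epoch_end is a real Lightning hook.

```python
# Hedged sketch of the hook/callback pattern: behaviour is added via a
# Callback rather than by subclassing or editing the model class.
import pytorch_lightning as pl


class PrintValMetrics(pl.Callback):
    def on_validation_epoch_end(self, trainer, pl_module):
        # Runs after every validation epoch, without touching the model code.
        metrics = {k: float(v) for k, v in trainer.callback_metrics.items()}
        print(f"epoch {trainer.current_epoch}: {metrics}")


# Any number of such callbacks can be attached when building the Trainer:
trainer = pl.Trainer(callbacks=[PrintValMetrics()])
```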