Lightning-AI / litgpt

Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.

Home Page:https://lightning.ai

Test readme commands as part of the CI

rasbt opened this issue · comments

As suggested, let's add a CI workflow to test the code contained in the README.

I spent some time searching for a solution and unfortunately couldn't find one.
All I could find are tools that check the code blocks by running them. I'm not sure that this is what we want.

The best we can do, as I see it, is to write custom code that parses the markdown files, retrieves every line inside code blocks that starts with the `litgpt` command, and then asks jsonargparse to validate it (if that's even possible).
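A minimal sketch of the parsing half of that idea, using only the standard library. The function name `extract_litgpt_commands` is hypothetical, the regex only handles unindented triple-backtick fences, and the jsonargparse validation step is left out because it's unclear whether it is feasible:

```python
import re

# Matches an unindented fenced code block and captures its body.
# `{3} stands for the three backticks that open and close a fence.
FENCE_RE = re.compile(r"^`{3}[^\n]*\n(.*?)^`{3}", re.DOTALL | re.MULTILINE)


def extract_litgpt_commands(markdown_text: str) -> list[str]:
    """Return every `litgpt ...` command found inside fenced code blocks."""
    commands = []
    for block in FENCE_RE.findall(markdown_text):
        # Join shell line continuations (a trailing backslash) into one line.
        joined = re.sub(r"\s*\\\n\s*", " ", block)
        for line in joined.splitlines():
            line = line.strip()
            if line.startswith("litgpt "):
                commands.append(line)
    return commands
```

Calling it on the contents of `README.md` would give a list of command strings that a later step could validate or run. Text that mentions `litgpt` outside of a code block is ignored.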

It looks like there's this project, but I am not sure if it is feasible given that some commands are rather expensive and require GPUs: https://github.com/modal-labs/pytest-markdown-docs

I'll look into it

this is going to get over-engineered.

the intention here was to make sure the commands used in the readme are tested… not to test the literal readme file.

just write a test that runs the exact same commands in CI. this shouldn’t take more than 5 minutes of time. @Andrei-Aksionov @rasbt

> not to test the literal readme file.

I disagree. It would be nice to make sure that all the commands (which will be copy-pasted by newcomers) are correct, so that #1295 or anything like it won't happen.
But since there is no clean solution (like a GitHub Action) for it at the moment, I agree with

> this is going to get over-engineered.


We can put all the commands from the README file in a test and call it a day, especially if we use a really small model like pythia-14m and a limited number of train/val steps, but:

  1. should we use a GPU CI for it (it's already suffering)?
  2. should all the tutorials be tested or only the main one (for now)?
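For reference, a rough sketch of what such a test could look like, assuming a POSIX shell environment. `README_COMMANDS`, `run_command`, and `failed_commands` are hypothetical names, and the command list is deliberately left empty here: it would be filled with the commands copied from the README, adjusted to a tiny model and few steps.

```python
import shlex
import subprocess

# Hypothetical placeholder: paste the README commands here as plain strings.
README_COMMANDS: list[str] = []


def run_command(command: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run one command without a shell and capture its output."""
    return subprocess.run(
        shlex.split(command), capture_output=True, text=True, timeout=timeout
    )


def failed_commands(commands: list[str]) -> list[tuple[str, str]]:
    """Return (command, stderr) for every command that exits with a non-zero code."""
    failures = []
    for command in commands:
        result = run_command(command)
        if result.returncode != 0:
            failures.append((command, result.stderr))
    return failures


def test_readme_commands():
    # Collected by pytest; passes only if every listed command succeeds.
    assert failed_commands(README_COMMANDS) == []
```

Collecting all failures before asserting means a single CI run reports every broken command, not just the first one.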

this is about sequencing, not what is right or wrong. the right solution for the long term is obviously a robust one.

but we need to sequence our way there.

the main priority right now is to get the commands in the readme stable and tested.