transformerlab / transformerlab-app

Experiment with Large Language Models

Add epochs to mlx_lora_trainer

ai-made-approachable opened this issue

For most users it would be much easier to just configure epochs + batch size and have the number of iterations auto-calculated from the size of the training dataset.

GPT-4 solution:

Definitions

  • Total Number of Examples (N): The total number of examples in your training dataset.
  • Batch Size (B): The number of examples processed in one iteration (or step) of training.
  • Number of Epochs (E): The total number of times the training process will work through the entire dataset.
  • Total Number of Iterations (I): The total number of iterations (or steps) needed to complete the specified number of epochs.

Formula to Calculate Total Number of Iterations
The total number of iterations needed to complete the training can be calculated with the following formula:

$$I = \frac{N \times E}{B}$$

This formula assumes that N is evenly divisible by B. If it is not (i.e., there is a remainder), the actual number of iterations will be slightly higher, because the final batch is smaller than the specified batch size but still counts as a separate iteration.
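
A minimal sketch of how this auto-calculation might look in Python; the function name and signature are hypothetical and not part of mlx_lora_trainer:

```python
import math


def iterations_for_epochs(num_examples: int, batch_size: int, epochs: int) -> int:
    """Convert a requested number of epochs into a training-iteration count.

    Uses I = ceil(N * E / B), matching the formula above: a trailing partial
    batch still costs one full iteration. If the trainer never lets a batch
    cross an epoch boundary, use math.ceil(num_examples / batch_size) * epochs
    instead.
    """
    if num_examples <= 0 or batch_size <= 0 or epochs <= 0:
        raise ValueError("num_examples, batch_size and epochs must be positive")
    return math.ceil(num_examples * epochs / batch_size)
```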

Example Calculation
Suppose you have the following:

  • Total Number of Examples (N): 682
  • Batch Size (B): 32
  • Number of Epochs (E): 2

The calculation would be:

$$I = \frac{682 \times 2}{32} = 42.625$$

The calculation results in 42.625 total iterations, so you would need 43 iterations to complete 2 epochs, since you can't run a fraction of an iteration. The fractional part (.625) indicates that the final batch will be smaller than the specified batch size of 32. (If batches are not allowed to cross epoch boundaries, each epoch instead needs ⌈682 / 32⌉ = 22 iterations, or 44 in total.)

In other words, to complete 2 epochs with a batch size of 32 over a dataset of 682 examples, you would run 43 iterations, with the last iteration processing fewer examples so that the entire dataset is covered.
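
A quick sketch to double-check the arithmetic, using only the example numbers above:

```python
import math

# Worked example: N = 682, B = 32, E = 2
raw = 682 * 2 / 32                    # 42.625 "raw" iterations
rounded_total = math.ceil(raw)        # 43 iterations when the total is rounded up
per_epoch = math.ceil(682 / 32) * 2   # 44 iterations if each epoch is rounded up separately

print(raw, rounded_total, per_epoch)  # 42.625 43 44
```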