mlcommons / inference

Reference implementations of MLPerf™ inference benchmarks

Home Page: https://mlcommons.org/en/groups/inference

Reduce the dataset size for LLAMA2

arjunsuresh opened this issue · comments

I believe there is no need to run the LLAMA2-70B model on 24576 samples. GPT-J-6B is run on 13368 samples, Stable Diffusion on 5000 samples, 3D-UNet on 43 samples, and all other MLPerf inference models are far faster. In fact, LLAMA2-70B is at least an order of magnitude slower than any other MLPerf inference model. For upcoming rounds, we should reduce the LLAMA2-70B dataset size to a four-digit number.

Another idea would be to create a representative subset for performance runs, i.e. one with a sample distribution similar to the original dataset but only 1/10th of its size.
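
Something along these lines could build such a subset. This is only a rough sketch: the pickle file name and the `tok_input_length` column are placeholders for whatever fields the processed LLAMA2 dataset actually exposes.

```python
import numpy as np
import pandas as pd

def representative_subset(df, frac=0.1, length_col="tok_input_length",
                          n_bins=20, seed=42):
    """Sample a subset whose prompt-length distribution mirrors the full set."""
    rng = np.random.default_rng(seed)
    # Bucket samples by tokenized prompt length so each length range
    # keeps roughly its original share of the dataset.
    bins = pd.qcut(df[length_col], q=n_bins, duplicates="drop")
    picked = []
    for _, group in df.groupby(bins, observed=True):
        k = max(1, round(len(group) * frac))
        picked.append(group.sample(n=k, random_state=int(rng.integers(2**31))))
    return pd.concat(picked).sort_index()

# e.g. shrink the 24576-sample set to roughly 1/10th (paths are placeholders):
# subset = representative_subset(pd.read_pickle("open_orca_processed.pkl"))
# subset.to_pickle("open_orca_subset.pkl")
```

Stratifying by prompt length is just one option; any feature that drives per-sample latency (input length, expected output length) could be used for the bucketing.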

@psyhtest That's a good option. But in the LLAMA2 case the dataset is already a subset selection, so it might be easier to reduce the dataset size itself, though this means a new accuracy threshold.
If we use a different dataset for performance runs, TEST01 might get complicated when it is introduced for LLAMA2.