yegcjs / mixinglaws

mixinglaws

Code and data for "Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance".

Citation

@article{ye2024datamixinglaws,
  title={Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance},
  author={Ye, Jiasheng and Liu, Peiju and Sun, Tianxiang and Zhou, Yunhua and Zhan, Jun and Qiu, Xipeng},
  journal={arXiv preprint arXiv:2403.16952},
  year={2024}
}

Languages

Jupyter Notebook 99.7%, Python 0.3%, Shell 0.0%