This dataset contains the source code of about 3000 versions of 2415 malicious packages.
The files are organized as: package name -> version -> source code archive.
Example:
ython-binance -> 0.1 -> ython-binance-0.1.tar.gz
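The layout above can be indexed with a short script. The following is a minimal sketch, assuming the dataset has been downloaded to a local directory whose three levels mirror the package name -> version -> archive layout. The function name `iter_archives` and the directory name `dataset_root` are hypothetical and not part of the dataset. The script only lists files; it does not extract or run anything, which matters because the archives contain malicious code.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_archives(root: str) -> Iterator[Tuple[str, str, Path]]:
    """Yield (package, version, archive_path) for each file at root/package/version/.

    Assumes the on-disk layout matches the one described in this README.
    Nothing is extracted or executed; the archives are only listed.
    """
    for package_dir in sorted(Path(root).iterdir()):
        if not package_dir.is_dir():
            continue  # skip stray files at the top level
        for version_dir in sorted(package_dir.iterdir()):
            if not version_dir.is_dir():
                continue
            for archive in sorted(version_dir.iterdir()):
                if archive.is_file():
                    yield package_dir.name, version_dir.name, archive


if __name__ == "__main__":
    # "dataset_root" is a placeholder; point it at your local copy.
    root = Path("dataset_root")
    if root.is_dir():
        for package, version, archive in iter_archives(str(root)):
            print(f"{package} -> {version} -> {archive.name}")
```

If the archives need to be inspected, do so in an isolated environment such as a disposable virtual machine or container, and avoid installing them with pip.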
This dataset was produced for the ASE 2023 paper "An Empirical Study of Malicious Code In PyPI Ecosystem". If you use it, please cite:
@misc{guo2023empirical,
title={An Empirical Study of Malicious Code In PyPI Ecosystem},
author={Wenbo Guo and Zhengzi Xu and Chengwei Liu and Cheng Huang and Yong Fang and Yang Liu},
year={2023},
eprint={2309.11021},
archivePrefix={arXiv},
primaryClass={cs.SE}
}