zhangzhengde0225 / Xiwu

Xiwu: A Large Language Model for High Energy Physics

Home Page: https://ai.ihep.ac.cn



HEP·Xiwu LLM

This is the first LLM for HEP and the official implementation of Xiwu (溪悟): A Basis Flexible and Learnable LLM for High Energy Physics. The model is designed for strong performance in common-sense question answering, BOSS code generation, and physical logical reasoning.

Xi (溪): streamlet → drops of water; Wu (悟): understanding and gaining insight

Features

  • Xiwu, the first LLM specialized for high energy physics, outperforms its foundation model in accuracy on domain-specific knowledge question answering and exceeds GPT-4 in BOSS (BESIII Offline Software System) code generation.
  • Xiwu is a Level 2 model that can smoothly switch between foundation models such as LLaMA, Vicuna, ChatGLM and Grok-1.
  • Xiwu is equipped with two learning systems: the Just-In-Time Learning system, based on RAG, can acquire new knowledge instantly, and the On-The-Fly Training system, based on secondary pre-training and fine-tuning, can be used to enhance the model's performance on specific tasks.
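
The retrieval-augmented idea behind the Just-In-Time Learning system can be illustrated with a toy sketch. This is not Xiwu's implementation: the class name ToyKnowledgeBase, the helper build_prompt, and the word-overlap ranking are all assumptions made for illustration, and a real system would typically use embeddings and a vector store.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ToyKnowledgeBase:
    """Hypothetical in-memory store; a stand-in for a real vector database."""

    def __init__(self):
        self.documents = []

    def add(self, text):
        # New knowledge becomes retrievable immediately, with no retraining.
        self.documents.append(text)

    def retrieve(self, question, top_k=1):
        # Rank documents by how many words they share with the question.
        query = tokenize(question)
        scored = [(len(query & tokenize(doc)), doc) for doc in self.documents]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:top_k]]


def build_prompt(question, kb, top_k=1):
    """Prepend retrieved context to the question before it goes to the LLM."""
    context = "\n".join(kb.retrieve(question, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

Adding a document with kb.add(...) and then calling build_prompt(question, kb) yields a prompt that already contains the new text, which is the sense in which knowledge is acquired without touching the model weights.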

Quick Start

Install Dependencies

pip install -r requirements.txt

You can see the basic configurations in the configs.py and constant.py files.

Prepare Trained Weights

By default, the model weights are stored in the /data/<USERNAME>/weights directory. To change this, set the PRETRAINED_WEIGHTS_DIR constant in the constant.py file or the PRETRAINED_WEIGHTS_DIR environment variable.
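
As a rough illustration of how such a setting is commonly resolved, the sketch below checks the environment variable first and falls back to the documented default. The function name resolve_weights_dir and the precedence order are assumptions, not taken from constant.py.

```python
import getpass
import os
from pathlib import Path


def resolve_weights_dir(env=None, username=None):
    """Return the directory where model weights are expected to live.

    Hypothetical helper: the precedence (environment variable first, then the
    documented default) is an assumption, not read from constant.py.
    """
    env = os.environ if env is None else env
    override = env.get("PRETRAINED_WEIGHTS_DIR")
    if override:
        return Path(override).expanduser()
    # Fall back to the default layout described above: /data/<USERNAME>/weights
    username = username or getpass.getuser()
    return Path("/data") / username / "weights"
```

Calling resolve_weights_dir() with no arguments reads the real environment and the current user name; the optional parameters exist only to make the function easy to test.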

You can run ./prepare_weights.sh --list_all to see all available weights, and run the following command to download the trained weights:

./prepare_weights.sh --model lmsys/vicuna-7b-v1.5 

Deploy

Run CLI (Command Line Interface) to interact with the model

python run_cli.py \
  --model_path xiwu/xiwu-13b-16k-20240417 \
  --load_8bit False 

You can switch to any supported model. For more available arguments, run python run_cli.py -h. The assembler automatically searches for the model in the PRETRAINED_WEIGHTS_DIR directory.
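
A lookup of this kind might work roughly as follows. The function find_model_dir and its search order (an existing path is used as given, otherwise the name is resolved under the weights directory) are hypothetical and may differ from what the assembler actually does.

```python
from pathlib import Path


def find_model_dir(model_path, weights_dir):
    """Locate a model such as 'xiwu/xiwu-13b-16k-20240417'.

    Hypothetical lookup order: a path that already exists is used as given,
    otherwise the name is looked up under weights_dir.
    """
    direct = Path(model_path)
    if direct.is_dir():
        return direct
    candidate = Path(weights_dir) / model_path
    if candidate.is_dir():
        return candidate
    raise FileNotFoundError(
        f"{model_path!r} not found directly or under {weights_dir}"
    )
```

Raising FileNotFoundError early gives a clearer message than letting the model loader fail later on a missing file.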

Deploy a worker to host an API server

python run_worker.py \
  --model_path xiwu/xiwu-13b-16k-20240417

For more available arguments, you can run python run_worker.py -h.

After the worker has started, open a new terminal and access the model with the following script:

python request_api.py

Note that you should set base_url in the script to the address of the worker. The script also supports the streaming API.

Train on Custom Data to get a new model

bash scripts/train_xiwu.sh 

Performance Comparison

Figure: Comparison of GPT-4 and Xiwu in HEP Knowledge Q&A and BOSS Code Generation

Contributors

If you are interested in contributing to Xiwu, please refer to the Contributing Guidelines.

Currently, Xiwu is authored by Zhengde Zhang, Yiyu Zhang, Haodong Yao, Jianwen Luo, Rui Zhao, Bo Huang, Jiameng Zhao, Yipu Liao, Ke Li, Lina Zhao, Fazhi Qi and Changzheng Yuan.

It is maintained by Zhengde Zhang (zdzhang@ihep.ac.cn).

Acknowledgements

This work is supported by the Informatization Plan of the Chinese Academy of Sciences, Grant No. CAS-WX2022SF-0104, and the "From 0 to 1" Original Innovation Project of IHEP, Grant No. E3545PU2. We would like to express our gratitude to Beijiang Liu, Yaquan Fang, Gang Li, Wuming Luo, Ye Yuan, Shengsen Sun, Yi Jiao and others who are not listed here for engaging in beneficial discussions or providing computing resources.

We are very grateful to the LLaMA and FastChat projects for the foundation models.

Citation

@misc{zhang2024xiwu,
      title={Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics}, 
      author={Zhengde Zhang and Yiyu Zhang and Haodong Yao and Jianwen Luo and Rui Zhao and Bo Huang and Jiameng Zhao and Yipu Liao and Ke Li and Lina Zhao and Fazhi Qi and Changzheng Yuan},
      year={2024},
      eprint={2404.08001},
      archivePrefix={arXiv},
      primaryClass={hep-ph}
}

License

This project is licensed under the terms of the CC BY-NC-SA 4.0 license.
