romannamor9 / LLaMADemo

πŸŽ‰LLaMA Demo 7BπŸŽ‰


LLaMA demo of the 7B model

  • 🎟Model from: llama
  • πŸ€„Code from: pyllama
  • πŸ“ŒFAQ: FAQ

Runtime environment

  • Single GPU with 16 GB of VRAM
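The 16 GB figure lines up with a back-of-envelope estimate for the 7B checkpoint in half precision. This is a rough sketch only; actual usage also depends on activations, the KV cache, and framework overhead:

```python
# Rough memory estimate for loading LLaMA 7B weights in fp16.
# These are approximations, not measured values.
n_params = 7e9          # ~7 billion parameters
bytes_per_param = 2     # fp16 / bf16

weights_gb = n_params * bytes_per_param / 2**30
print(f"~{weights_gb:.1f} GiB for the weights alone")
```

At roughly 13 GiB for the weights, a 16 GB card leaves only a few gigabytes of headroom, which is why a single-GPU setup of this size is listed as the minimum.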

Use

  1. Download the pretrained model.

    • BitTorrent link: magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA
    • The final directory:
      .
      β”œβ”€β”€ inference.py
      β”œβ”€β”€ llama
      β”‚   β”œβ”€β”€ generation.py
      β”‚   β”œβ”€β”€ __init__.py
      β”‚   β”œβ”€β”€ model_parallel.py
      β”‚   β”œβ”€β”€ model_single.py
      β”‚   └── tokenizer.py
      β”œβ”€β”€ LLaMA
      β”‚   β”œβ”€β”€ 7B
      β”‚   β”‚   β”œβ”€β”€ checklist.chk
      β”‚   β”‚   β”œβ”€β”€ consolidated.00.pth
      β”‚   β”‚   └── params.json
      β”‚   β”œβ”€β”€ tokenizer_checklist.chk
      β”‚   └── tokenizer.model
      β”œβ”€β”€ requirements.txt
      └── webapp_single.py
      
  2. Install the related packages.

    pip install -r requirements.txt
  3. Run

    • Inference by scripts
      python inference.py
    • Run by Gradio UI
      python webapp_single.py --ckpt_dir LLaMA/7B \
                              --tokenizer_path LLaMA/tokenizer.model \
                              --server_name 127.0.0.1 \
                              --server_port 7860
  4. Gradio Result

    • Open http://127.0.0.1:7860 in your browser to try the demo.
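For reference, the launch flags above can be modeled with argparse. This is a hypothetical sketch of how webapp_single.py might read its arguments (the defaults shown here are assumptions, with 7860 matching Gradio's default port):

```python
import argparse

# Hypothetical argument parser mirroring the flags used in the
# launch command above; defaults are assumptions, not taken from
# the actual webapp_single.py source.
def build_parser():
    p = argparse.ArgumentParser(description="LLaMA 7B Gradio demo")
    p.add_argument("--ckpt_dir", default="LLaMA/7B",
                   help="directory containing consolidated.00.pth and params.json")
    p.add_argument("--tokenizer_path", default="LLaMA/tokenizer.model",
                   help="path to the SentencePiece tokenizer model")
    p.add_argument("--server_name", default="127.0.0.1",
                   help="interface to bind the web server to")
    p.add_argument("--server_port", type=int, default=7860,
                   help="port for the Gradio UI (7860 is Gradio's default)")
    return p

args = build_parser().parse_args([
    "--ckpt_dir", "LLaMA/7B",
    "--tokenizer_path", "LLaMA/tokenizer.model",
    "--server_name", "127.0.0.1",
    "--server_port", "7860",
])
print(args.server_name, args.server_port)
```

Binding to 127.0.0.1 keeps the demo local; to reach it from another machine you would typically bind to 0.0.0.0 instead.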

About

License: GNU General Public License v3.0

