Lightning-AI / lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

How to do conversation with fine tuned model?

Harsh-raj opened this issue · comments

I am able to fine-tune and run inference with the LLaMA model using the code in this repository. I'd like to know how to hold a conversation with the fine-tuned model so that it remembers the context of previous queries and their responses. I also want to maintain a context window so that the model can discard older conversation turns to make room for newer prompts.
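One common approach (not specific to this repository) is to keep a rolling list of conversation turns, rebuild the prompt from that list on each query, and drop the oldest turns whenever the prompt would exceed a token budget. The sketch below is a minimal illustration under stated assumptions: `count_tokens` is a crude whitespace stand-in for a real tokenizer (lit-llama uses a SentencePiece tokenizer), and the prompt format and role labels are hypothetical, not the repository's actual API.

```python
# Minimal sketch of a sliding conversation window.
# Assumptions (not from lit-llama): whitespace token counting,
# "Role: text" prompt format, and a fixed max_tokens budget.

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer; replace with the model's
    # tokenizer (e.g. SentencePiece) for accurate budgeting.
    return len(text.split())

def build_prompt(history, user_msg, max_tokens=2048):
    """Concatenate past turns plus the new message, dropping the
    oldest turns first until the prompt fits within max_tokens.
    Returns the prompt string and the turns that were kept."""
    turns = history + [("User", user_msg)]
    while len(turns) > 1:
        prompt = "\n".join(f"{role}: {text}" for role, text in turns) + "\nAssistant:"
        if count_tokens(prompt) <= max_tokens:
            return prompt, turns
        turns = turns[1:]  # discard the oldest turn
    prompt = "\n".join(f"{role}: {text}" for role, text in turns) + "\nAssistant:"
    return prompt, turns

# Each generated reply would then be appended back onto the history:
history = [("User", "hello there"), ("Assistant", "hi, how can I help?")]
prompt, kept = build_prompt(history, "summarize our chat", max_tokens=12)
# With this small budget the earliest turn ("hello there") is dropped.
```

After each model response, you would append `("Assistant", response)` to the history so the next prompt carries the updated context; the window then trims itself automatically on the following query.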