(Experimental) Using Llama2 with Haystack
This notebook contains my hacky experiments with loading and using Llama 2 in Haystack, the NLP/LLM framework.
It's nothing official or well refined, but it may be useful to other people experimenting.
- Installed Transformers from the main branch (and other libraries)
- Loaded Llama-2-13b-chat-hf on Colab using 4-bit quantization, thanks to the great material shared by Younes Belkada
- Disabled Tensor Parallelism, which caused some issues
- Installed a minimal version of Haystack
- Found a hacky way to load the model in Haystack's PromptNode
- Had a llama-zing chat session, from David Guetta to Don Matteo (an Italian TV series)!
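The installation steps above might look roughly like this. This is a sketch: the notebook only says "Transformers from the main branch", "other libraries", and "a minimal version of Haystack", so the exact package list (`accelerate`, `bitsandbytes`, `farm-haystack`) is an assumption based on what 4-bit loading and Haystack v1 typically require.

```shell
# Transformers from the main branch (Llama 2 support had not yet shipped in a release)
pip install git+https://github.com/huggingface/transformers.git

# Libraries assumed for 4-bit quantization on Colab,
# following the bitsandbytes material referenced above
pip install accelerate bitsandbytes

# Minimal Haystack (v1) install
pip install farm-haystack
```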
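The loading steps could be sketched as follows. Assumptions not confirmed by the notebook: a Colab GPU runtime, access to the gated `meta-llama/Llama-2-13b-chat-hf` checkpoint, NF4 as the 4-bit quantization type, and that Haystack v1's `PromptNode` accepts a preloaded model and tokenizer through `model_kwargs` — the notebook's actual hack may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-13b-chat-hf"  # gated model, requires HF access

# 4-bit quantization config (NF4 assumed), following the bitsandbytes material
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # lets accelerate place layers on the Colab GPU
)

# The hacky part: hand the already-quantized model to Haystack's PromptNode
# via model_kwargs (assumed interface; Haystack v1).
from haystack.nodes import PromptNode

prompt_node = PromptNode(
    model_name_or_path=model_id,
    model_kwargs={"model": model, "tokenizer": tokenizer},
    max_length=512,
)

print(prompt_node("Who is David Guetta?"))
```

Passing the model object directly avoids Haystack re-loading (and re-quantizing) the weights itself, which is what makes this workable within Colab's memory limits.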