Welcome to our project, where we fine-tune two open language models, Falcon-7B and Llama 2, to become proficient in Indian law.
We started with a modest set of 150 Q&A pairs on Indian law and have since grown it into a dataset of 3,300 instructions. This AI legal project combines:
- Falcon-7B & Llama 2: The base language models we fine-tune on legal text.
- PEFT & QLoRA: Parameter-efficient fine-tuning with LoRA adapters on a 4-bit quantized base model, which keeps memory use low.
- Our Dataset: Indian law knowledge spanning constitutional law, civil rights, and more.
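
To give a rough sense of why PEFT and QLoRA matter for a 7B-parameter model, here is a back-of-the-envelope sketch. It only counts weight storage and LoRA adapter size; it ignores activations, optimizer state, and quantization overhead. The 4096 x 4096 layer shape and rank 16 are illustrative assumptions, not settings taken from this project.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to store the model weights, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA replaces the update to a d_out x d_in matrix with two low-rank
    factors of shapes (d_out x rank) and (rank x d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    params_7b = 7e9  # nominal parameter count for a "7B" model

    # Weight storage at 16-bit vs 4-bit precision.
    print(f"16-bit weights: {weight_memory_gb(params_7b, 16):.1f} GB")
    print(f" 4-bit weights: {weight_memory_gb(params_7b, 4):.1f} GB")

    # Hypothetical layer shape and rank, chosen only for illustration.
    d_in, d_out, rank = 4096, 4096, 16
    full = d_in * d_out
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"Full matrix: {full} params, LoRA adapter: {lora} params "
          f"({100 * lora / full:.2f}% of the full matrix)")
```

The point of the sketch is the ratio, not the exact figures: quantizing the frozen base model shrinks weight storage by about 4x relative to 16-bit, and the adapters that actually receive gradients are a small fraction of each layer.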
![Dataset Creation (3)](https://private-user-images.githubusercontent.com/22457544/263538667-75d1b1a6-d467-4b94-b0b6-c09dde758a86.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE2MTg1MTksIm5iZiI6MTcyMTYxODIxOSwicGF0aCI6Ii8yMjQ1NzU0NC8yNjM1Mzg2NjctNzVkMWIxYTYtZDQ2Ny00Yjk0LWIwYjYtYzA5ZGRlNzU4YTg2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzIyVDAzMTY1OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTNjM2RhZmFkMzY0OWM4NTk3Njk4NjlkOGI4ZDJkNjY4ODBjYjk3ZmRmZjVhMjc4MjVhODQ5ZGY3NWVmYjk3YmQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.qoxuVkVyIFteS8ctBqAdp50-xM24MxFegQquCBGwV1M)
Each record in our dataset has four fields: `instruction`, `input`, `output`, and `prompt`. Together they are crafted to shape our models into AI law experts!
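
As a minimal sketch of how the four fields might fit together, the snippet below builds a `prompt` string from `instruction`, `input`, and `output`. The Alpaca-style template with `###` headers is an assumption for illustration, as are the `build_prompt` helper and the sample record; the actual `prompt` column in the published datasets may use a different layout.

```python
def build_prompt(record: dict) -> str:
    """Join instruction, optional input, and output into one training string.

    The section headers below are an assumed template, not necessarily the
    one used to produce the dataset's own `prompt` field.
    """
    parts = [f"### Instruction:\n{record['instruction']}"]
    if record.get("input"):  # skip the input section when it is empty
        parts.append(f"### Input:\n{record['input']}")
    parts.append(f"### Response:\n{record['output']}")
    return "\n\n".join(parts)


# A made-up record for illustration; it is not copied from the dataset.
example = {
    "instruction": "Explain what Article 21 of the Constitution of India protects.",
    "input": "",
    "output": "Article 21 protects the right to life and personal liberty.",
}
example["prompt"] = build_prompt(example)

if __name__ == "__main__":
    print(example["prompt"])
```

Keeping `input` optional lets the same template serve both plain questions and questions that come with extra context, such as the text of a provision.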
Datasets on Hugging Face:
- https://huggingface.co/datasets/nisaar/Constitution_Of_India_Instruction_Set
- https://huggingface.co/datasets/nisaar/Articles_Constitution_3300_Instruction_Set
- https://huggingface.co/datasets/nisaar/LLAMA2_Legal_Dataset_4.4k_Instructions
![Fine Tuning](https://private-user-images.githubusercontent.com/22457544/263591270-d08c4d51-0e5e-4cff-9fec-7d7a53959143.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE2MTg1MTksIm5iZiI6MTcyMTYxODIxOSwicGF0aCI6Ii8yMjQ1NzU0NC8yNjM1OTEyNzAtZDA4YzRkNTEtMGU1ZS00Y2ZmLTlmZWMtN2Q3YTUzOTU5MTQzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzIyVDAzMTY1OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTEzOGU3ODk2NjA5ZjA0ZDQ4YWEyODBiMzRmOGE4OWQ3NTlhZGI2ZGRkNmQ2MDIzOGRmZGM0YWIwZDRiYzk4ZTEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.d2tEELVHjfEHJZLrrLxvBms9oEm6YJ0UwnwFk-jBarI)
Get a front-row seat to the training progress with TensorBoard. Launch it, open the localhost link it prints, and watch the models learn: