johnnyknoxville1337 / legal-ease

Your personal legal assistant

A suite of NLP tools to simplify legal documents

What do employment agreements, contracts, lease documents, patents and licenses have in common?

Apart from the fact that they're all legal documents:

  1. They tend to use complex sentence structures and vocabulary that are inaccessible to people familiar only with conversational English
  2. They are difficult to comprehend for non-native speakers of the language they are written in
  3. They can run to tens of pages, if not more

Legal-ease addresses these issues using three tools:

  1. QnA over legal documents: Paste in your document and ask questions about it. Useful whether your questions concern the document as a whole or a specific clause.

  2. Document summarization: Generate a summary of the document. Options include changing the length of the summary (small, medium or large) and a choice between paragraphs or bullets.

  3. Multilingual & cross-lingual document search: Perform cross-lingual semantic search over a collection of legal documents. This is currently a showcase feature: the user can run keyword or semantic search over a collection of COVID-19 pandemic legislative documents, and the top 3 matching documents are returned. A translation option is also included (currently English-only).
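
To illustrate what "returns the top-3 document matches" means for semantic search, here is a minimal sketch in plain Python. It is not the project's implementation: the app obtains embeddings from Cohere and delegates nearest-neighbour search to Qdrant, whereas this sketch ranks hand-made toy vectors by cosine similarity. The document names, the vectors and the `top_k` helper are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the k (doc_id, score) pairs most similar to the query, best first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (hypothetical; real embeddings have
# hundreds of dimensions and come from an embedding model).
docs = {
    "lockdown_order": [0.9, 0.1, 0.0],
    "vaccine_mandate": [0.1, 0.9, 0.0],
    "rent_relief": [0.0, 0.2, 0.9],
    "travel_ban": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

for doc_id, score in top_k(query, docs):
    print(f"{doc_id}: {score:.3f}")
```

Because the query and the documents are embedded into the same vector space, the same ranking step works even when the query and the documents are in different languages, provided the embedding model is multilingual.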

Installation:

  1. Create a free-tier Cohere account and set the COHERE_API_KEY environment variable.

  2. Create a free-tier Qdrant cluster and set the QDRANT_API_KEY and QDRANT_HOST environment variables.

  3. Install the requirements into a new conda environment:

cd <project_dir>

conda create -n legal-ease --file requirements.txt

conda activate legal-ease
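
Before launching the app, it can help to confirm that the three environment variables from the steps above are set. The following is a small optional sketch, not part of the repository; the `missing_env_vars` helper and the `REQUIRED_ENV_VARS` constant are hypothetical names introduced here for illustration.

```python
import os

# Variable names taken from the installation steps above.
REQUIRED_ENV_VARS = ("COHERE_API_KEY", "QDRANT_API_KEY", "QDRANT_HOST")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; passing a dict makes the
    check easy to try out without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```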

Usage:

In the project dir, run:

python gradio_demo.py

To run the app in reload mode:

gradio gradio_demo.py

The app is typically served at http://localhost:7860

[Screenshot: Legal-ease app]

Tools & Technologies used:

  1. Cohere: Cohere trains large language models and offers API access to them, making it possible to add language processing to any system. Legal-ease uses Cohere's multilingual-22-12 model for multilingual embeddings, the summarize-xlarge model for summarization, and command-xlarge-nightly for question answering.

  2. Qdrant: Qdrant is a vector similarity engine and vector database. It comes with an API service for semantic search, that is, searching for the nearest high-dimensional vectors.

  3. LangChain: An open-source library that provides abstractions for building LLM-based applications.

  4. Gradio: The frontend of the application is built using Gradio.

  5. HF Spaces: Hugging Face Spaces offers deployment support for ML applications. Here is the link to our space

License: GNU Affero General Public License v3.0


Languages: Jupyter Notebook 81.5%, Python 18.5%