Shashank's starred repositories
powerlevel10k
A Zsh theme
text-generation-webui
A Gradio web UI for Large Language Models.
airflow-maintenance-dags
A series of DAGs/Workflows to help maintain the operation of Airflow
awesome-ml
Curated list of useful LLM / Analytics / Data Science resources
distilabel
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
zotero-deb
Packaged versions of Zotero and Juris-M for Debian-based systems
OpenContracts
Mass document analytics platform based on LlamaIndex, Pgvector, React and Django.
label-studio-ml-backend
Configs and boilerplates for Label Studio's Machine Learning backend
SpanMarkerNER
SpanMarker for Named Entity Recognition
python-gatenlp
Python text processing, pattern matching, and NLP framework
GremlinServer
A low-code microservices platform designed for legal engineers. Given a document, Gremlin will apply a series of Python scripts to it and return transformed documents and/or extracted data. Use with GremlinGUI for an open-source, modern, React-based low-code experience (https://github.com/JSv4/GremlinGUI)
PDFSegmenter
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
measuring-language-complexity
Kolmogorov complexity, language complexity, compression
DMLPlayground
This repository contains code for training and evaluating various Deep Metric Learning (DML) algorithms on the CUB200-2011, Cars196 and SOP datasets.
active-learning-in-ehealth
Active Learning for Named Entity Recognition on eHealth Corpus