Clément Doumouro (ClemDoum)

Company: @snipsco

Location: France

Clément Doumouro's starred repositories

eng-practices

Google's Engineering Practices documentation

License: NOASSERTION | Stargazers: 19807 | Issues: 0

parserator

🔖 A toolkit for making domain-specific probabilistic parsers

Language: Python | License: MIT | Stargazers: 790 | Issues: 0

jellyfish

🪼 a python library for doing approximate and phonetic matching of strings.

Language: Jupyter Notebook | License: MIT | Stargazers: 2003 | Issues: 0

zentity

Entity resolution for Elasticsearch.

Language: Java | License: Apache-2.0 | Stargazers: 154 | Issues: 0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1060 | Issues: 0

recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

Language: Python | License: BSD-3-Clause | Stargazers: 917 | Issues: 0

llm-movieagent

Semantic layer on top of a graph database to provide an LLM with a set of robust tools to interact with the database

Language: Python | License: MIT | Stargazers: 167 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 404 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 122 | Issues: 0

unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML | License: Apache-2.0 | Stargazers: 7107 | Issues: 0

instructor

Structured outputs for LLMs

Language: Python | License: MIT | Stargazers: 6058 | Issues: 0

zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Language: Java | License: AGPL-3.0 | Stargazers: 906 | Issues: 0
Language: Python | License: MIT | Stargazers: 481 | Issues: 0

unoconv

Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.

Language: Python | License: GPL-2.0 | Stargazers: 2532 | Issues: 0

nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.

Language: Jupyter Notebook | License: MIT | Stargazers: 243 | Issues: 0

Awesome-Prompt-Engineering

This repository contains hand-curated resources for Prompt Engineering, with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

Language: Python | License: Apache-2.0 | Stargazers: 3380 | Issues: 0

nlm-ingestor

This repo provides the server-side code for the llmsherpa API to connect to. It includes parsers for various file formats.

Language: Python | License: Apache-2.0 | Stargazers: 873 | Issues: 0

llmsherpa

Developer APIs to Accelerate LLM Projects

Language: Jupyter Notebook | License: MIT | Stargazers: 1075 | Issues: 0

Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

Language: JavaScript | License: Apache-2.0 | Stargazers: 5683 | Issues: 0

rome

Locating and editing factual associations in GPT (NeurIPS 2022)

Language: Python | License: MIT | Stargazers: 510 | Issues: 0

textract-cli

CLI for running files through AWS Textract

Language: Python | License: Apache-2.0 | Stargazers: 50 | Issues: 0

SubMusic

Sync music and podcasts to your Garmin watch from your own SubSonic or Ampache server

Language: Monkey C | License: GPL-3.0 | Stargazers: 108 | Issues: 0

llm-graph-builder

Neo4j graph construction from unstructured data

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 221 | Issues: 0

GLiNER

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024

Language: Python | License: Apache-2.0 | Stargazers: 771 | Issues: 0

fairlearn

A Python package to assess and improve fairness of machine learning models.

Language: Python | License: MIT | Stargazers: 1832 | Issues: 0

git-filter-repo

Quickly rewrite git repository history (filter-branch replacement)

Language: Python | License: NOASSERTION | Stargazers: 7622 | Issues: 0

llama_parse

Parse files for optimal RAG

Language: Python | License: MIT | Stargazers: 1462 | Issues: 0

nicar_ocr

A tutorial on optical character recognition using tesseract, ImageMagick and other open source tools

Language: Jupyter Notebook | License: MIT | Stargazers: 66 | Issues: 0

hatchet

A distributed, fault-tolerant task queue

Language: Go | License: MIT | Stargazers: 3377 | Issues: 0

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 10891 | Issues: 0