jg-bernard's starred repositories
awesome-web-archiving
An Awesome List for getting started with web archiving
open-parse
Improved file parsing for LLMs
gpt-crawler
Crawl a site to generate knowledge files to create your own custom GPT from a URL
unstructured
Open-source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning.
gpt-researcher
GPT-based autonomous agent that does comprehensive online research on any given topic
stringdist
String distance functions for R
anything-llm
The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.
fabricator
[EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.
Giveme5W1H
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
tweetnlp
TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/understand tweets such as sentiment analysis, emoji prediction, and named entity recognition, powered by state-of-the-art language models specialised on Twitter.
huggingfaceR
Hugging Face state-of-the-art models in R
annotorious
Add image annotation functionality to any web page with a few lines of JavaScript.
recogito-js
A JavaScript library for text annotation