jg-bernard's starred repositories

langchain

🦜🔗 Build context-aware reasoning applications

Language: Python | License: MIT | Stargazers: 88217 | Issues: 666 | Issues: 7048

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 63786 | Issues: 531 | Issues: 0

tesseract

Tesseract Open Source OCR Engine (main repository)

Language: C++ | License: Apache-2.0 | Stargazers: 59384 | Issues: 1682 | Issues: 2615

CyberChef

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

Language: JavaScript | License: Apache-2.0 | Stargazers: 27141 | Issues: 384 | Issues: 938

EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Language: Python | License: Apache-2.0 | Stargazers: 22720 | Issues: 308 | Issues: 963

zincsearch

ZincSearch. A lightweight alternative to Elasticsearch that requires minimal resources, written in Go.

Language: Go | License: NOASSERTION | Stargazers: 16674 | Issues: 157 | Issues: 291

gallery-dl

Command-line program to download image galleries and collections from several image hosting sites

Language: Python | License: GPL-2.0 | Stargazers: 10680 | Issues: 140 | Issues: 4672

awesome-prompts

Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protection. Advanced Prompt Engineering papers.

archiver

Easily create & extract archives, and compress & decompress files of various formats

pyload

The free and open-source Download Manager written in pure Python

Language: Python | License: NOASSERTION | Stargazers: 3214 | Issues: 133 | Issues: 3169

twitter-archive-parser

Python code to parse a Twitter archive and output in various ways

Language: Python | License: GPL-3.0 | Stargazers: 2393 | Issues: 34 | Issues: 82

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Language: Java | License: Apache-2.0 | Stargazers: 1747 | Issues: 140 | Issues: 1559

diskover-community

Diskover Community Edition - Open source file indexer, file search engine and data management and analytics powered by Elasticsearch

Language: PHP | License: Apache-2.0 | Stargazers: 1422 | Issues: 54 | Issues: 86

keras-ocr

A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.

Language: Python | License: MIT | Stargazers: 1356 | Issues: 51 | Issues: 209

unavatar

Get unified user avatar from social networks, including Instagram, SoundCloud, Telegram, Twitter, YouTube & more.

Language: JavaScript | License: MIT | Stargazers: 1143 | Issues: 8 | Issues: 49

tick

Module for statistical learning, with a particular emphasis on time-dependent modelling

Language: Python | License: BSD-3-Clause | Stargazers: 478 | Issues: 36 | Issues: 244

cc-pyspark

Process Common Crawl data with Python and Spark

Language: Python | License: MIT | Stargazers: 393 | Issues: 21 | Issues: 25

LlamaGPTJ-chat

Simple chat program for LLaMa, GPT-J, and MPT models.

Language: C++ | License: MIT | Stargazers: 212 | Issues: 7 | Issues: 23

Language: R | License: NOASSERTION | Stargazers: 176 | Issues: 7 | Issues: 34

TwiBot-22

Official repository of TwiBot-22 @ NeurIPS 2022, Datasets and Benchmarks Track.

Language: Python | License: MIT | Stargazers: 141 | Issues: 6 | Issues: 39

cc-index-table

Index Common Crawl archives in tabular format

Language: Java | License: Apache-2.0 | Stargazers: 99 | Issues: 13 | Issues: 22

askgpt

A chat interface built on top of OpenAI's API endpoints

Language: R | License: GPL-3.0 | Stargazers: 53 | Issues: 3 | Issues: 7

RedditHarbor

Ethical, legal, and effortless extraction of Reddit data in your database

Language: Python | License: MIT | Stargazers: 42 | Issues: 2 | Issues: 10

CooRTweet

CooRTweet: Coordinated Networks Detection on Social Media | Detects a variety of coordinated actions on social media and outputs the network of coordinated users along with related information.

Language: R | License: NOASSERTION | Stargazers: 31 | Issues: 4 | Issues: 19

hawkesbook

[Python Package] Code from 'The Elements of Hawkes Processes' Book

Language: Python | License: MIT | Stargazers: 23 | Issues: 3 | Issues: 1

memespector

A simple script for using Google's Vision API that will possibly develop into an actual tool.

Language: PHP | License: Unlicense | Stargazers: 13 | Issues: 4 | Issues: 0

reddit-tools

A bunch of scripts for investigating Reddit

Language: PHP | License: Unlicense | Stargazers: 11 | Issues: 0 | Issues: 0

sdk-python-zincsearch

Python SDK Client for ZincSearch

Language: Python | License: Apache-2.0 | Stargazers: 9 | Issues: 3 | Issues: 0

Covid-Misinformation-Dataset

Dataset of YouTube videos with Covid-related misinformation

Language: Python | License: NOASSERTION | Stargazers: 4 | Issues: 1 | Issues: 0