jdphilius's starred repositories
gpt-code-ui
An open-source implementation of OpenAI's ChatGPT Code Interpreter
TextAttack
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
High-Frequency-Trading-Model-with-IB
A high-frequency trading model in Python using the Interactive Brokers API, with pairs trading and mean reversion
scattertext
Beautiful visualizations of how language differs among document types.
ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, the latter comprising 330 million English tweets).
codequestion
🔎 Semantic search for developers
googlesearch
A Python library for scraping the Google search engine.
docTTTTTquery
docTTTTTquery document expansion model
sentence-splitter
Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.
doppel-bot
Train a language model to answer Slack messages as you.
relevanceai
Home of the AI workforce - Multi-agent system, AI agents & tools
sentence-doctor
Many natural language processing tasks rely on sentence boundary detection (SBD). Although libraries like spaCy provide state-of-the-art SBD, they often depend on text extractors (e.g., PDF text extractors or OCR). The quality of these extractors greatly influences the quality of SBD libraries and, as a consequence, the performance of downstream models. To help address this problem, we fine-tuned a T5 model from the Hugging Face Hub that attempts to reconstruct "broken sentences".