s20ss's starred repositories

awesome-chatgpt-prompts

This repo includes ChatGPT prompt curation to use ChatGPT better.

Language: HTML | License: CC0-1.0 | Stargazers: 107636 | Issues: 0

graphql-engine

Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.

Language: TypeScript | License: Apache-2.0 | Stargazers: 30981 | Issues: 0

scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

Language: Python | License: Apache-2.0 | Stargazers: 1657 | Issues: 0

free-programming-books

📚 Freely available programming books

License: CC-BY-4.0 | Stargazers: 330162 | Issues: 0

Semantic-UI

Semantic is a UI component framework based around useful principles from natural language.

Language: JavaScript | License: MIT | Stargazers: 51078 | Issues: 0

whoosh

Pure-Python full-text search library

Language: Python | License: NOASSERTION | Stargazers: 559 | Issues: 0

FlexNeuART

Flexible classic and NeurAl Retrieval Toolkit

Language: Java | License: Apache-2.0 | Stargazers: 212 | Issues: 0

duobert

Multi-stage passage ranking: monoBERT + duoBERT

Language: Python | Stargazers: 112 | Issues: 0

hello-open-source

Your milestone to say Hello to the world of Open Source!

Language: C++ | License: MIT | Stargazers: 18 | Issues: 0

sentence-doctor

Many natural language processing tasks rely on sentence boundary detection (SBD). Although excellent libraries like spaCy provide state-of-the-art SBD, they often depend on text extractors (e.g. PDF text extractors or OCR). The quality of these extractors greatly influences the quality of SBD libraries and, as a consequence, the performance of downstream models. To help address this problem, we fine-tuned a T5 model from the Hugging Face Hub that attempts to reconstruct “broken sentences”.

Language: Python | Stargazers: 60 | Issues: 0

textsplit

Segment documents into coherent parts using word embeddings.

Language: Jupyter Notebook | License: MIT | Stargazers: 145 | Issues: 0

pdf2docx

Open source Python library for converting PDF to DOCX.

Language: Python | License: AGPL-3.0 | Stargazers: 2371 | Issues: 0

colabcode

Run VSCode (codeserver) on Google Colab or Kaggle Notebooks

Language: Python | License: MIT | Stargazers: 2056 | Issues: 0

Questgen.ai

Question generation using state-of-the-art Natural Language Processing algorithms

Language: Python | License: MIT | Stargazers: 885 | Issues: 0

Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.

Language: Python | License: BSD-3-Clause | Stargazers: 2903 | Issues: 0

vscode-debug-visualizer

An extension for VS Code that visualizes data during debugging.

Language: TypeScript | License: GPL-3.0 | Stargazers: 7851 | Issues: 0

language

Shared repository for open-sourced projects from the Google AI Language team.

Language: Python | License: Apache-2.0 | Stargazers: 1588 | Issues: 0

unifiedqa

UnifiedQA: Crossing Format Boundaries With a Single QA System

Language: Python | License: Apache-2.0 | Stargazers: 427 | Issues: 0

textaugment

TextAugment: Text Augmentation Library

Language: Python | License: MIT | Stargazers: 387 | Issues: 0

sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

Language: Python | License: Apache-2.0 | Stargazers: 14554 | Issues: 0

txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Language: Python | License: Apache-2.0 | Stargazers: 8191 | Issues: 0

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python | License: Apache-2.0 | Stargazers: 18815 | Issues: 0

nboost

NBoost is a scalable, search-API-boosting platform for deploying transformer models to improve the relevance of search results on different platforms (e.g. Elasticsearch).

Language: Python | License: Apache-2.0 | Stargazers: 674 | Issues: 0

pygaggle

A gaggle of deep neural architectures for text ranking and question answering, designed for Pyserini

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 332 | Issues: 0

onnxt5

Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.

Language: Python | License: Apache-2.0 | Stargazers: 248 | Issues: 0

ClariQ

ClariQ: SCAI Workshop data challenge on conversational search clarification.

Language: Jupyter Notebook | Stargazers: 128 | Issues: 0

texthero

Text preprocessing, representation and visualization from zero to hero.

Language: Python | License: MIT | Stargazers: 2879 | Issues: 0

grobid

Machine learning software for extracting information from scholarly documents

Language: Java | License: Apache-2.0 | Stargazers: 3308 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 290 | Issues: 0

Python

All Algorithms implemented in Python

Language: Python | License: MIT | Stargazers: 182585 | Issues: 0