Kerem Turgutlu's starred repositories

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | License: MIT | Stargazers: 90758 | Issues: 678 | Issues: 7397

scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Language: Python | License: BSD-3-Clause | Stargazers: 52052 | Issues: 1776 | Issues: 3006

Prompt-Engineering-Guide

🐙 Guides, papers, lectures, notebooks and resources for prompt engineering

llama_index

LlamaIndex is a data framework for your LLM applications

Language: Python | License: MIT | Stargazers: 34460 | Issues: 244 | Issues: 4807

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | License: Apache-2.0 | Stargazers: 15474 | Issues: 105 | Issues: 996

haystack

🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Language: Python | License: Apache-2.0 | Stargazers: 15146 | Issues: 132 | Issues: 3455

newspaper

newspaper3k is a news, full-text, and article metadata extraction library for Python 3.

Language: Python | License: MIT | Stargazers: 13973 | Issues: 385 | Issues: 674

ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

Language: Python | License: MIT | Stargazers: 10456 | Issues: 284 | Issues: 1544

nebuly

The user analytics platform for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 8365 | Issues: 93 | Issues: 202

cortex

Production infrastructure for machine learning at scale

Language: Go | License: Apache-2.0 | Stargazers: 8010 | Issues: 146 | Issues: 1100

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python | License: MIT | Stargazers: 5756 | Issues: 48 | Issues: 968

datasets

TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...

Language: Python | License: Apache-2.0 | Stargazers: 4254 | Issues: 109 | Issues: 1154

praw

PRAW, an acronym for "Python Reddit API Wrapper", is a Python package that allows for simple access to Reddit's API.

Language: Python | License: BSD-2-Clause | Stargazers: 3426 | Issues: 73 | Issues: 751

Promptify

Prompt Engineering | Prompt Versioning | Use GPT or other prompt-based models to get structured output. Join our Discord for prompt engineering, LLMs, and other recent research.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3165 | Issues: 47 | Issues: 67

geziyor

Geziyor, a blazing-fast web crawling & scraping framework for Go. Supports JS rendering.

Language: Go | License: MPL-2.0 | Stargazers: 2554 | Issues: 42 | Issues: 55

mctx

Monte Carlo tree search in JAX

Language: Python | License: Apache-2.0 | Stargazers: 2291 | Issues: 29 | Issues: 47

tldextract

Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).

Language: Python | License: BSD-3-Clause | Stargazers: 1821 | Issues: 47 | Issues: 203

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.

Language: Python | License: MIT | Stargazers: 1123 | Issues: 16 | Issues: 137

openwebtext

Open clone of OpenAI's unreleased WebText dataset scraper. This version uses pushshift.io files instead of the API for speed.

Language: Python | License: GPL-3.0 | Stargazers: 703 | Issues: 30 | Issues: 19

Topical-Chat

A dataset containing human-human knowledge-grounded open-domain conversations.

ChatLearner

A chatbot implemented in TensorFlow based on the seq2seq model, with certain rules integrated.

Language: Python | License: Apache-2.0 | Stargazers: 539 | Issues: 50 | Issues: 83

GPTZero

An open-source implementation of GPTZero

Language: Python | License: MIT | Stargazers: 468 | Issues: 10 | Issues: 5

manifest

Prompt programming with FMs.

Language: Python | License: Apache-2.0 | Stargazers: 437 | Issues: 22 | Issues: 36

olm-datasets

Pipeline for pulling and processing online language model pretraining data from the web

Language: Python | License: Apache-2.0 | Stargazers: 170 | Issues: 12 | Issues: 5

chat_corpus

Chat corpus collection from various open sources

abcd

Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Language: Python | License: MIT | Stargazers: 66 | Issues: 3 | Issues: 2

DST-as-Prompting

Source code for Dialogue State Tracking with a Language Model using Schema-Driven Prompting

Language: Vue | License: GPL-2.0 | Stargazers: 15 | Issues: 1 | Issues: 3