Ali Hürriyetoğlu's starred repositories
RAGatouille
Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease of use, backed by research.
helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
bert_score
BERT score for text generation
rag-demystified
An LLM-powered advanced RAG pipeline built from scratch
awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
gutenberg-dammit
I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this
nlp-llms-resources
Master list of curated resources on NLP and LLMs
CausalNewsCorpus
Repository for Causal News Corpus (LREC 2022) and RECESS (IJCNLP-AACL 2023)
css_methods_python
A full course of self-explanatory and freely available materials on CSS methods
case-2021-shared-task
Information and data related to the ProtestNews shared task at CASE @ ACL-IJCNLP 2021 workshop
text_characterization_toolkit
A library for computing diverse text characteristics and using them to analyze data sets and models with ease.
Secim2023_Dataset
Reproduction material for "#Secim2023: First Public Dataset for Studying Turkish General Election"
KB-python-API
Python API for KB data-services
historical_texts
BigScience working group on language models for historical texts
NLP-for-Historical-Text
Overview of tooling for and issues with NLP on historical texts, dealing with OCR/HTR errors and spelling variation and change
food_crisis_predictions_nlp
Replication package for "Fine-grained prediction of food crises from news streams"
general_info
Contains any shareable info from the emerging-welfare team
thesis_log
All the work for my thesis in the Computational Social Sciences MA program at Koç University, shared in line with open science principles.