almugabo


almugabo's repositories

openalex_qa

Assessing and Improving data quality in OpenAlex

Language: Python · 11 2 3

grant_db

data on research funding of selected agencies/programs

Language: Python · 3 40

open_metadata

an overview of open scientometric resources

2 20

reference_processing

a repo with scripts to finetune an LLM to process bibliographic references of scholarly documents

Language: Jupyter Notebook · License: Apache-2.0 · 2 10

africanlp-public-datasets

A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.

000

examgen

A Python class that can automatically generate mathematics exams, with solution keys, using Sympy and LaTeX.

Language: Python · License: NOASSERTION · 000

google-gemma-finetuning-n2sql

Finetuning Google's Gemma Model for Translating Natural Language into SQL

Language: Jupyter Notebook · License: Apache-2.0 · 000

grants_dataset

an open dataset on research funding of selected agencies/programs.

Language: Jupyter Notebook · License: MIT · 000

KATE

Code & data accompanying the KDD2017 paper "KATE: K-Competitive Autoencoder for Text"

Language: Python · License: BSD-3-Clause · 010

LMFit

experiments in finetuning language models

Language: Roff · 010

maths_S1S3

a repo with Maths topics

010

NLP-Projects-NHV

NLP Projects playlist

Language: Jupyter Notebook · 000

opensearch-gitpod-test

A Docker Compose template, configured for Gitpod (www.gitpod.io) to give you pre-built, ephemeral development environments in the cloud.

License: MIT · 000

ref_scholarly_docs

this repo contains some work on curating lists of references in non-traditional scholarly documents. work in progress

01 1

test_dev

repository to quickly test things. will be periodically deleted

Language: Jupyter Notebook · 000

TextRL

Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on a small-scale model (any generation model in Hugging Face's transformers)

000

trl

Train transformer language models with reinforcement learning.

License: Apache-2.0 · 000