Catarina Belém's repositories
unscrambler
An NLP-based system for recovering corrupted files. It is given a training set and a test set, and the test set is assumed to be drawn from the same distribution as the training set.
ADOPT.jl
This is the result of my master's thesis on Multi-Objective Optimization. The repository focuses on Pareto-based optimization rather than single-objective optimization with preference articulation. It targets time-consuming optimization routines and therefore concentrates on model-based methods, which allow for faster convergence. This is relevant for Architectural Design Optimization, which depends on time-intensive simulations (a single simulation can take minutes, hours, or even days to complete).
clinical-data-ml-metadata
Clinical ML metadata specification for the columns available in the MIMIC and eICU datasets after preprocessing with the YAIB-Cohort package.
dotfiles
Get ready for dotfiles. Contains i3, i3blocks, rofi, dunst, picom, vim, tmux, and zsh.
ecaade-2018-from-design-to-optimized-design
Results and scripts for reproducing the analysis reported in the eCAADe 2018 paper "From Design to Optimized Design: An Algorithmic Approach".
generative-calibration-lms
Calibration of autoregressive language models.
uci-statnlp
Code for the StatNLP course homework.
handwritten-recognition-and-detection
Repository for building a system that detects handwriting and recognizes its content. The system will be applied to images that contain visual elements other than handwriting.
marginalization
Source code for the ACL'23 paper "Should you marginalize over possible tokenizations?"
ml4nlp-cogsci-summer22
Course materials for the Machine Learning for NLP course taught by Sameer Singh for the Cognitive Science summer school 2022.
openlogprobs
Extract full next-token probabilities via language model APIs
pastelbelem8.github.io
Catarina's personal webpage (adapted from al-folio Jekyll template).
PyTorch-VAE
A Collection of Variational Autoencoders (VAE) in PyTorch.
ScikitLearn.jl
Julia implementation of the scikit-learn API
ucimlrepo
Python package for dataset imports from UCI ML Repository
zafar-fair-classification
Refactoring of Zafar's fair classification bias mitigation method.