Ruan Chaves's repositories

hashformers

Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs).

Language: Python | License: MIT | Stargazers: 63 | Issues: 6 | Issues: 11

napolab

A Natural Portuguese Language Benchmark (Napolab) for the evaluation of language models.

Language: Python | License: MIT | Stargazers: 51 | Issues: 6 | Issues: 1

elmo

Supporting code for the paper "Portuguese Language Models and Word Embeddings: Evaluating on Semantic Similarity Tasks".

Language: Jupyter Notebook | Stargazers: 11 | Issues: 4 | Issues: 0
Language: HTML | Stargazers: 1 | Issues: 2 | Issues: 0

pdfsandwich-cli

A command line interface for a Dockerized instance of pdfsandwich hosted on AWS EC2.

Language: JavaScript | Stargazers: 1 | Issues: 2 | Issues: 0

prawstreams

Fetch live comments, submissions and inbox messages from Reddit, either locally or remotely from a Heroku dyno and Flask website.

Language: Python | License: MIT | Stargazers: 1 | Issues: 2 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 3 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

Easy-Translate

Use the state-of-the-art m2m100 to translate large data on CPU/GPU/TPU. Super Easy!

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

epoxy

Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 2 | Issues: 0

HateBR

HateBR is the first large-scale, expert-annotated corpus of Brazilian Instagram comments for hate speech and offensive language detection on the web and social media.

Stargazers: 0 | Issues: 1 | Issues: 0

kNN-CUDA

Fast k nearest neighbor search using GPU

Language: Cuda | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

ljvmiranda921.github.io

✨ GitHub repository for my website

License: CC-BY-4.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

minicons

Utility for analyzing Transformer based representations of language.

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

mlm-scoring

Python library & examples for Masked Language Model Scoring (ACL 2020)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

prawchive

A one-click deploy archive bot for Reddit that runs on Heroku

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

pysentimiento

A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

reddit_keywords

Code to extract Reddit comments and submissions from Pushshift dumps based on keywords.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 3 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0
Stargazers: 0 | Issues: 2 | Issues: 0

ruanchaves.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

rubrix

✨ Python framework for data-centric NLP

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0

skweak

skweak: A software toolkit for weak supervision applied to NLP tasks

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

xlm-t

Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0