sidney_NLP's repositories

Awesome-Multi-label-Image-Recognition

Awesome Multi-label Image Recognition Paper List

Stargazers: 0 | Issues: 0

Classical-Modern

A very comprehensive parallel corpus of Classical Chinese (ancient texts) and modern Chinese

License: MIT | Stargazers: 0 | Issues: 0

clearml

ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

cmu_multilingual_speech

CMU multilingual speech repository

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0

composer

A library of speed-up algorithms for model training

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

dataset_difficulty

"Understanding Dataset Difficulty with V-Usable Information" (ICML 2022, outstanding paper)

Stargazers: 0 | Issues: 0

extend

Entity Disambiguation as text extraction (ACL 2022)

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

facestar

Facestar dataset. High quality audio-visual recordings of human conversational speech.

License: NOASSERTION | Stargazers: 0 | Issues: 0

famie

FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

License: GPL-3.0 | Stargazers: 0 | Issues: 0

FREDA

Fast and Flexible Data Annotation for Relation Extraction

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

huggingsound

HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools

License: MIT | Stargazers: 0 | Issues: 0

IndicLink

IndicLink is a Multilingual Fact Linking (MFL) dataset of sentences and a set of WikiData facts (subject; relation; object) contained in each sentence. IndicLink contains sentences from English and 6 Indian languages - Hindi, Telugu, Tamil, Urdu, Gujarati and Assamese. The correct facts are chosen from an oracle of 4.7 million Wikidata facts with fact labels/descriptions available in these 7 languages. The dataset is intended only to act as a test set to evaluate models trained for the task of MFL. For more details, please see https://arxiv.org/abs/2109.14364

License: NOASSERTION | Stargazers: 0 | Issues: 0

lab-website-template

(Pre-release) An easy-to-use, flexible website template for labs, with automatic citations, GitHub tag imports, pre-built components, and more

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

lightning-hydra-template

PyTorch Lightning + Hydra. A very user-friendly template for rapid and reproducible ML experimentation with best practices. ⚡🔥⚡

Stargazers: 0 | Issues: 0

lingfeat

LingFeat - A Comprehensive Linguistic Features Extraction ToolKit for Readability Assessment

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 0 | Issues: 0

NS-Dial

An Interpretable Neuro-Symbolic Framework for Task-Oriented Dialogue Generation

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

python_plot_utils

Simple code for plotting figures and colorbars, and for cropping images, with Python

Stargazers: 0 | Issues: 0

SELFRec

An open-source framework for self-supervised recommender systems.

Stargazers: 0 | Issues: 0

TempEL

Repository for Temporal Entity Linking (TempEL), accepted to NeurIPS 2022 Dataset and Benchmarks

License: Apache-2.0 | Stargazers: 0 | Issues: 0

timelms

TimeLMs: Diachronic Language Models from Twitter

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

tools

Practical tools: writing slide decks (PPT) in Markdown, a command-line automated demo tool, front-end component libraries, and more

Stargazers: 0 | Issues: 0

txtai

💡 Build AI-powered semantic search applications

License: Apache-2.0 | Stargazers: 0 | Issues: 0

video2dataset

Easily create large video datasets from video URLs

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

wikipedia-utils

Utility scripts for preprocessing Wikipedia texts for NLP

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

yahp

Hyperparameter management

License: NOASSERTION | Stargazers: 0 | Issues: 0