There are 5 repositories under the huggingface-datasets topic.
Generic template to bootstrap your PyTorch project.
[EMNLP 2022] Unifying and multi-tasking structured knowledge grounding with language models
Use the universal VDF format for vector datasets to easily export and import data from all vector databases
Retrieves Parquet files from Hugging Face, then identifies and quantifies junk data, duplication, contamination, and biased content in the dataset using pandas
PyTorch-like dataloaders in JAX.
Automated fashion image captioning using BLIP-2. Automatically generates descriptions of clothes on shopping websites, which can help customers without fashion knowledge better understand the features (attributes, style, functionality, etc.) of the items and can increase online sales by attracting more customers.
Translate large datasets into any language with the Google translation API and multithreaded processing; no key required!
huggingface-go: speeds up downloading of Hugging Face models and datasets
NLP model that predicts subreddit based on the title of a post
🫁 AeroPath: An airway segmentation benchmark dataset with challenging pathology
Getting started with Hugging Face
Use AI to personify books, so that you can talk to them 🙊
A Python implementation of Toolformer using Hugging Face Transformers
JGLUE: Japanese General Language Understanding Evaluation for huggingface datasets
Diffusion model for unconditional image generation of Bored Apes
Fine-tuned and compared 3 🤗 pre-trained Multilingual LLMs
EHRM (Electronic Health Record Management) introduces a centralized platform for analyzing patient records, offering insights into billing amounts, demographics, prevalent diagnoses, medical conditions, consulted doctors, admission types, and medication usage.
Configuration files for building E621-Rising v3 SDXL model and dataset
A course project for DSCI-6011 (Deep Learning) that deals with drivable-area and lane segmentation for self-driving cars.
WRIME for huggingface datasets
Scraping large numbers of articles for transformer training.
Rico: A Mobile App Dataset for Building Data-Driven Design Applications for huggingface datasets
COCOA: Semantic Amodal Segmentation for huggingface datasets
Microsoft COCO: Common Objects in Context for huggingface datasets
Pre-training and fine-tuning transformer models using PyTorch and the Hugging Face Transformers library. Whether you're delving into pre-training with custom datasets or fine-tuning for specific classification tasks, these notebooks offer explanations and code for implementation.
Project for the course MDS7201-1, carried out jointly with the Centro de Modelamiento Matemático (CMM)
CAMERA (CyberAgent Multimodal Evaluation for Ad Text GeneRAtion) for huggingface datasets
Magazine dataset from Content-aware Generative Modeling of Graphic Design Layouts for huggingface datasets
A comprehensive toolkit for seamless data generation and fine-tuning of NLP models, all conveniently packed into a single block.
[INLG2023] The High-Level (HL) dataset is a Vision and Language (V&L) resource aligning object-centric descriptions from COCO with high-level descriptions crowdsourced along 3 axes: scene, action, rationale.
Fine-tuning a pretrained BERT model for sentiment analysis (text classification)
cookiecutter for huggingface datasets
JSNLI (Japanese SNLI) dataset for huggingface datasets
Japanese Livedoor news corpus for huggingface datasets