There are 229 repositories under the dataset topic.
A collective list of free APIs
Label Studio is a multi-type data labeling and annotation tool with standardized output format
An MNIST-like fashion product database. Benchmark 👇
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Large Scale Chinese Corpus for NLP
Techniques for deep learning with satellite & aerial imagery
Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas
Documentation on how to access and use the Quick, Draw! Dataset.
This repository contains compatibility data for Web technologies as displayed on MDN
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
SQL Translator is a tool for converting natural language queries into SQL code using artificial intelligence. This project is 100% free and open source.
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
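Chinese personal-name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.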
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
A curated list of awesome JSON datasets that don't require authentication.
A synthetic data generator for text recognition
We are building an open database of COVID-19 cases with chest X-ray or CT images.
Extract data from a wide range of Internet sources into a pandas DataFrame.
📈 Currently the largest collection of industrial defect detection datasets and papers. Continually summarizes the open-source datasets and key papers that matter most in the field of surface defect research.
Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!
Large list of handpicked color names 🌈
Transformer: PyTorch Implementation of "Attention Is All You Need"
Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box, which describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.
Basic Utilities for PyTorch Natural Language Processing (NLP)
[ECCV 2018] CCPD: a diverse and well-annotated dataset for license plate detection and recognition