There are 77 repositories under the datasets topic.
A topic-centric list of HQ open datasets.
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Label Studio is a multi-type data labeling and annotation tool with standardized output format
pix2code: Generating Code from a Graphical User Interface Screenshot
Techniques for deep learning with satellite & aerial imagery
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Search all Chinese NLP datasets, with commonly used English NLP datasets included
A curated list of awesome JSON datasets that don't require authentication.
Papers and Datasets about Point Cloud.
Datasets, tools, and benchmarks for representation learning of code.
✏️ Web-based image segmentation tool for object detection, localization, and keypoints
Medical NLP competitions, datasets, large models, and papers
Colour Science for Python
Benchmark datasets, data loaders, and evaluators for graph machine learning
C++ Implementation of PyTorch Tutorials for Everyone
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus and leaderboard
In-memory tabular data in Julia
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Large datasets for conversational AI
Community list of transit APIs, apps, datasets, research, and software 🚌🌟🚋🌟🚂
A curated list of amazingly awesome Cybersecurity datasets