There are 89 repositories under the datasets topic.
A topic-centric list of high-quality open datasets.
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
pix2code: Generating Code from a Graphical User Interface Screenshot
Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Techniques for deep learning with satellite & aerial imagery
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
FL Chart is a highly customizable Flutter chart library that supports Line Chart, Bar Chart, Pie Chart, Scatter Chart, Radar Chart and Candlestick Chart.
(CGCSTCD'2017) An easy, flexible, and accurate plate recognition project for Chinese license plates in unconstrained situations. CGCSTCD = China Graduate Contest on Smart-city Technology and Creative Design
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Search all Chinese NLP datasets, with commonly used English NLP datasets included
A curated list of awesome JSON datasets that don't require authentication.
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
Papers and Datasets about Point Cloud.
Colour Science for Python
Datasets, tools, and benchmarks for representation learning of code.
Medical NLP competitions, datasets, large models, and papers
✏️ Web-based image segmentation tool for object detection, localization, and keypoints
C++ Implementation of PyTorch Tutorials for Everyone
A list of awesome papers and resources on recommender systems with large language models (LLMs).
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
Benchmark datasets, data loaders, and evaluators for graph machine learning
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
In-memory tabular data in Julia
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard