There are 5 repositories under the dataset-creation topic.
Image Aesthetics Toolkit - includes Fisher Vector implementation, AVA (Image Aesthetic Visual Analysis) dataset and fast multi-threaded downloader
🥤🧑🏻🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization"
Dataset Helper program to automatically select, rescale and tag datasets (composed of images and text) for Machine Learning training.
This package is a complete tool for creating a large dataset of images (designed especially, but not only, for machine learning enthusiasts). It can crawl the web, download images, rename / resize / convert the images and merge folders.
Quickly and easily create / train a custom DeepDream model
A uniform library wrapper for input from V4L2, Freenect, OpenNI, OpenNI2, DepthSense, Intel Realsense, OpenGL simulations and other types of video and depth input.
Crawler behind the Shopify App Marketplace dataset
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework
A script to help you quickly build custom computer vision datasets
110k Dutch Book Reviews Dataset for Sentiment Analysis
A simple app for recording speech datasets.
Code to Blur Human Faces and Vehicle License Plates in Video and Images using a SoTA Object Detection model YOLOv8
Code for paper 'Avoid touching your face: A hand-to-face 3d motion dataset (covid-away) and trained models for smartwatches'
Dataset augmentation with Generative Adversarial Network for crop/weed segmentation
A Python library designed for scraping data from the SCP wiki.
Costa Rican license plate dataset generator
A simple Qt program to easily extract and label samples from videos. Used for dataset creation.
PixelPruner is a user-friendly image cropping app for AI-generated art. It supports PNG, JPG, JPEG, and WEBP formats. Easily crop, preview, and manage images with interactive previews, thumbnail views, rotation tools, and customizable output folders. Streamline your workflow and achieve perfect crops every time with PixelPruner.
We are developing a tool to analyse recorded network traffic in order to detect and investigate source IP addresses that may have contributed to a DDoS UDP flood attack. This tool also generates sample pcap datasets.
This repository provides our datasets for Arabic emotion detection in Twitter
Scraper for Japanese street addresses (住所).
A dataset creation tool to aggregate, sort and label large volumes of architectural imagery.
A tool to streamline AI image captioning
Kvasir-SEG: A Segmented Polyp Dataset
A Web Application to collect data from pairwise image comparisons via crowdsourcing
Script for creating a dataset for AI, ML applications
This repository contains Jupyter notebooks detailing the experiments conducted in our research paper on Ukrainian news classification. We introduce a framework for simple classification dataset creation with minimal labeling effort, and further compare several pretrained models for the Ukrainian language.
A program that simulates answers given by a crowd to multiple choice questions with either a single correct answer or multiple correct answers, and writes them to a CSV file
This project creates the T4SA 2.0 dataset, i.e. a big set of data to train visual models for Sentiment Analysis in the Twitter domain using a cross-modal student-teacher approach.
BLIP-2 Captioning, Mass Captioning, Question Answering, and other tools.
The script for parsing sankakucomplex