sotirismos / custom-dataset-generation

Custom dataset generation for image classification based on images downloaded from Google

Repository from GitHub: https://github.com/sotirismos/custom-dataset-generation

Custom Dataset Generation

Custom dataset generation for image classification based on images downloaded from Google.

  • image_download.py is a script containing class methods for downloading and saving images, and for creating the directory that stores the downloaded images.

  • logging.py is an auxiliary script used for logging.

  • train_test_split.py is a script containing functions that split the downloaded images into train and test subsets, based on the train_ratio argument of the split function.

  • detect_n_crop.py applies a deep learning model pretrained on the BDD100K dataset, which can detect the following objects in the downloaded images:

    • Pedestrian, Rider, Car, Truck, Bus, Train, Motorcycle, Bicycle, Traffic light, Traffic sign.

    In our case, we downloaded 20 pictures for each of the top-selling car models in Greece for 2021 and applied the car detection model to clean up the dataset, resulting in a car brand detection dataset with minimal effort, as analyzed in the notebook.
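
The train/test split described above can be sketched as follows. This is a minimal illustration, not the code in train_test_split.py: only the function name `split` and the `train_ratio` argument come from the description, while the `image_paths` and `seed` parameters, the default ratio of 0.8, and the file names in the usage lines are assumptions made for the example.

```python
import random


def split(image_paths, train_ratio=0.8, seed=0):
    """Shuffle image paths and split them into train and test subsets.

    Hypothetical signature: the real function in train_test_split.py
    may take different arguments.
    """
    if not 0.0 <= train_ratio <= 1.0:
        raise ValueError("train_ratio must be between 0 and 1")

    # Sort first so the result depends only on the seed, not on input order.
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)

    # The train subset gets the floor of len(paths) * train_ratio items.
    n_train = int(len(paths) * train_ratio)
    return paths[:n_train], paths[n_train:]


# Usage: 20 images per car model, as in the dataset described above.
# The file names are placeholders.
images = [f"img_{i:02d}.jpg" for i in range(20)]
train, test = split(images, train_ratio=0.8)
print(len(train), len(test))  # 16 4
```

Fixing the seed keeps the split reproducible across runs, which helps when comparing models trained on the same dataset.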


Languages

Jupyter Notebook 95.9%, Python 4.1%