aashishkumar0228 / multi-mnist

MNIST dataset with multiple digits. This dataset can be use for learning number (more than 1 digit) regconizer model.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

multi-mnist

MNIST data set with multiple digits. We have generated a dataset for multiple digits recognition task from MNIST (http://yann.lecun.com/exdb/mnist/index.html).

See in examples folder:

examples
    - train
        + labels.csv
        + 1/
        + 2/
        ...
        + 8/
        + 9/
        
    - test
        + labels.csv
        + 1/
        + 2/
        ...
        + 8/
        + 9/    

Each folder 1, 2, 3, 4, ..., contains generated images with exactly number of digits as the name of folder. labels.csv list name of image and ground truth number respectively.

labels.csv
1.png,1
2.png,4
3.png,45
4.png,785,
5.png,1479,
...

Create your own dataset

Clone this repository:

git clone https://github.com/aashishkumar0228/multi-mnist.git

Requirements:

  • python 3
  • numpy
  • idx2numpy
  • tqdm
  • opencv-python

Install requirements:

pip3 install -r requirements.txt

Change some parameter in main.py:

  • output_dir: Path to your expected output directory.
  • number_of_samples_per_class: Number of samples for each number of digit.

Run python main.py and take a look at output_dir.

About

MNIST dataset with multiple digits. This dataset can be use for learning number (more than 1 digit) regconizer model.

License:MIT License


Languages

Language:Jupyter Notebook 63.6%Language:Python 36.4%