jveitchmichaelis / deeplabel

A cross-platform desktop image annotation tool for machine learning

export limit is 999 and cannot delete imported picture

brianng0305 opened this issue

I would like to export 1200 darknet labels, but the tool only exports up to 999. (Update: this was solved after I clicked random export.)

After that, I wanted to delete the pictures I had not labelled, but there is no button for this.

Your issue suggests a problem with the split function.

void BaseExporter::splitData(float split, bool shuffle, int seed){

So we pick a pivot in the dataset based on the train/val split:

int pivot = static_cast<int>(images.size() * split);

and the only difference when shuffling is this line:

std::shuffle(images.begin(), images.end(), generator);

where generator is the random number generator that determines the ordering. I can't see anything obvious there; why would randomly shuffling allow you to export the full dataset?
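For anyone following along, the split logic described above can be sketched roughly as follows. This is a minimal, self-contained approximation and not the actual deeplabel code: `splitImages`, `SplitResult`, and the choice of `std::mt19937` seeded from `seed` are assumptions made for illustration, and `split` is assumed to lie between 0 and 1.

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for the exporter's split step (not the deeplabel source).
struct SplitResult {
    std::vector<std::string> train;
    std::vector<std::string> val;
};

// Takes the list by value so the caller's ordering is left untouched.
SplitResult splitImages(std::vector<std::string> images, float split,
                        bool shuffle, int seed) {
    if (shuffle) {
        // Assumption: a Mersenne Twister engine seeded from `seed`.
        std::mt19937 generator(static_cast<std::mt19937::result_type>(seed));
        std::shuffle(images.begin(), images.end(), generator);
    }

    // Pivot chosen from the train/val fraction, as in the quoted line.
    int pivot = static_cast<int>(images.size() * split);

    SplitResult result;
    result.train.assign(images.begin(), images.begin() + pivot);
    result.val.assign(images.begin() + pivot, images.end());
    return result;
}
```

In this sketch, 1200 names with a split of 0.5 give 600 train and 600 val entries whether or not `shuffle` is set, so the split step on its own would not be expected to cap an export at 999.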

The export function just copies the images in the train and val lists, so it would be good to know if this problem is a result of the split data or if it's a bug in the copy step. It's a foreach over images which makes no reference to the size of the list.

@brianng0305 What does your terminal/console window say? E.g., without shuffling, do you get the train/val set sizes reported correctly?

And how big is your labelled dataset?

I find it weird that there would be an arbitrary limit at 999 when there are no explicit counters in the code. Could it be related to your image names somehow? Duplicate filenames should be fine, by the way.

By the way, there is a "remove image" button - so you would like an additional option to remove all unlabelled images? That's straightforward for me to add. Though you know you can export only labelled images, if that helps?
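As a rough idea of what a "remove all unlabelled images" option might do internally, here is a minimal sketch. The `ImageEntry` struct, its `labelCount` field, and the `removeUnlabelled` function are hypothetical names invented for this example; how deeplabel actually stores images and labels is not shown in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical record; the real project storage is not shown in this thread.
struct ImageEntry {
    std::string path;
    int labelCount;  // number of annotations attached to this image
};

// Removes every entry without labels and returns how many were removed.
// The relative order of the remaining entries is preserved.
std::size_t removeUnlabelled(std::vector<ImageEntry>& images) {
    auto firstRemoved = std::remove_if(
        images.begin(), images.end(),
        [](const ImageEntry& entry) { return entry.labelCount == 0; });
    auto removed = static_cast<std::size_t>(
        std::distance(firstRemoved, images.end()));
    images.erase(firstRemoved, images.end());
    return removed;
}
```

A delete-by-range variant would follow the same erase pattern with a different predicate or a pair of iterators.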

Sir,

Sir, yes. I renamed the files from 1 to 1200 (for example, 1200.jpg) because the original filenames contained symbols and I did not want that to cause any errors. My labelled dataset is 300 images. Please note that the problem was solved after I clicked the random button to export the labels.

What I mean is that the current application has no problem at all.

Concerning the delete, my original intent was to ask for a button to delete pictures from xxx to xxx, or a file dialog to select what to delete. After thinking twice, I believe it is better for me to preprocess the photos before importing them.

Thank you very much.

Ah OK, I can add a delete range function; it may be useful for others. Deleting unlabelled images should be useful too.

I'll see if I can replicate your split issue - this happens with your renamed files? Or is it perhaps due to the special characters?

The files did not contain special characters. I seldom use C, so I don't know, haha.

Still not able to replicate this on projects containing several thousand images. Closing for now.