softwaremill / lemon-dataset

Lemons quality control dataset 🍋

The lemon dataset was prepared to investigate approaches to automated fruit quality control. It contains 2690 annotated images (1056 x 1056 pixels). The raw lemon images were captured using the procedure described in the following blogpost and manually annotated using CVAT.

[Image: example of raw, unannotated data]

[Image: annotated samples]

Labels

| Name | Attribute | Type | Default | Description | Example |
|---|---|---|---|---|---|
| condition | healthy | boolean | true | Whether the fruit is healthy. If not, the regions with identified issues are annotated. | |
| condition | greening | boolean | false | Whether the fruit has green areas, i.e. is not uniformly yellow. | |
| image_quality | blurry | boolean | false | The fruit image is blurry. | |
| image_quality | cropped | boolean | false | Not all parts of the fruit are in the image. | |
| image_quality | unnatural_color | boolean | false | There are issues with color representation. | |
| image_quality | no_data | boolean | false | There are black spots on the fruit image that do not contain data. | |
| illness | - | region | - | - | |
| gangrene | - | region | - | - | |
| mould | - | region | - | - | |
| blemish | artificial | region + boolean | - | - | |
| dark_style_remains | - | region | - | After pollination, the remains of the style are preserved in the fruit. A dark area around them indicates an unhealthy fruit: this is the region from which the fruit starts rotting or catches mould. | |
| pedicel | - | region | - | The pedicel is the structure connecting a single flower to its inflorescence. | |
| artifact | - | region | - | The image contains artifacts, i.e. regions that are not related to the fruit and result from faulty image processing. These regions should be identified and described. | |
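The label schema above can be mirrored in code, which helps when filtering images by quality or condition. The following is a minimal sketch: the class names `Condition` and `ImageQuality` and the helper `has_issue` are hypothetical, and how these attributes are actually serialized in the annotation file is not stated here, so treat this only as a model of the table and its defaults.

```python
from dataclasses import dataclass, fields

# Region-type labels listed in the table (each marks an area of the image).
REGION_LABELS = frozenset({
    "illness", "gangrene", "mould", "blemish",
    "dark_style_remains", "pedicel", "artifact",
})


@dataclass(frozen=True)
class Condition:
    """Boolean attributes of the `condition` label, with the table's defaults."""
    healthy: bool = True
    greening: bool = False


@dataclass(frozen=True)
class ImageQuality:
    """Boolean attributes of the `image_quality` label, with the table's defaults."""
    blurry: bool = False
    cropped: bool = False
    unnatural_color: bool = False
    no_data: bool = False

    def has_issue(self) -> bool:
        # True if at least one quality flag is set.
        return any(getattr(self, f.name) for f in fields(self))
```

For example, `ImageQuality(blurry=True).has_issue()` is true, while a default `ImageQuality()` reports no issue.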

File name

File names follow a fixed pattern that forms an identifier, e.g. 0037_G_I_120_A: 0037 (individual fruit instance), 120 (relative photo angle), A (photo position). Some of the components are restricted to the original project and cannot be published.
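The documented parts of a file name can be extracted with a few lines of Python. This sketch assumes that every name has exactly five underscore-separated fields, as in the example above, and that a file extension may follow. `LemonFileName` and `parse_file_name` are hypothetical names, and the two undocumented fields are deliberately ignored.

```python
from pathlib import Path
from typing import NamedTuple


class LemonFileName(NamedTuple):
    fruit_id: str   # individual fruit instance, e.g. "0037"
    angle: int      # relative photo angle, e.g. 120
    position: str   # photo position, e.g. "A"


def parse_file_name(name: str) -> LemonFileName:
    """Split a name such as '0037_G_I_120_A.jpg' into its documented parts."""
    stem = Path(name).stem  # drop any directory and extension
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"unexpected file name layout: {name!r}")
    # Fields 2 and 3 are not documented publicly, so they are skipped.
    fruit_id, _, _, angle, position = parts
    return LemonFileName(fruit_id, int(angle), position)
```

Calling `parse_file_name("0037_G_I_120_A")` yields `fruit_id="0037"`, `angle=120` and `position="A"`, which makes it straightforward to group all photos of the same fruit.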

Download data

| File | Format | Version |
|---|---|---|
| Lemon Dataset | COCO | v1 |

The COCO API (pycocotools) can be used to read the dataset.

from pycocotools.coco import COCO

# Load the annotation file; adjust the path to where the dataset was extracted.
coco = COCO('../lemon-dataset/annotations/instances_default.json')
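Because a COCO annotation file is plain JSON, a quick sanity check is also possible with the standard library alone. The sketch below relies only on the standard COCO keys (`images`, `annotations`, `categories`); the function names `load_annotations` and `summarize` are hypothetical. For this dataset the summary would be expected to report 2690 images of size 1056 x 1056, but that has not been verified here.

```python
import json
from collections import Counter


def load_annotations(path: str) -> dict:
    """Read a COCO-format annotation file into a plain dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize(dataset: dict) -> dict:
    """Count images, collect distinct image sizes, and count annotations per category."""
    images = dataset.get("images", [])
    # Map category ids to their human-readable names.
    names = {c["id"]: c["name"] for c in dataset.get("categories", [])}
    per_category = Counter(
        names.get(a["category_id"], "unknown")
        for a in dataset.get("annotations", [])
    )
    return {
        "images": len(images),
        "sizes": {(img["width"], img["height"]) for img in images},
        "annotations_per_category": dict(per_category),
    }
```

A typical call would be `summarize(load_annotations('../lemon-dataset/annotations/instances_default.json'))`, using the same path as the pycocotools snippet above.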

Citing

If you use the lemons data set in a scientific publication, we would appreciate references to the following paper:

Biblatex entry:

@misc{softwaremill_2020,
  author       = {Maciej Adamiak},
  title        = {Lemons quality control dataset},
  institution  = {SoftwareMill},
  month        = jul,
  year         = 2020,
  doi          = {10.5281/zenodo.3965568},
  url          = {https://github.com/softwaremill/lemon-dataset}
}

License

MIT License

Copyright (c) 2020 SoftwareMill

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
