Data Splitting
shonenkov opened this issue
Hello! Thank you all for your hard work, and especially for creating this dataset!
Could you provide the data split used in your article? https://arxiv.org/pdf/2008.05373.pdf
(image ids for Validation_images, Test_1_images and Test_2_images)
It would be very useful for comparing results, for publishing further articles, and for citing your original article and dataset :)
Thank you for your consideration.
We have already split the dataset; you can find the folders with the split data in the dataset folder.
Also, that link points to an old version of my paper. You can find the published version here: https://www.mdpi.com/2313-433X/6/12/141
Should you require further assistance or have other queries, please do not hesitate to contact me.
In the dataset folder I found this annotation for the example image "0_10_23.jpg":
{"size":{"width":517,"height":63},"moderation":{"isModerated":1,"moderatedBy":"Norlist","predicted":""},"description":"Слепым волчатам","name":"0_10_23"}
but I didn't find any information about the split there. Could you help me find it?
Thank you in advance!
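For anyone checking this themselves, here is a minimal sketch that parses the annotation quoted above and lists its top-level keys. The candidate field names ("split", "stage", "subset") are only guesses at what a split field might be called; they are not part of the dataset's format.

```python
import json

# The annotation quoted above for "0_10_23.jpg", copied verbatim.
raw = (
    '{"size":{"width":517,"height":63},'
    '"moderation":{"isModerated":1,"moderatedBy":"Norlist","predicted":""},'
    '"description":"Слепым волчатам","name":"0_10_23"}'
)

ann = json.loads(raw)

# Top-level keys actually present in this annotation.
keys = sorted(ann.keys())
print(keys)

# Hypothetical names a split field might have; none is documented for this dataset.
candidate_split_keys = ("split", "stage", "subset")
has_split_field = any(k in ann for k in candidate_split_keys)
print("split field present:", has_split_field)
```

The annotation carries the image size, moderation info, the transcription and the image name, but nothing that assigns the image to train, validation or test.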
Sure, I will send Python code to split the dataset. Sorry, I thought the uploaded folder was already split, but when I asked my supervisors, they told me it is not split, because researchers may want to split it as they like. I will send you the link to the splitting code.
You can use this link to split the dataset:
https://github.com/bosskairat/Dataset
Thank you, I ran your code and got this split:
Could you add this CSV (after unzipping) to the Cloud for everyone? https://cloud.mail.ru/public/25xw/2YPdtaFAF
usage:
import pandas as pd
df_splitting = pd.read_csv('HKR_splitting.csv', index_col='id')
df_splitting['stage'].value_counts()
>>>
train 45559
val 9375
test2 5043
test1 4966
Name: stage, dtype: int64
thank you!
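If you want the list of image ids for each stage rather than just the counts, something like the sketch below should work. It uses only the standard-library csv module instead of pandas, and it assumes the CSV has exactly the two columns used above, 'id' and 'stage'. The helper name ids_by_stage and the sample rows are made up for illustration; the real ids come from HKR_splitting.csv.

```python
import csv
import io
from collections import defaultdict


def ids_by_stage(csv_file):
    """Group image ids by the value in the 'stage' column (hypothetical helper)."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["stage"]].append(row["id"])
    return dict(groups)


# Tiny in-memory stand-in for HKR_splitting.csv; these rows are invented.
sample = io.StringIO(
    "id,stage\n"
    "0_10_23,train\n"
    "0_10_24,val\n"
    "0_10_25,test1\n"
    "0_10_26,test2\n"
    "0_10_27,train\n"
)
splits = ids_by_stage(sample)

for stage, ids in splits.items():
    print(stage, len(ids))

# With the real file, assuming it sits in the working directory:
# with open("HKR_splitting.csv", newline="", encoding="utf-8") as f:
#     splits = ids_by_stage(f)
```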
Check the repository with the Python code; we have already updated it.