SpSeg

'Species Segregator', or SpSeg, is a machine-learning tool for species-level segregation of camera-trap images originating from wildlife censuses and studies. SpSeg is currently trained specifically for the Central Indian landscape. The model is built as a second step to Microsoft's MegaDetector, which identifies animals, people and vehicles in these images. SpSeg reads the results of MegaDetector and classifies the animal images into species (or a defined biological taxonomic level).

Kindly note:

The tool is currently under development, and the instructions for installation and use on new data are not shared yet. One can use the current 'environment_multimodel.yml' to set up an Anaconda environment and find the models in the publicly shared SpSeg_Models Google Drive folder to run on a new dataset at their own risk. There is no need to set up a separate MegaDetector environment, as it is incorporated in the code here. However, MegaDetector model v4.1.0 is required to obtain images with 'Animal' tags and the bounding boxes.

We further plan to 1) train a couple of EfficientNet models with PyTorch, 2) finalize a model based on the top-performing models in a multi-model approach, and 3) share the tools for practical use in camera-trap studies.

Results of initial trained models

The models in different architectures were trained for 100 epochs each with the same training and test datasets. So far, the highest test accuracy, 89.2%, has been achieved by ResNet152v2 and InceptionResNetv2.

Architecture	avg top-1 acc	Architecture	avg top-1 acc
Xception	88.9%
VGG16	3.4%	VGG19	3.3%
ResNet50	88.5%	ResNet50v2	87.5%
ResNet101	88.8%	ResNet101v2	89.1%
ResNet152	82.0%	ResNet152v2	89.2%
InceptionResNetv2	89.2%

Training data

The training dataset includes 36 species commonly encountered in camera-trap surveys in the Eastern Vidarbha Landscape, Maharashtra, India:

Species	Scientific name	Image set	Species	Scientific name	Image set
00_barking_deer	Muntiacus muntjak	7920	18_langur	Semnopithecus entellus	12913
01_birds	Excluding fowls	2005	19_leopard	Panthera pardus	7449
02_buffalo	Bubalus bubalis	7265	20_rhesus_macaque	Macaca mulatta	5086
03_spotted_deer	Axis axis	45790	21_nilgai	Boselaphus tragocamelus	6864
04_four_horned_antelope	Tetracerus quadricornis	6383	22_palm_squirrel	Funambulus palmarum & Funambulus pennantii	1854
05_common_palm_civet	Paradoxurus hermaphroditus	8571	23_indian_peafowl	Pavo cristatus	10534
06_cow	Bos taurus	7031	24_ratel	Mellivora capensis	5751
07_dog	Canis lupus familiaris	4150	25_rodents	Several mouse, rat, gerbil and vole species	4992
08_gaur	Bos gaurus	14646	26_mongooses	Urva edwardsii & Urva smithii	5716
09_goat	Capra hircus	3959	27_rusty_spotted_cat	Prionailurus rubiginosus	1649
10_golden_jackal	Canis aureus	2189	28_sambar	Rusa unicolor	28040
11_hare	Lepus nigricollis	8403	29_domestic_sheep	Ovis aries	2891
12_striped_hyena	Hyaena hyaena	2303	30_sloth_bear	Melursus ursinus	6348
13_indian_fox	Vulpes bengalensis	379	31_small_indian_civet	Viverricula indica	4187
14_indian_pangolin	Manis crassicaudata	1442	32_tiger	Panthera tigris	9111
15_indian_porcupine	Hystrix indica	5090	33_wild_boar	Sus scrofa	18871
16_jungle_cat	Felis chaus	4376	34_wild_dog	Cuon alpinus	7743
17_jungle_fowls	Includes Gallus gallus, Gallus sonneratii & Galloperdix spadicea	4760	35_indian_wolf	Canis lupus pallipes	553

Training pipeline

The SpSeg repository contains all the tools required to train and test the model. Run the scripts from the Tools directory in the repository.

Step 1: Run the MegaDetector model on the images to separate animal images. The latest model, v4.1, can be downloaded from here.

python run_tf_detector_batch.py path_to_model/md_v4.1.0.pb image_directory image_directory/output_file.json
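As an illustration of how the detector output can be used to separate animal images, here is a minimal sketch. It assumes the MegaDetector batch output layout (an `images` list whose entries carry `file` and `detections`, with each detection holding `category`, `conf` and `bbox`). The function names and the default threshold of 0.8 are illustrative and are not part of the SpSeg tools.

```python
import json


def load_results(json_path):
    """Read a MegaDetector batch output file into a dictionary."""
    with open(json_path) as handle:
        return json.load(handle)


def animal_category_id(results):
    """Find the category id labelled 'animal'; fall back to '1' if the map is absent."""
    for category_id, name in results.get("detection_categories", {}).items():
        if name == "animal":
            return category_id
    return "1"


def animal_images(results, threshold=0.8):
    """Return files with at least one animal detection at or above the threshold."""
    animal_id = animal_category_id(results)
    selected = []
    for image in results.get("images", []):
        # Images that failed to load may have no 'detections' entry at all.
        detections = image.get("detections") or []
        if any(d["category"] == animal_id and d["conf"] >= threshold
               for d in detections):
            selected.append(image["file"])
    return selected
```

Calling `animal_images(load_results("image_directory/output_file.json"))` would list the images that pass the threshold, leaving people, vehicles and empty frames out.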

Step 2: Crop the animals into square images.

python crop_detections.py image_directory/output_file.json path_to_crops --images-dir image_directory --detector-version "4.0" --threshold 0.8 --logdir "." --threads 25 --square-crops
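The geometry behind a square crop can be sketched as follows. This is an assumption-laden illustration, not the logic of crop_detections.py: it takes a MegaDetector bounding box in normalized `[x_min, y_min, width, height]` form, grows the shorter side to match the longer one, and shifts the square so that it stays inside the image. The function name `square_crop_box` is hypothetical.

```python
def square_crop_box(bbox, image_width, image_height):
    """Convert a normalized [x_min, y_min, width, height] box to a square pixel box.

    Returns (left, top, right, bottom) in pixels.
    """
    x_min, y_min, width, height = bbox

    # Scale the normalized box to pixel units.
    left = x_min * image_width
    top = y_min * image_height
    box_width = width * image_width
    box_height = height * image_height

    # Use the longer side, but never exceed the image itself.
    side = min(max(box_width, box_height), image_width, image_height)

    # Centre the square on the original box, then shift it back inside the image.
    centre_x = left + box_width / 2
    centre_y = top + box_height / 2
    square_left = min(max(centre_x - side / 2, 0), image_width - side)
    square_top = min(max(centre_y - side / 2, 0), image_height - side)

    return (
        int(round(square_left)),
        int(round(square_top)),
        int(round(square_left + side)),
        int(round(square_top + side)),
    )
```

The returned tuple follows the (left, upper, right, lower) order that image libraries such as Pillow expect for a crop box, so it could be passed to a crop call directly.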

Step 3: Create a CSV file with the path to each image in the directory along with a numerical identifier of the species.

python csv_paths.py --image_folder path_to_crops --image_format jpg --output_csv ../paths/species_data.csv --net_type cnn
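One way such a CSV could be built is sketched below. It assumes, hypothetically, that the crops are stored in one folder per class named like the entries in the training data table (for example `32_tiger`), so that the numeric prefix serves as the identifier. The column names `path` and `label` are also assumptions; csv_paths.py may organize things differently.

```python
import csv
from pathlib import Path


def collect_rows(crop_root, image_format="jpg"):
    """Pair every image path with the numeric prefix of its parent folder."""
    rows = []
    for path in sorted(Path(crop_root).rglob(f"*.{image_format}")):
        prefix = path.parent.name.split("_", 1)[0]
        # Skip folders that do not start with a numeric class id.
        if prefix.isdigit():
            rows.append((str(path), int(prefix)))
    return rows


def write_csv(rows, output_csv):
    """Write (path, label) rows to a CSV file with a header line."""
    with open(output_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "label"])
        writer.writerows(rows)
```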

Step 4: On Windows systems, file paths sometimes lack the required extension at the end ('file_01.' instead of 'file_01.jpg'). This step removes such paths from the data (use it cautiously).

python test_files.py --input_csv ../paths/species_data_test.csv --output_csv ../paths/species_data_cleaned.csv
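The check described in this step can be sketched as below. This is an illustration rather than the contents of test_files.py; it assumes a CSV without a header row whose first column holds the file path, and the function names are hypothetical.

```python
import csv
import os


def has_file_extension(path_text):
    """True when the path ends in a real extension such as '.jpg'.

    A bare trailing dot ('file_01.') or no dot at all counts as missing.
    """
    _, extension = os.path.splitext(path_text.strip())
    return len(extension) > 1


def clean_csv(input_csv, output_csv, path_column=0):
    """Copy rows whose path has an extension; return (kept, dropped) counts."""
    kept = dropped = 0
    with open(input_csv, newline="") as source, \
            open(output_csv, "w", newline="") as target:
        writer = csv.writer(target)
        for row in csv.reader(source):
            if row and has_file_extension(row[path_column]):
                writer.writerow(row)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Returning the number of dropped rows makes it easy to confirm how much data the filter removed before continuing.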

Step 5: Since the number of images varies between species classes, we capped the sample size at 5000 images per class by random undersampling.

python random_sampling.py --input_csv ../paths/species_data_test.csv --output_csv ../paths/species_data_usample.csv --sample_size 5000
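Random undersampling with a per-class cap can be sketched as follows. The sketch assumes the data is available as `(path, label)` pairs; the function name and the fixed seed are illustrative choices, not taken from random_sampling.py.

```python
import random
from collections import defaultdict


def undersample(rows, sample_size=5000, seed=0):
    """Keep at most sample_size randomly chosen rows per class label."""
    by_class = defaultdict(list)
    for path, label in rows:
        by_class[label].append((path, label))

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = []
    for label in sorted(by_class):
        items = by_class[label]
        if len(items) > sample_size:
            # Draw without replacement; smaller classes are kept whole.
            items = rng.sample(items, sample_size)
        sampled.extend(items)
    return sampled
```

Classes below the cap are left untouched, so this reduces the imbalance between large and small classes without fully equalizing them.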

Step 6: Split the dataset into training, testing and validation sets. The validation set size is set equal to the testing set size.

python split_dataset.py --input_csv ../paths/species_data_usample.csv --output_dir ../paths/ --file_name species_data --test_per 0.15
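With `--test_per 0.15` and a validation set of equal size, the remaining 70% of the data is left for training. A minimal sketch of such a split is shown below. It uses a plain shuffled split, which is an assumption; split_dataset.py may instead stratify by class. The function name and seed are hypothetical.

```python
import random


def split_rows(rows, test_per=0.15, seed=0):
    """Shuffle rows and split them into (train, valid, test) lists.

    The validation set gets the same number of rows as the test set.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_per)
    test = shuffled[:n_test]
    valid = shuffled[n_test:2 * n_test]
    train = shuffled[2 * n_test:]
    return train, valid, test
```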

Step 7: Train CNN models from Keras Applications (https://keras.io/api/applications/).

python train_cnn.py --model model_name --train_csv ../paths/species_data_train.csv --valid_csv ../paths/species_data_valid.csv --batch_size 10 --num_classes 37 --epochs 100 --input_shape 224 224 3

Step 8: Calculate the accuracy of the CNN models.

python accuracy_cnn.py --model model_name --input_shape 224 224 3 --csv_paths ../paths/species_data_test.csv --weights ../trained_models/model_file.hdf5
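For reference, top-1 accuracy is the share of test images whose highest-scoring predicted class matches the true class. The sketch below computes it overall and per class from two label lists. It is an illustration only; how accuracy_cnn.py aggregates its figures is not stated here, and the function names are hypothetical.

```python
from collections import defaultdict


def top1_accuracy(true_labels, predicted_labels):
    """Fraction of samples where the predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_accuracy(true_labels, predicted_labels):
    """Top-1 accuracy computed separately for each true class."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for true, predicted in zip(true_labels, predicted_labels):
        totals[true] += 1
        if true == predicted:
            hits[true] += 1
    return {label: hits[label] / totals[label] for label in totals}
```

Looking at the per-class values alongside the overall figure helps reveal whether small classes are being misclassified even when the overall accuracy is high.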

Dr Bilal Habib's lab at the Wildlife Institute of India, Dehradun, India is a partner in the development, evaluation and use of the MegaDetector model. Development of SpSeg was supported by Microsoft AI for Earth (Grant ID: 00138001338).
