These scripts were implemented on the Caltech-101 dataset: http://www.vision.caltech.edu/Image_Datasets/Caltech101/
The Train Script extracts features from every object category in the dataset and trains a classifier on them. When an individual image containing any of those objects is passed through the Test Script, it predicts the object's label, which is then converted to speech.
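
A minimal sketch of what the training step could look like. Everything here is an assumption for illustration: the `101_ObjectCategories` folder layout (one sub-folder per category, as Caltech-101 ships), HOG features with a linear SVM, and the `model.joblib` output filename; the actual Train Script may use different features, a different model, or different paths.

```python
# Hypothetical training sketch: HOG features + linear SVM over a
# Caltech-101 style directory tree (one sub-folder per category).
import os
import joblib
import numpy as np
from skimage import io, color, transform
from skimage.feature import hog
from sklearn.svm import LinearSVC

DATASET_DIR = "101_ObjectCategories"  # assumed dataset path

def extract_features(path):
    """Load an image, normalise its size, and return a HOG descriptor."""
    img = io.imread(path)
    if img.ndim == 3:
        img = color.rgb2gray(img)
    img = transform.resize(img, (128, 128))
    return hog(img, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

features, labels = [], []
for category in sorted(os.listdir(DATASET_DIR)):
    folder = os.path.join(DATASET_DIR, category)
    if not os.path.isdir(folder):
        continue
    for name in os.listdir(folder):
        features.append(extract_features(os.path.join(folder, name)))
        labels.append(category)  # folder name doubles as the class label

clf = LinearSVC()
clf.fit(np.array(features), labels)
joblib.dump(clf, "model.joblib")  # reused later by the test step
```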
This works for any image we capture ourselves, not only for images taken from the dataset.
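
And a matching sketch of the testing step, under the same assumptions as the training sketch above: it loads the saved `model.joblib`, computes the same HOG descriptor for an arbitrary input image (the filename `my_photo.jpg` is a placeholder for any photo you captured), and speaks the predicted label with `pyttsx3`, one of several possible text-to-speech libraries; the original Test Script may use a different one.

```python
# Hypothetical test sketch: classify one image and speak its label.
import joblib
import pyttsx3
from skimage import io, color, transform
from skimage.feature import hog

clf = joblib.load("model.joblib")  # model saved by the training sketch

def extract_features(path):
    """Same HOG descriptor as in the training sketch above."""
    img = io.imread(path)
    if img.ndim == 3:
        img = color.rgb2gray(img)
    img = transform.resize(img, (128, 128))
    return hog(img, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

label = clf.predict([extract_features("my_photo.jpg")])[0]  # any image, e.g. one you captured
print("Predicted:", label)

engine = pyttsx3.init()   # default system text-to-speech voice
engine.say(label)         # queue the predicted label
engine.runAndWait()       # block until it has been spoken
```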