IDE TREE
In this mini project, I implemented a decision tree algorithm (ID3) and applied it to a molecular biology problem: recognizing promoter sequences in DNA.
A promoter is a region of DNA that facilitates the transcription of a particular gene. Promoter compilations and analyses have led to computer programs that predict the location of promoter sequences on the basis of homology either to the consensus sequence or to a reference list of promoters. Such programs are of practical significance in searching new sequences.
The data is provided by the UCI Machine Learning Repository and can be found there under Molecular Biology. It contains 106 instances with 57 features. The data is split randomly into training and validation sets for the ID3 decision tree learner. More information on this algorithm can be found in Chapter 3 of Mitchell's Machine Learning book.
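For readers who want a feel for how ID3 chooses which feature to split on, here is a minimal sketch of the entropy and information gain calculation described in Mitchell's Chapter 3. It is an illustration only, not the code in testPromoter.py: the function names (entropy, information_gain, best_attribute) and the four toy sequences are made up for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on the feature at index attr."""
    # Group the class labels by the value each row has at position attr.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Index of the feature with the highest information gain."""
    return max(range(len(rows[0])), key=lambda a: information_gain(rows, labels, a))


# Toy data (hypothetical): two-nucleotide "sequences" with +/- promoter labels.
rows = ["ac", "at", "gc", "gt"]
labels = ["+", "+", "-", "-"]

print(best_attribute(rows, labels))
```

In this toy data the first position separates the two classes perfectly and the second position tells us nothing, so ID3 would split on the first position. The full algorithm repeats this choice recursively on each resulting subset until the labels in a subset are all the same or no features remain.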
Also check out: http://www.cise.ufl.edu/~ddd/cap6635/Fall97/Shortpapers/2.htm
How to run: python testPromoter.py training.txt validation.txt
-Khabbab Saleem