Yiwen-Yang-666 / Survival-Analysis

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

This is a project related to survival analysis in high dimension feature space. In the literature, there are 3 types of methods (i.e. unsupervised, depending on the clinical data , and semi-supervised) to identify subtypes of survival data (the responses of data are not completed and the responses are multiple ). We investigated and evaluated some of the most important state-of-the-art approaches. In addition, we made improvements on the semi-supervised approaches and the methods based on clinical data. We also developed a model to test whether the methods based on clinical data were valid on our gene data.

Semi-supervised:

---Improved SPC (supervised principal components) and used the improved version to identify subtypes of the survival data.

---Applied supervised clustering and nearest shrunken centroids to identify subtypes of the survival data

Methods Depending on the clinical data:

---Adopted median cut and train nearest shrunken Centroids to identify subtypes of the survival data..

---Utilized the Kaplan Meier to create class labels, then developed a model that apply MRMR (minimum-redundancy maximum – relevance) feature selection and Bayesian Classification to identify subtypes of the survival data.

Methods Depending on Unsupervised:

---Implemented unsupervised learning (e.g. hierarchical clustering) to create class labels and trained nearest shrunken Centroids to identify subtypes of the survival data.

Impact:

---The improved version of SPL obtained the best performance among all algorithms.

---The semi-supervised algorithms overpowered the other two types of approaches.

---Concluded strengths and weaknesses of each type of method.

About


Languages

Language:R 81.8%Language:C++ 18.2%