leticiabragas2 / IBI5031BreastCancerML

IBI5031BreastCancerML - Classification of different breast cancer subtypes using supervised machine learning techniques

This work tests different feature selectors together with different machine learning models to find the best-performing combination.
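The idea of pairing every feature selector with every model can be sketched as a small grid search. What follows is a minimal illustration in plain Python, not the code from the notebook: the toy data, the two selectors (`top_variance`, `top_mean_gap`), the two classifiers (`nearest_centroid`, `one_nn`), and the leave-one-out scoring are all assumptions chosen to keep the example dependency-free.

```python
from itertools import product
from statistics import mean, pvariance

# Toy data, made up for illustration: rows are samples, columns are genes.
X = [
    [5.0, 1.0, 0.2, 3.0],
    [5.2, 0.9, 0.1, 3.1],
    [4.9, 1.1, 0.3, 2.9],
    [1.0, 1.0, 0.2, 7.0],
    [1.1, 0.9, 0.1, 7.2],
    [0.9, 1.1, 0.3, 6.8],
]
y = ["A", "A", "A", "B", "B", "B"]


def top_variance(X, y, k):
    """Hypothetical selector: indices of the k highest-variance columns."""
    cols = list(zip(*X))
    order = sorted(range(len(cols)), key=lambda j: pvariance(cols[j]), reverse=True)
    return sorted(order[:k])


def top_mean_gap(X, y, k):
    """Hypothetical selector: k columns with the largest gap between class means."""
    labels = sorted(set(y))

    def gap(j):
        means = [mean(r[j] for r, lab in zip(X, y) if lab == c) for c in labels]
        return max(means) - min(means)

    order = sorted(range(len(X[0])), key=gap, reverse=True)
    return sorted(order[:k])


def dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_centroid(X_train, y_train, x):
    """Hypothetical model: predict the class whose mean vector is closest to x."""
    centroids = {}
    for c in set(y_train):
        rows = [r for r, lab in zip(X_train, y_train) if lab == c]
        centroids[c] = [mean(col) for col in zip(*rows)]
    return min(centroids, key=lambda c: dist(centroids[c], x))


def one_nn(X_train, y_train, x):
    """Hypothetical model: predict the label of the single closest training row."""
    best = min(range(len(X_train)), key=lambda i: dist(X_train[i], x))
    return y_train[best]


def loo_accuracy(selector, model, X, y, k):
    """Leave-one-out accuracy; features are selected on the training rows only."""
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        keep = selector(X_tr, y_tr, k)
        X_tr_sel = [[r[j] for j in keep] for r in X_tr]
        x_sel = [X[i][j] for j in keep]
        hits += model(X_tr_sel, y_tr, x_sel) == y[i]
    return hits / len(X)


selectors = {"variance": top_variance, "mean_gap": top_mean_gap}
models = {"nearest_centroid": nearest_centroid, "1-NN": one_nn}

# Score every (selector, model) pair and keep the best one.
scores = {
    (s, m): loo_accuracy(selectors[s], models[m], X, y, k=2)
    for s, m in product(selectors, models)
}
best = max(scores, key=scores.get)
```

Selecting features inside each cross-validation fold, rather than once on the full data, keeps the held-out sample from influencing which genes are kept.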

This repository stores all files used in the analysis and a copy of the article. The work was done for the bioinformatics subject IBI5031 - Machine Learning for Bioinformatics, offered by the Interunit Bioinformatics program of Universidade de São Paulo (USP), SP, BR.

A short description of the files follows:

  • IBI5031Group6Notebook: the core file, showing all the code and the step-by-step analysis. It serves to ensure reproducibility of the results
  • Script_TCGAbiolinks: the script used to retrieve the data from The Cancer Genome Atlas Program (TCGA) used in our analysis
  • normalization_and_filtering: the script used to choose the most relevant quarter of the genes for analysis
  • DESeq2_Script: the script that performs differential expression analysis between cancer tissues and normal tissues, one of the feature selection methods we used
  • EnrichmentAnalysis: the script used to analyse common pathways among the genes selected by the best feature selection method
  • FinalArticle: the final version of the article submitted for grading
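
As an illustration of the filtering step described for normalization_and_filtering, the sketch below keeps the top quarter of genes from a small expression table. The README does not say how relevance is measured, so ranking by variance across samples is an assumption made here, and the function name `top_quarter_by_variance`, the gene names, and the values are all hypothetical.

```python
from statistics import pvariance


def top_quarter_by_variance(expr):
    """Return the names of the top 25% of genes ranked by variance.

    expr maps a gene name to its expression values across samples.
    Variance is an assumed relevance criterion, not necessarily the
    one used by the repository's script.
    """
    n_keep = max(1, len(expr) // 4)
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    return ranked[:n_keep]


# Made-up expression values for eight hypothetical genes across four samples.
expr = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0],
    "GENE_B": [0.0, 5.0, 10.0, 2.0],
    "GENE_C": [3.0, 3.0, 3.0, 3.0],
    "GENE_D": [2.0, 2.5, 1.5, 2.0],
    "GENE_E": [7.0, 7.2, 6.9, 7.1],
    "GENE_F": [1.0, 8.0, 2.0, 9.0],
    "GENE_G": [4.0, 4.0, 5.0, 5.0],
    "GENE_H": [0.5, 0.6, 0.4, 0.5],
}

kept = top_quarter_by_variance(expr)  # two of the eight genes are kept
```

Genes whose expression barely changes across samples carry little information for telling subtypes apart, which is the usual motivation for a filter of this kind.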

License: GNU General Public License v3.0

Languages: Jupyter Notebook 98.8%, R 1.2%