biolab / Discovery-Science-2023

Discovery Science 2023: Gene Interactions in Survival Data Analysis: A Data-driven Approach Using Restricted Mean Survival Time and Literature Mining

This repository contains all the scripts and supporting data we used to run the analysis and generate the figures. The TCGA datasets are too large to store here. To reproduce the results, please follow this guide to download the datasets and store them in the data folder.

Scripts for calculating interactions and permutations are provided; note that running them takes a considerable amount of time and compute resources.

Our computed results are available in computed_interactions. Each dataset has a separate .csv file containing the results for all interaction types (permutation test results are not included because of the size limit).
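One way to start exploring these files is sketched below. It is a minimal example that assumes only that computed_interactions holds one .csv file per dataset; the helper name `load_interactions` is hypothetical, and no column names are assumed because they are not listed here.

```python
import csv
from pathlib import Path


def load_interactions(folder="computed_interactions"):
    """Read every .csv file in `folder` into a list of row dictionaries.

    Returns a dict mapping the file name without its extension to that
    file's rows. All values are kept as strings, exactly as stored.
    """
    tables = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            tables[path.stem] = list(csv.DictReader(handle))
    return tables
```

Calling `load_interactions()` from the repository root returns one entry per file; inspecting the keys of the first row of any entry shows which columns that file actually provides.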

The notebooks folder contains the notebooks used to generate the figures and the different analyses presented in the paper. If you have trouble running the notebooks, feel free to contact us (via email or the repository issue tracker).

Finally, the implementation of the method for calculating interactions is located in the method folder.
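For readers unfamiliar with restricted mean survival time (RMST), which is named in the paper title, the following is a minimal sketch of the quantity itself: the area under a Kaplan-Meier survival curve from time 0 up to a horizon `tau`. It is not the code from the method folder; the function name `rmst` and its arguments are assumptions made for illustration.

```python
import numpy as np


def rmst(times, events, tau):
    """Area under the Kaplan-Meier curve on [0, tau].

    times  : follow-up time for each subject
    events : 1/True if the event was observed, 0/False if censored
    tau    : restriction horizon (same units as `times`)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)

    # Distinct observed event times up to the horizon, in increasing order.
    event_times = np.unique(times[events & (times <= tau)])

    survival = 1.0   # Kaplan-Meier estimate just before the next event time
    previous = 0.0   # left edge of the current flat segment
    area = 0.0
    for t in event_times:
        # The curve is flat between event times, so add a rectangle.
        area += survival * (t - previous)
        at_risk = np.sum(times >= t)
        deaths = np.sum((times == t) & events)
        survival *= 1.0 - deaths / at_risk
        previous = t

    # Last flat segment, from the final event time up to tau.
    area += survival * (tau - previous)
    return float(area)
```

With no observed events before `tau`, the curve stays at 1 and the function returns `tau`, which is a quick sanity check on any input.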

Literature summaries for all interaction types across all datasets

Dataset Additive (+) Competing (-) XOR (*)
METABRIC gpt3-summary gpt3-summary gpt3-summary
BLCA gpt3-summary gpt3-summary gpt3-summary
BRCA gpt3-summary gpt3-summary gpt3-summary
CESC gpt3-summary gpt3-summary gpt3-summary
COAD gpt3-summary gpt3-summary gpt3-summary
GBM gpt3-summary gpt3-summary gpt3-summary
HNSC gpt3-summary gpt3-summary gpt4-summary gpt3-summary
KIRC gpt3-summary gpt3-summary gpt4-summary gpt3-summary
KIRP gpt3-summary gpt3-summary gpt3-summary
LAML gpt3-summary gpt3-summary gpt3-summary
LGG gpt3-summary gpt3-summary gpt3-summary
LIHC gpt3-summary gpt3-summary gpt3-summary
LUAD gpt3-summary gpt3-summary gpt3-summary
LUSC gpt3-summary gpt3-summary gpt3-summary
OV gpt3-summary gpt3-summary gpt3-summary
PRAD gpt3-summary gpt3-summary gpt3-summary
READ gpt3-summary gpt3-summary gpt3-summary
SKCM gpt3-summary gpt3-summary gpt3-summary
STAD gpt3-summary gpt3-summary gpt3-summary
THCA gpt3-summary gpt3-summary gpt3-summary
UCEC gpt3-summary gpt3-summary gpt3-summary

We used the following prompt:

You are a helpful domain expert with a background in biology. You know the biology of each genes known in the literature. 

Cancer type: TCGA-{cancer_type}
Genes: {gene1} and {gene2}. 

BioGRID protein interaction network; shortest path between {gene1} and {gene2}: {paths} .

Context: {context}

Based on what you know about these two genes and provided context. Describe briefly what specifically these genes do. 
Can you reason about any possible functional associations between these two genes in specific biological terms? 
Use context and your knowledge about biology to answer the question. Be specific in the processes where these genes are involved.

Be concise. Answer in 2-3 short sentences. Start with possible functional associations. 

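The placeholders in the prompt above ({cancer_type}, {gene1}, {gene2}, {paths}, {context}) can be filled with ordinary Python string formatting. The sketch below uses an abridged copy of the template; the function name `build_prompt`, the gene names GENE_A and GENE_B, and the string form of `paths` are illustrative assumptions, not the repository's actual code or data.

```python
# Abridged copy of the prompt: only the lines that carry placeholders.
PROMPT_TEMPLATE = (
    "Cancer type: TCGA-{cancer_type}\n"
    "Genes: {gene1} and {gene2}.\n"
    "\n"
    "BioGRID protein interaction network; "
    "shortest path between {gene1} and {gene2}: {paths} .\n"
    "\n"
    "Context: {context}\n"
)


def build_prompt(cancer_type, gene1, gene2, paths, context):
    """Fill the template for one gene pair (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        cancer_type=cancer_type,
        gene1=gene1,
        gene2=gene2,
        paths=paths,
        context=context,
    )


example = build_prompt(
    cancer_type="BRCA",
    gene1="GENE_A",                      # made-up gene names
    gene2="GENE_B",
    paths="GENE_A - GENE_B",             # assumed plain-text path format
    context="(retrieved literature snippets go here)",
)
```

Named placeholders let `{gene1}` and `{gene2}` appear more than once in the template while being supplied only once per call.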

Permutation test results

[Figures: 11 plots of permutation test results; images not reproduced here]


