Natalia Rosa's repositories
bcftools
This is the official development repository for BCFtools. For installation instructions and other documentation, see http://samtools.github.io/bcftools/howtos/install.html
samtools
Tools (written in C using htslib) for manipulating next-generation sequencing data
bwa
Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
oncokb-annotator
Annotates variants in MAF with OncoKB annotation.
vcf2maf
Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms
Full-Stack-Flask-and-React
Full-Stack Flask and React, published by Packt
Causal-Inference-and-Discovery-in-Python
Causal Inference and Discovery in Python by Packt Publishing
graphiql-explorer
Explorer plugin for GraphiQL
CARNIVAL
CAusal Reasoning for Network Identification with integer VALue programming in R
drug.OnTarget
Identifying drug targets by integrating large-scale drug and genetic screens.
lollipops
Lollipop-style mutation diagrams for annotating genetic variations.
chain-event-graphs
R and Python scripts for my Summer 2021 undergraduate research project on Chain Event Graphs as part of the URSS scheme
peprmint-web
Web-tool for calculating and visualizing hydrophobic protrusions
sigflow
Sigflow: Streamline Analysis Workflows for Mutational Signatures
An-explainable-model-of-host-genetic-interactions-linked-to-Covid19-severity
This project maps the host genetic factors that determine COVID-19 severity using machine learning approaches (supervised and unsupervised machine learning methods, pathway signaling processes, and the Open Targets web-based platform). The study used the whole-exome sequencing dataset of 2000 patients of European descent collected by the GEN-COVID Multicenter Study group (https://clinicaltrials.gov/ct2/show/NCT04549831), coordinated by the University of Siena. The dataset contained 1.057M genetic variants. We used the original phenotype information of the 2000 patients to keep only those classified as severe or asymptomatic across all classification criteria (841 patients). We introduced an innovative variant screening strategy that applies stratified K-fold splits of the original dataset to randomly draw a unique 5-fold pool of variants, using the original phenotype information of the 841 unique patients.
esm
Evolutionary Scale Modeling (esm): Pretrained language models for proteins
toastr
Simple JavaScript toast notifications
precog
An ML-based predictor of GPCR/G-protein couplings using only sequence information
seedoo-core
Seedoo platform core
COVID-19
COVID-19 Italy - Situation monitoring
article-resources
A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
awesome-single-cell
List of software packages for single-cell data analysis, including RNA-seq, ATAC-seq, etc.
pdf_reports
Python library and CSS theme to generate PDF reports from HTML/Pug
ceg
Chain event graphs (CEGs) for R. This package implements the theory of CEGs and generates and plots CEG objects from manual input or from formatted data.
hmmer
HMMER: biological sequence analysis using profile HMMs