Leandro de Mattos Pereira's repositories
Annotation-genome-graphics
R scripts for genome annotation graphs
Microbiota-MetaAnalyser
Microbiota Coral 16S MetaAnalyser
Aminoacid-cost-of-protein
Genomic analysis
applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
awesome-AI-based-protein-design
A collection of research papers for AI-based protein design
BioAutoMATED
Automated machine learning for analyzing, interpreting, and designing biological sequences
deept2
DeepT2 uses deep learning to identify type II polyketide (T2PK) synthases KSβ and their corresponding T2PK products within bacterial genomes. The method uses ESM2 to transform KSβ sequences into embeddings, which are then used to train two separate multi-layer perceptron classifiers, one for KSβ classification and one for T2PK classification.
druggpt
DrugGPT: A GPT-based Strategy for Designing Potential Ligands Targeting Specific Proteins
evodiff
Generation of protein sequences and evolutionary alignments via discrete diffusion models
exbert
A Visual Analysis Tool to Explore Learned Representations in Transformer Models
FasterTransformer
Transformer-related optimization, including BERT and GPT
GeneGrouper
CLI tool for finding gene clusters in many genomes and placing them in discrete groups based on gene content similarity.
GPT_protein_design
Efficient protein de novo design pipeline with GPT-based generator and transfer learning-based discriminator
helm-gpt
HELM-GPT: de novo macrocyclic peptide design using generative pre-trained transformer
HGTector
HGTector2: Genome-wide prediction of horizontal gene transfer based on distribution of sequence homology patterns.
MJPythonNotebooks
Visualizing gene tree conflict using Phyparts and ETE3
NeuralPLexer
NeuralPLexer: State-specific protein-ligand complex structure prediction with a multi-scale deep generative model
openai-cookbook
Examples and guides for using the OpenAI API
papers_for_protein_design_using_DL
List of papers about Protein Design using Deep Learning
ProstT5
Bilingual Language Model for Protein Sequence and Structure
protein_scoring
Generating and scoring novel enzyme sequences with a variety of models and metrics
ProtTrans
ProtTrans provides state-of-the-art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformer models.
RFdiffusion
Code for running RFdiffusion
rodeo2
This isn't our first RODEO. The new and improved RODEO is written in Python and supports lasso peptide, class I lanthipeptide, sactipeptide and thiopeptide precursor prediction.
start-llms
A complete guide to starting and improving your LLM skills in 2023 with little background in the field, and to staying up to date with the latest news and state-of-the-art techniques!