biosemble is a Python natural language processing (NLP) program for assembling biological wordnets from structured and unstructured biological text. Structured text includes resources such as biologically relevant dictionaries and encyclopedias, while unstructured text includes biologically relevant textbooks.
biosemble can autonomously identify leukemia as a blood cancer, and CD38 as a cell-surface glycoprotein that is relevant to leukemia.
Not too bad!
biosemble uses part-of-speech (POS) tagging to assemble similar words across a wide array of biologically relevant dictionaries and encyclopedias.
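As a rough illustration of the idea, the sketch below tags the words in two definitions of the same headword and merges the nouns it finds. It is not biosemble's actual code: `TOY_TAGS`, `tag`, and `assemble` are hypothetical names, and the hand-written tag lexicon stands in for a real POS tagger.

```python
from collections import defaultdict

# Assumption: a tiny hand-written lexicon stands in for a real POS tagger.
TOY_TAGS = {
    "cancer": "NOUN", "blood": "NOUN", "bone": "NOUN", "marrow": "NOUN",
    "cells": "NOUN", "a": "DET", "the": "DET", "of": "ADP", "in": "ADP",
    "and": "CCONJ", "that": "PRON", "starts": "VERB",
    "malignant": "ADJ", "white": "ADJ",
}


def tag(text):
    """Lowercase, strip basic punctuation, and look up a tag for each token."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    return [(token, TOY_TAGS.get(token, "X")) for token in tokens]


def assemble(sources):
    """Merge the nouns from every source's definition under the shared headword."""
    related = defaultdict(set)
    for entries in sources:
        for headword, definition in entries.items():
            for token, pos in tag(definition):
                if pos == "NOUN" and token != headword:
                    related[headword].add(token)
    return {head: sorted(words) for head, words in related.items()}


# Two hypothetical sources defining the same headword.
dictionary = {"leukemia": "A cancer of the blood and bone marrow."}
encyclopedia = {"leukemia": "A malignant cancer that starts in white blood cells."}

wordnet = assemble([dictionary, encyclopedia])
print(wordnet)
# {'leukemia': ['blood', 'bone', 'cancer', 'cells', 'marrow']}
```

Keeping only nouns is one simple way to filter out function words; a real pipeline would likely use a trained tagger and a richer notion of similarity.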
biosemble uses Word2Vec, a family of neural-network-based models, to produce word embeddings. You can pass in custom arguments suited to your input data to get the most precise results.
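To show why such arguments matter, here is a minimal sketch of how the context window, a standard Word2Vec hyperparameter, changes the (center, context) pairs that a skip-gram model would train on. The function name `skipgram_pairs` and its `window` parameter are hypothetical and are not biosemble's documented interface; real Word2Vec implementations add further steps, such as subsampling frequent words, that are omitted here.

```python
def skipgram_pairs(tokens, window=2):
    """Return every (center, context) pair within `window` positions of each token."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


sentence = ["cd38", "is", "a", "glycoprotein"]

narrow = skipgram_pairs(sentence, window=1)
wide = skipgram_pairs(sentence, window=3)

print(len(narrow), len(wide))  # 6 12
print(("cd38", "glycoprotein") in narrow)  # False
print(("cd38", "glycoprotein") in wide)  # True
```

With a narrow window the model never sees "cd38" and "glycoprotein" together, so the best setting depends on how far apart related terms tend to sit in the input text.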
Coming soon!