dylanabramson33 / DAGFlow

My take on directed acyclic graph library

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

DAGFlow

What is DAGFlow?

DAGFlow is a tool to create and manage directed acyclic graphs for training and deploying machine learning models.

Why Did I create DAGFlow?

I noticed myself creating lots of duplicated functionality across many ML projects. With DAGFlow, I hope to create a framework that encourages modular processing/querying for easy reuse of components across many projects. I also wanted a low-code solution to allow deployment of ML pipelines for scientists without webdev experience.

How it works

The core object in DAGFlow is the "flow"'. A flow defines a sequence of transformations of data through "Nodes". Below is an example flow that takes an input dataframe with a SMILES field (a string representation of molecule) and adds a column containing the bond adjacency matrix of the molecule:

from Flows import createflow
from Node import nodify
from rdkit import Chem
import pandas as pd

@nodify(node_type='Source', fields={'SMILES' : 'SMILES'})
def ChemCSVReader(inp : str) -> pd.DataFrame:
    df = pd.read_csv(inp)
    return df

@nodify(requires=['SMILES'],adds=['MOLS'])
def ChemAddMol(inp : pd.DataFrame) -> pd.DataFrame:
    inp['MOLS'] = inp['SMILES'].apply(Chem.MolFromSmiles)
    return inp

@nodify(requires=['MOLS'],adds=['GRAPH'])
def GetMoleculeGraph(inp : pd.DataFrame) -> pd.DataFrame:
    inp['GRAPH'] = inp['MOLS'].apply(Chem.rdmolops.GetAdjacencyMatrix)
    return inp
    
@createflow
def test_flow():
   df = ChemCSVReader('./solubility_data.csv')
   df = ChemAddMol(df)
   df = GetMoleculeGraph(df1)
   return df


test_flow.compile()
output = test_flow.run()

GUI

No DAG library is complete without a slick GUI. DAGFlow includes DAGWeb, a flask app for creating DAG's with a drag and drop UI. Here is the same flow as above represented in the GUI.
Screen Shot 2022-01-08 at 2 19 58 PM

The GUI contains a basic type checker (right now type annotations must be included but this will be optional in future), ensuring only nodes of compatible types can be linked together.

About

My take on directed acyclic graph library


Languages

Language:JavaScript 99.9%Language:Python 0.1%Language:HTML 0.0%Language:CSS 0.0%