- Design, develop and implement data-driven products using the right mix of skills and technologies for Data Engineers, Data Scientists and Data Analysts.
- [Building a Data Science team]
- [ML Problem Framework]
- [Business Problem Framework]
- Responsible for AI-driven Transformative Innovation for the largest Healthcare and Pharmacy Service Provider.
- Conceptualized and initiated the development of products like Enterprise Medical Insights Discovery through Advanced NLP and Knowledge Graph.
- Spearheaded the extensive Due-diligence and Business-Impact analysis which eventually led to Multi-disciplinary initiatives for Prior-Authorization analysis, Claims analysis, Clinical Notes, Prescription and Provider content Analysis.
- Designed and initiated development of integrated Orchestrated ML Workflow and ML Ops Solutions from scratch in the domain of Care Management, Member Analytics, Pharmacy Business and Next-Best-Action (NBA) Campaigns for reducing medical cost and personalized recommendations.
- Leading multiple teams to deliver results through extreme collaboration with stakeholders, deep understanding of the business and market, devising new growth strategies.
- Conceptualized and implemented products in the domain of Marketing Decision Science to combine billions of signals from User Engagement data, User Intents, Product feed and Site page content using Recommendations, Clustering, Classification and advanced NLP techniques.
- Built systems to crunch petabytes of data to support $4B revenue
- Developed a Customer Centric, Intents driven Business model for optimizing Search marketing and Search Engine performance.
- Designed and developed Experimentation platform (SEO Split Tests)
- Unified Smart City data model and unified ML & data processing workflow on AWS for Traffic Analysis, Parking Prediction & Accident Prevention, built on AWS Kinesis, IoT, DynamoDB, EMR, Glue and SageMaker.
- Our design proposal was used by the City of San Jose and presented at Amazon’s largest conference, AWS re:Invent 2018:
- Solution Coverage:
IIoT: Award-winning Real-time Location Tracking (RTLS) product 'Where' for the startup Enlighted Inc
- Founding lead of the team. Conceptualized and developed the first version of the pipeline, real-time algorithms and rules engine for RTLS (Real-time Location Tracking) & Daylight Savings, which was the first of its kind in the industry, leveraging homegrown sensors.
- Developed a sub-millisecond C algorithm wrapper within a streaming pipeline for tracking the location of moving objects in a building.
- Delivered extremely fast inference on streaming data with parallel anomaly detection. Handled product installation and tuning for customers such as Kaiser (RTLS in hospitals), demonstrated the product to potential buyers, and played a crucial role in the successful acquisition through continuous tuning of the pipeline.
- [My work was presented at healthcare conferences](https://www.enlightedinc.com/blog/enlighted-will-be-at-himss19/) and covered in https://www.facebook.com/EnlightedInc/videos/811082362618397/
- As one of the founding members and Lead Data Scientist of Tesseract (a Cisco spin-off), built the Network Graph showing interactions between consumers, devices and applications along a timeline, with continuous stats, alerts and predictions for proactive network performance assurance and threat detection (Digital Network Architecture).
- Developed an ANTLR-based parser for free-form queries that traverse the domain graph and Titan-based temporal networks. Augmented the domain vocabulary using WordNet NLS.
- Time-series prediction, clustering, and detection of communities and anomalies at graph nodes.
- Customer Success Story and Demonstration https://www.cisco.com/c/dam/m/en_ph/connect/pdfs/introducing-dna-assurance-jason-pernell-final.pdf, https://www.cisco.com/c/en/us/solutions/enterprise-networks/network-architecture-customer-success-stories.html#~stickynav=1
- Reference to Network Insights Graph - Property Graph Model -- https://www.cisco.com/c/dam/m/da_dk/training-events/2017/network-intuitive/pdf/TNI-Tech-DNA-Assurance-Analytics-Soren.pdf
- More details on Proactive Network Troubleshooting
- https://www.cisco.com/c/en/us/td/docs/cloud-systems-management/network-automation-and-management/dna-center/1-2-5/user_guide/b_dnac_ug_1_2_5/b_dnac_ug_1_2_4_chapter_010100.html#id_73504
Cloud Analytics: Founding member of the products DCBA (Dell Cloud Business Applications) & Kitenga, incubated by the Dell Silicon Valley R&D Center
- Developed a first-of-its-kind Machine Learning Workflow Suite (Kitenga) with data processing operators for ETL (Pig/Hive), Clustering, Entity Recognition, Solr Search, Sentiment Analysis, etc.
- Designed and developed the patented 'Application Streaming Platform as a Service' and 'Analytics as a Service' Cloud Business Integration and Business KPI measurement application (3 patents)
- Demonstration of Machine Learning Work: https://www.youtube.com/watch?v=0POUx6NJqPw, https://www.youtube.com/watch?v=xZ7i-yZoX6o
- Dell Business Cloud Analytics - https://www.youtube.com/watch?v=K0ibPEXwwhI
- Workflow on Cloud - https://www.youtube.com/watch?v=UpYKN78Cf5g
http://mdc.jackbe.com/prestodocs/v3.2/mashup-studio/emml-editor.html
Development Tools: Conceptualized and developed a Graphical Model Transformation Framework, which was adopted as the Business Object Modeling Tool by TIBCO BPM
https://docs.tibco.com/pub/business-studio-bpm-edition/4.1.0/doc/html/GUID-C7186329-3F2F-4FA5-85A8-C356519618D6.html
http://media.tibco.com/flash/soa/tibco_soa_preso.html
Core contributor and owner of these tools which generated massive revenues. https://docs.tibco.com/pub/businessevents-express/5.2.1/doc/html/GUID-791378AB-73F6-417D-BFBE-AA64192D95AE.html https://docs.tibco.com/pub/businessevents-enterprise/5.5.0/doc/html/GUID-C6901441-911D-4257-891C-D8BEF9F1CD0E.html
- Product Idea - Pill Reminder
- Product Idea - Drugs Digital Ads
- Product Idea - Pain Management Drugs Search - Web Analytics and Phone App
-- Spotle.AI: Board Member, involved in a non-profit initiative to advise the company, teach Deep Learning and help students with ML projects
-- PrivaClave: Advisor, honorary CTO, mentoring the development of Graph DSL and Content Rule mining
-- AIForMankind
-- Lead a group of volunteers to implement projects for social good, such as Wildfire Detection
-- 2nd prize winner for 'Wild Fire detection' hackathon
-- Launched Covid19 Hackathons
-- Image Classification and Object Detection Techniques
-- NLP Techniques
-- Code, papers, blogs
-- Open Source Contributions
-- Project COVITA
-- Covid-19 Text Analysis for Social Good
-- Covid-19 Technical Research
-- Project Skylab
-- Early detection of Wildfire
-- Wildfire detection and mitigation using deep-learning
-- Project Spark-NLP
-- Algorithm Performance Analysis
-- Notes on Selecting ML Algorithms
-- Developed a Weka-based tool to compare ML algorithm performance
-- Contributed to Open Source ‘randomized optimization’ library
-- Algo Analysis Papers
-- Supervised Algo
-- Unsupervised Algo
-- RL Algo
-- Randomized Optimizations Algo
-- Regression Modeling in R (Logistic & Linear Regression)
-- Exploratory Analytics
-- Online Shopping Experience
-- Opportunities for innovation in eCommerce
-- Data Science of Selling Product
-- Computational Investments
-- Algorithmic Strategies
-- Predictive Trading
-- Code using Python, H2O ML lib
-- Medical Data Analysis
-- MedBots (Drugs Data Analysis and Search)
-- Drugs Data Mining Article
-- Drugs Data Mining Code
-- Lab Setup for HealthCare Data Analysis (Spark-ML, python, scala)
-- Mortality Prediction
-- Patients Disease Clustering
-- Analyze FHIR
-- Medical Entity extraction using Spark NLP
-- Knowledge-Based AI techniques
-- AI Agent teaching human
-- AutoGrader Agent
-- Machines Solving Psychometric Test
-- Robotics & IOT
-- Location Tracking, Kalman Filters, GBR in Robotics
-- Building a Smarter Globe
-- Object Tracking in Constrained Environment
-- Image Detection and Sentiment Analysis
-- Comprehensive guidelines for Image Analysis & Object Detection
-- Wildfire Object detection
-- MNIST data analysis (Linear Regression - R)
-- Image Detection and Sentiment Analysis (Deep Learning) - Python-Keras
-- Course Recommendation
-- Apriori Analysis for Course Recommendation
-- Apriori code example
-- NLP
-- Comprehensive research on NLP techniques
-- Flu outbreak prediction using Twitter texts
-- Twitter / FB text analysis for Entity Recognition (Gate API)
-- Product Review content analysis (Stanford NLP) and topic clustering
-- Product Category and Movie Genre Analysis (R - Document Term Matrix)
-- Network Vocabulary matching (tensorflow library for term analysis)
-- Drugs data extraction (Solr Rank) and boost rank of drugs attributes
-- Drugs Annotation using UIMA analysis library
-- Text message analysis (near term matching, dis-max parser)
-- Misc
-- Time-series Anomaly
-- Graph Analysis
-- Graph Query and Domain Vocabulary
-- TensorFlow Application
-- Product Recommendation, Review Data Analysis
-- Movies Gross Prediction (Linear Regression)
-- Latest Tools for Machine Learning Workflow Orchestration, Data Labelling, Model Deployment, Tuning, Tracking & Automation
-- Making models explainable, interpretable and Ethical
-- Building Machine Learning Pipelines (2016)
-- Machine Learning Platform & Infrastructure (2016)
-- ML-2015 ~ Data Science Use Cases
-- Evolution of Streaming Technology
-- Streaming ETL-Part 3 ~ Apache Beam, Spark Streaming, Kafka Streams, MapR Streams
-- Apache-Edgent-IOT-Analytics
-- What's a Streaming Data Management System?
-- Streaming ETL-Part 2 ~ Develop a fault-tolerant, scalable messaging and streaming solution (Kafka, Samza, SQL-Stream, DataTorrents)
-- Build a Micro-batch Streaming and Ingestion System
-- Streaming ETL-Part 1 ~ Real-time ETL on Hadoop
-- Data Modeling best practices-Part 2
-- Statistical Data Analysis using Time Series DB
-- Graph Modeling and Query
-- Fast Data Transfer using Parquet, Arrow
-- Gremlin Domain Specific Traversal
-- Data Modeling Reference-Part 1
-- Lightweight Persistence Service for Dynamic Entities
-- Using high-speed cache
-- Big Data Analytics - core concepts & key technologies
-- MS CS Course ~ Big Data Project for HealthCare Informatics
-- Hands-On Spark
-- [Analytics Part-3 ~ Ad-hoc Big Data Crunching using Couchbase, Spark SQL, Solr-Spark]
-- Continuous Automated Actionable Analytics
-- Advantages of ColumnStore
-- Optimize Query for old-fashioned Data-warehousing
-- Analytics Part-4 ~ Thinking beyond Hadoop: build Fault-tolerant High-throughput low-memory Real-time analytics
-- Analytics Part-5 ~ Ad-hoc Analytics on Big Data: Key Technologies
-- Analytics Part-2 ~ Approaches for querying current and historical data
-- Git repo ~ real-time analytics app using NodeJs-Mongo-Redis-Socket.io
-- Git repo ~ real-time rules execution using logstash
-- Git repo ~ real-time triggers using ES Percolator
-- Git repo ~ Spark Analytics
-- Git repo ~ crawlers (FB, Twitter, LinkedIn)
-- Data Analysis using R
-- Data Integration Tricks
-- Application Integration Platform as a Service
-- Building an Enterprise Search Application using SOLR and CASSANDRA
-- SOLR Ranking and Query Boosting
-- Build a Search App using MongoDB, GitHub repo
-- Step-by-step approach to build a multi-tenant SaaS analytics system (2012)
-- Multi-Tenancy in SaaS Modules
-- ETL for SaaS System
-- RnD Code ~ build a simple data ingestion framework
-- Big Data Analysis in AWS EMR
-- Programming with Cloudera Hadoop 4 (Ubuntu)
-- Git repo ~ Spring-Hadoop App
-- Coding Tricks to build robust systems
-- Java Reloaded
-- Designing better API
-- Taming Server-side threads
-- Best Practices for developing Scalable System
-- Learning from JDK Source and Language Specs
-- Tuning Titan Graph
-- Tuning Kafka
-- Tuning ElasticSearch
-- Tuning Hadoop Stack
-- Challenges in streaming large data set
-- Tuning HBase
-- Tuning JVM
-- Tuning MySQL
-- Create a Spark App
-- Build a Dynamic Workflow using MEAN/MERN
-- JSON Search App using MongoDB
-- Product Overview Dashboard
-- Medical Data Analysis using FHIR
-- Secure Access to MongoDB using bcrypt
-- Enforce Security checks on public API
-- Securing Application Access
-- Building MicroService
-- [Run Apps using Mesos & Docker]
-- MongoDB Fault-Tolerance Strategy
-- Git repo: Failover tests, Split-brain scenario
-- Setup LAMJ
-- ReactJs
-- Learning MEAN Stack
-- Bigdata Reference (2012)
-- Building a Domain-Specific Graphical Modeling Tool
-- Building a Startup Environment
-- Key Features of Ad Serving Technology
-- Coding with CloudFoundry
-- Riding Semantic Wave
-- [Blogs](http://wp.me/p2fcIs-3R)