nPath, a path and pattern analytic function, now runs natively in Teradata 16.20. In addition to looking for common paths to and from key pages, we can use nPath to build first-order Markov chains in a single pass over the data. A Markov chain gives the probability of moving from one event to the next; multiplying the transition probabilities along a given series of events tells us how likely that series is under the model. We can also build separate models for different segments of the data, then ask which segment a new series most resembles and which event we are most likely to see next.
This repo includes code and a small sample set for doing exploratory path analysis, building Markov chains from time series data, and applying these models to some common use cases.
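
To make the Markov chain idea concrete, here is a minimal Python sketch of the same calculation outside the database. It is an illustration only, not the nPath SQL in this repo: the sample rows and the function names (`build_chain`, `path_probability`, `most_likely_next`) are hypothetical, and it assumes input with the three columns described below (session identifier, event, order of events in the session).

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (session_id, event, event_order)
rows = [
    ("s1", "home", 1), ("s1", "search", 2), ("s1", "cart", 3),
    ("s2", "home", 1), ("s2", "search", 2), ("s2", "search", 3),
    ("s3", "home", 1), ("s3", "cart", 2),
]


def build_chain(rows):
    """Estimate first-order transition probabilities P(next | current)."""
    # Group events by session so each session can be ordered on its own.
    sessions = defaultdict(list)
    for session_id, event, order in rows:
        sessions[session_id].append((order, event))

    # Count each consecutive (current, next) pair within a session.
    counts = defaultdict(Counter)
    for events in sessions.values():
        path = [event for _, event in sorted(events)]
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1

    # Normalize the counts per current event so each row sums to 1.
    return {
        current: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for current, nexts in counts.items()
    }


def path_probability(chain, path):
    """Product of transition probabilities along a path (0.0 if a step was never seen)."""
    prob = 1.0
    for current, nxt in zip(path, path[1:]):
        prob *= chain.get(current, {}).get(nxt, 0.0)
    return prob


def most_likely_next(chain, event):
    """Most probable next event, or None if the event has no observed successor."""
    options = chain.get(event)
    return max(options, key=options.get) if options else None


chain = build_chain(rows)
print(chain["home"])
print(path_probability(chain, ["home", "search", "cart"]))
print(most_likely_next(chain, "home"))
```

To compare segments, one could build one chain per segment with `build_chain` and assign a new series to the segment whose chain gives it the highest `path_probability`. A transition that was never observed makes the whole product zero in this sketch, so real data may need smoothing.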
- nPath runs natively in Teradata 16.20; this database version is required to run this code.
- The code in this repo can be modified to run on any time series data that has at least the following 3 columns: session identifier, event, and order of events in the session. The included sample set lets the repo code run as-is.
- Michelle Tanco - Initial work - michelle.tanco@teradata.com
- Props to Russell Ratshin for the data set