patrickzib / leitmotifs

Discovering Leitmotifs in Multidimensional Time Series

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Discovering Leitmotifs in Multidimensional Time Series

This page was built in support of our paper "Discovering Leitmotifs in Multidimensional Time Series" by Patrick Schäfer and Ulf Leser.

A leitmotif is a recurring theme in literature, movies or music that carries symbolic significance for the piece it is contained in. When this piece can be represented as a multi-dimensional time series (MDTS), such as acoustic or visual observations, finding a leitmotif is equivalent to the pattern discovery problem, which is an unsupervised and complex problem in time series analytics. Compared to the univariate case, it carries additional complexity because patterns typically do not occur in all dimensions but only in a few - which are, however, unknown and must be detected by the method itself. In this paper, we present the novel, efficient and highly effective leitmotif discovery algorithm LAMA for MDTS. LAMA rests on two core principals: (a) a leitmotif manifests solely given a yet unknown number of sub-dimensions - neither too few, nor too many, and (b) the set of sub-dimensions are not independent from the best pattern found therein, necessitating both problems to be approached in a joint manner. In contrast to all previous methods, LAMA is the first to tackle both problems jointly - instead of first selecting dimensions (or leitmotifs) and then finding the best leitmotifs (or dimensions).

Supporting Material

  • tests: Please see the python tests for use cases
  • notebooks: Please see the Jupyter Notebooks for use cases
  • csvs: The results of the scalability experiments
  • leitmotifs: Code implementing multidimensonal leitmotif discovery using LAMA
  • datasets: Use cases in the paper

Leitmotif Use Case

A leitmotif (leading motif) is a recurring theme or motif that carries symbolic significance in various forms of art, particularly literature, movies, and music. The distinct feature of any leitmotif is that humans associate them to meaning, which enhances narrative cohesion and establishes emotional connections with the audience. The use of (leit)motifs thus eases perception, interpretation, and identification with the underlying narrative. A genre that often uses leitmotifs are soundtracks, for instance in the compositions of Hans Zimmer or Howard Shore. The above figure shows a suite from The Shire with 14 channels arranged by Howard Shore for Lord of the Rings. The suite opens and ends with the Hobbits' leitmotif, which is played by a solo tin whistle, and manifests in a distinct pattern in several, but not all channels of the piece.

Our LAMA (in purple) is the only method to correctly identify 4 occurrences within the leitmotif using a distinctive subset of channels. Other than EMD*, LAMA's occurrences show high pairwise similarity, too.

Installation

The easiest is to use pip to install leitmotif.

a) Install using pip

pip install leitmotif

You can also install the project from source.

b) Build from Source

First, download the repository.

git clone https://github.com/patrickzib/leitmotifs.git

Change into the directory and build the package from source.

pip install .

Usage

The parameters of LAMA are:

  • n_dims : Number of subdimensions to use
  • k_max : The largest expected number of repeats. LAMA will search from to for motif sets
  • motif_length_range

LAMA has a simple OO-API.

ml = LAMA(
    ds_name,     # Name of the dataset
    series,      # Multidimensional time series
    n_dims,      # Number of sub-dimensions to use
    n_jobs,      # number of parallel jobs
)

LAMA has a unique feature to automatically find suitable values for the motif length and set size so, that meaningful Leitmotifs of an input TS can be found without domain knowledge. The methods for determining values for and are based on an analysis of the extent function for different .

Learning the Leitmotif length

To learn the motif length, we may simply call:

ml.fit_motif_length(
    k_max,               # expected number of repeats
    motif_length_range,  # motif length range
    plot,                # Plot the results
    plot_elbows,         # Create an elbow plot 
    plot_motifsets,      # Plot the found motif sets
    plot_best_only       # Plot only the motif sets of the optimal length. Otherwise plot all local optima in lengths
)    

To do variable length motif discovery simply set plot_best_only=False

Learning the number of repeats

To do an elbow plot, and learn the number of repeats of the motif, we may simply call:

ml.fit_k_elbow(
    k_max,                # expected number of repeats
    motif_length,         # motif length to use
    plot_elbows,          # Plot the elbow plot
    plot_motifsets        # Plot the found motif sets
)

Use Cases

Data Sets: We collected and annotated 12 challenging real-life data sets to assess the quality and scalability of the LAMA algorithm.

Ground leitmotifs were manually inferred. GT refers to the number of leitmotif occurrences.
Use Case Category Length Dim. GT
Charleston Motion Capture 506 93 3
Boxing Motion Capture 4840 93 10
Swordplay Motion Capture 2251 93 6
Basketball Motion Capture 721 93 5
LOTR - The Shire Soundtrack 6487 20 4
SW - The Imperial March Soundtrack 8015 20 5
RS - Paint it black Pop Music 9744 20 10
Linkin Park - Numb Pop Music 8018 20 5
Linkin P. - What I've Done Pop Music 8932 20 6
Queen - Under Pressure Pop Music 9305 20 16
Vanilla Ice - Ice Ice Baby Pop Music 11693 20 20
Starling Wildlife Rec. 2839 20 4

Aggregated Results

Method Mean Precision Median Precision Mean Recall Median Recall
EMD* 0.608712 0.65 0.745455 0.8
K-Motifs (TOP-f) 0.651515 0.75 0.751515 1
K-Motifs (all dims) 0.831439 1 0.854545 1
LAMA 0.946591 1 0.990909 1
mSTAMP 0.545455 1 0.342424 0.2
mSTAMP+MDL 0.454545 0 0.251515 0

See all results in Results Notebook.

Notebooks

  • Jupyter-Notebooks for finding subdimensional Leitmotifs in a multidimensional time series Multivariate Use Case: highlights a use case used in the paper, and shows the unique ability to learn its parameters from the data and find interesting motif sets.

  • All other use cases presented in the paper can be found in the test folder

Citation

If you use this work, please cite as:

TODO

About

Discovering Leitmotifs in Multidimensional Time Series

License:GNU General Public License v3.0


Languages

Language:Jupyter Notebook 94.9%Language:Python 5.1%