xinychen / tracebase

Nonstationary temporal matrix factorization for sparse traffic time series forecasting.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

How to define temporal operators as sparse arrays in both `numpy` (for CPU) and `cupy` (for GPU)?

xinychen opened this issue · comments

It is possible to define temporal operators as sparse arrays, and doing this is extremely significant in the case that the temporal dimensionality is very large, e.g., T = 100,000.

  • We can first check out the original code for defining temporal operators.
import numpy as np

def generate_Psi(T, d, season):
    Psi = []
    for k in range(0, d + 1):
        if k == 0:
            Psi.append(np.append(np.zeros((T - d - season, d)), 
                                 np.append(-1 * np.eye(T - d - season), np.zeros((T - d - season, season)), axis = 1) 
                                 + np.append(np.zeros((T - d - season, season)), np.eye(T - d - season), axis = 1), axis = 1))
        else:
            Psi.append(np.append(np.append(np.zeros((T - d - season, d - k)), 
                                           np.append(-1 * np.eye(T - d - season), np.zeros((T - d - season, season)), axis = 1)
                                           + np.append(np.zeros((T - d - season, season)), np.eye(T - d - season), axis = 1), axis = 1), 
                                 np.zeros((T - d - season, k)), axis = 1))
    return Psi
  • For CPU implementation, we can use numpy and scipy.
import numpy as np
from scipy.sparse import coo_matrix

def generate_Psi(T, d, season):
    Psi = []
    for k in range(0, d + 1):
        if k == 0:
            Psi.append(coo_matrix((- np.ones(T - d - season), 
                                   (np.arange(0, T - d - season), np.arange(d, T - season))), 
                                   shape = (T - d - season, T)).tocsr() 
                       + coo_matrix((np.ones(T - d - season), 
                                    (np.arange(0, T - d - season), np.arange(d + season, T))), 
                                    shape = (T - d - season, T)).tocsr())
        else:
            Psi.append(coo_matrix((- np.ones(T - d - season), 
                                   (np.arange(0, T - d - season), np.arange(d - k, T - k - season))), 
                                   shape = (T - d - season, T)).tocsr() 
                       + coo_matrix((np.ones(T - d - season), 
                                     (np.arange(0, T - d - season), np.arange(d - k + season, T - k))), 
                                     shape = (T - d - season, T)).tocsr())
    return Psi
  • For GPU implementation, we can use cupy.
import cupy as np
from cupy.sparse import coo_matrix

def generate_Psi(T, d, season):
    Psi = []
    for k in range(0, d + 1):
        if k == 0:
            Psi.append(coo_matrix((- np.ones(T - d - season), 
                                   (np.arange(0, T - d - season), np.arange(d, T - season))), 
                                   shape = (T - d - season, T)).tocsr() 
                       + coo_matrix((np.ones(T - d - season), 
                                    (np.arange(0, T - d - season), np.arange(d + season, T))), 
                                    shape = (T - d - season, T)).tocsr())
        else:
            Psi.append(coo_matrix((- np.ones(T - d - season), 
                                   (np.arange(0, T - d - season), np.arange(d - k, T - k - season))), 
                                   shape = (T - d - season, T)).tocsr() 
                       + coo_matrix((np.ones(T - d - season), 
                                     (np.arange(0, T - d - season), np.arange(d - k + season, T - k))), 
                                     shape = (T - d - season, T)).tocsr())
    return Psi

Note that this is not an issue, but this is a really interesting question for improving your Python implementation.

In terms of TMF, the temporal operators can also be defined as sparse matrices. Please check out

import numpy as np
from scipy.sparse import coo_matrix

def generate_Psi(T, d):
    Psi = []
    for k in range(0, d + 1):
        if k == 0:
            Psi.append(coo_matrix((np.ones(T - d), 
                                  (np.arange(0, T - d), np.arange(d, T))), 
                                  shape = (T - d, T)).tocsr())
        else:
            Psi.append(coo_matrix((np.ones(T - d), 
                                  (np.arange(0, T - d), np.arange(d - k, T - k))), 
                                  shape = (T - d, T)).tocsr())
    return Psi