fumitoh / modelx

Use Python like a spreadsheet!

Home Page: https://modelx.io


Numba/jit support

alexeybaran opened this issue

It seems that modelx doesn't support Numba/jit (a minimal sketch reproducing the steps is below):

  1. I created a function using jit in a separate module.
  2. I imported the module into a modelx model using the new_module function.
  3. The function works, but it shows up as a regular function rather than as a CPUDispatcher object from numba.
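
Roughly, the steps look like this (a sketch; a file fin_jit.py with a jit-decorated irr is assumed to exist, as in the sample later in this thread):

import modelx as mx

m, s = mx.new_model(), mx.new_space()
m.new_module('fin_jit', 'fin_jit.py', 'fin_jit.py')  # step 2: import the module

@mx.defcells
def check_type():
    # step 3: inside a formula, fin_jit resolves to the imported module;
    # numba normally reports a jitted function as 'CPUDispatcher'
    return type(fin_jit.irr).__name__

print(check_type())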

I've always wanted to test modelx with Numba but haven't gotten around to it due to time constraints. Do you have a sample script that can reproduce the issue?

By the way, recent versions of pandas support the engine="numba" option in DataFrame.apply, which might also work in modelx formulas. You can find more information in this article: Unlocking C-level Performance in DataFrame.apply.
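
For illustration, a minimal sketch of that option (my assumptions: pandas 2.2+ with numba installed, and raw=True because the numba engine works on raw ndarrays):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100_000, 4))

# the first call pays the JIT compilation cost; repeated calls reuse it
result = df.apply(lambda row: row.max() - row.min(), axis=1,
                  raw=True, engine="numba")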

The sample code below actually works. Sorry for the confusion.

import modelx as mx
from time import time

m, s = mx.new_model(), mx.new_space()

# fin.py holds a plain irr; fin_jit.py holds the same irr decorated with jit
m.new_module('fin', 'fin.py', 'fin.py')
m.new_module('fin_jit', 'fin_jit.py', 'fin_jit.py')

@mx.defcells
def a():
    import numpy as np
    for t in range(10000):
        fin.irr(np.ones(1000), 500)
    return 0

@mx.defcells
def a_jit():
    import numpy as np
    for t in range(10000):
        fin_jit.irr(np.ones(1000), 500)  # call the jitted version, mirroring a()
    return 0

t0 = time()
a()
t1 = time()
a_jit()
t2 = time()
print('no jit:', t1 - t0, 'jit:', t2 - t1)

fin.zip

The tradeoff seems to be the time it takes to compile the jit function on demand. It takes around 1 second for the function above.
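
One way to separate the one-off compilation cost from the steady-state speed is to warm the function up before timing (a sketch, assuming fin_jit.py is importable directly rather than through modelx):

import numpy as np
from time import time
import fin_jit

fin_jit.irr(np.ones(1000), 500)  # first call triggers compilation (~1 s here)

t0 = time()
for t in range(10000):
    fin_jit.irr(np.ones(1000), 500)
print('steady-state:', time() - t0)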

Thanks. Still, a_jit is considerably faster than a: no jit: 4.631266117095947 jit: 0.7599091529846191

The relationship reverses if the function is called 1000 times instead of 10000:
no jit: 0.672553539276123 jit: 1.3341047763824463

Yep. In other words, the jit run spends virtually all of its time compiling.

Calls   No JIT (s)             JIT (s)
10000   4.631266117095947      0.7599091529846191
1000    0.4715766906738281     0.7797486782073975
100     0.046034812927246094   0.7822494506835938

It seems possible to avoid recompiling every time by using caching. I'm not sure whether caching will still work if I load several modelx models containing potentially different functions that share the same name.

import numpy as np
from numba import jit

@jit(cache=True)
def df_simple(rate, shape):
    # simple discount factors: cumulative product of 1 / (1 + rate)
    return (np.ones(shape) * (1.0 + rate) ** -1).cumprod()

@jit(cache=True)
def irr(cf, target, guess=0.01, tolerance=1e-10, max_iter=100):
    _shock = 0.0001
    n = cf.shape
    # (rest of the implementation omitted in the original)
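
As I understand numba's documentation, the on-disk cache created by cache=True lives in a __pycache__ directory next to the source file and is keyed to that file, so same-named functions defined in different files should get separate cache entries; whether that holds for modules loaded through new_module is worth verifying.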

A new module object is created every time new_module is called. In the case above, a separate fin_jit is created per model if you create multiple models, so the compilation time would grow with the number of models.
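
For illustration, a sketch of that behavior (whether the module is reachable as an attribute of the model is my assumption):

import modelx as mx

m1 = mx.new_model()
m1.new_module('fin_jit', 'fin_jit.py', 'fin_jit.py')

m2 = mx.new_model()
m2.new_module('fin_jit', 'fin_jit.py', 'fin_jit.py')

# each model holds its own copy of the module, so each copy's jitted
# functions would compile separately unless cache=True lets them share
print(m1.fin_jit is m2.fin_jit)  # expected: False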