Fast, n-dimensional linear interpolation and extrapolation on sparse grids.
Ndpolator is a combined interpolator/extrapolator that operates on sparse (incompletely populated) n-dimensional grids.
Ndpolator sources are hosted on pypi; you can install the latest release by issuing:
$> pip install ndpolator
To install ndpolator from github, clone the repo and install it from the local directory with pip:
$> git clone https://github.com/aprsa/ndpolator ndpolator
$> cd ndpolator
$> pip install .
Once installed, you can test the installation by running pytest:
$> cd tests
$> pytest
API reference is available on gh-pages.
To demonstrate the usage of ndpolator, let us consider a 3-dimensional space with three axes of vastly different vertex magnitudes. For comparison purposes, let the function that we want to interpolate and extrapolate be a linear scalar field:
$$f(x, y, z) = \frac{x}{1000} + y + 100 z$$
A suitable ndpolator instance would be initiated and operated as follows:
import numpy as np
import ndpolator
# initialize the axes:
ax1 = np.linspace(1000, 5000, 5)
ax2 = np.linspace(1, 5, 5)
ax3 = np.linspace(0.01, 0.05, 5)
# initialize interpolation space:
ndp = ndpolator.Ndpolator(basic_axes=(ax1, ax2, ax3))
# define a scalar function field and evaluate it across the grid:
def fv(pt):
    return pt[0]/1000 + pt[1] + 100*pt[2]
grid = np.empty((len(ax1), len(ax2), len(ax3), 1))
for i, x in enumerate(ax1):
    for j, y in enumerate(ax2):
        for k, z in enumerate(ax3):
            grid[i, j, k, 0] = fv((x, y, z))
# label the grid ('main') and register it with the ndpolator instance:
ndp.register(table='main', associated_axes=None, grid=grid)
# draw query points randomly within and beyond the definition ranges:
query_pts = np.ascontiguousarray(
    np.vstack((
        np.random.uniform(500, 5500, 1000),
        np.random.uniform(0.5, 5.5, 1000),
        np.random.uniform(0.005, 0.055, 1000))
    ).T
)
# interpolate and extrapolate linearly (positional arguments must precede keyword arguments):
interps = ndp.ndpolate('main', query_pts, extrapolation_method='linear')
Multi-variate interpolation is served in python by a dedicated scipy module (scipy.interpolate) that implements several multi-variate interpolation classes, including piecewise-linear, nearest neighbor, and radial basis function interpolators. Unfortunately, none of the implemented scipy methods lend themselves readily to extrapolation: at most they can fill the values off the convex hull with nans or a value supplied by the user. In addition, interpolators that operate on a regular grid cannot handle incomplete grids.
Ndpolator aims to fill this gap: it can both interpolate and extrapolate function values within and beyond the grid definition range, and it can operate on incomplete grids. As a side benefit, ndpolator can estimate both scalar and vector function values, and it can reduce grid dimensionality for points of interest that lie on grid axes. It is optimized for speed and portability (the backend is written in C), and it also features a python wrapper. Ndpolator was initially developed for the purposes of the eclipsing binary star modeling code PHOEBE (Prša et al. 2016), to allow the interpolation and extrapolation of specific intensities in stellar atmospheres. Yet given the gap in the multi-variate interpolation and extrapolation landscape, ndpolator development has been separated from PHOEBE and made available to the community as a standalone package.
Consider a scalar or a vector field
The first, most fundamental principle of ndpolator is that all interpolation and extrapolation is done on unit hypercubes. In real-world applications, it is seldom true that all axes are defined on a unit interval. This can lead to vertices of significantly different orders of magnitude along individual axes. To that end, ndpolator first normalizes the hypercubes by transforming them to unit hypercubes: given the sets of two consecutive axis values that span a hypercube,
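The normalization step can be illustrated with a minimal NumPy sketch. It is not ndpolator's C implementation; the function name and the `lo`/`hi` arguments (the two consecutive axis values that bracket the query point along each axis) are hypothetical and chosen for illustration only.

```python
import numpy as np

def normalize_to_unit_hypercube(query_pt, lo, hi):
    """Map a query point to unit-hypercube coordinates.

    lo, hi: per-axis lower and upper vertex values of the enclosing
    hypercube (hypothetical argument names, not ndpolator's API).
    """
    query_pt = np.asarray(query_pt, dtype=float)
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    # each coordinate becomes a fraction of the hypercube edge length
    return (query_pt - lo) / (hi - lo)

# axes of vastly different magnitudes end up on the same [0, 1] scale
u = normalize_to_unit_hypercube(
    (2500.0, 3.25, 0.012),
    lo=(2000.0, 3.0, 0.01),
    hi=(3000.0, 4.0, 0.02),
)
print(u)
```

After normalization, every axis contributes on the same scale, regardless of its original units or order of magnitude.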
The second operating principle of ndpolator is sequential dimensionality reduction. Consider a 3-dimensional hypercube in \autoref{fig:interpolation}; let us assume that function values in all 8 corners of the hypercube are sampled, i.e. we have 8 nodes. The point of interest is depicted with an open symbol in the left panel, along with projections onto the hypercube faces. Ndpolator starts with the last axis, in this case
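Sequential dimensionality reduction on a unit hypercube can be sketched as follows. This is an illustrative NumPy version written under the assumption that the corner values are stored in an array of shape (2, 2, ..., 2) and that the query point has already been normalized; the function name is hypothetical and the code is not taken from ndpolator's backend.

```python
import numpy as np

def interpolate_unit_hypercube(corner_values, u):
    """Linearly interpolate inside a unit hypercube, one axis at a time.

    corner_values: array of shape (2,)*n with the function values in the
                   hypercube corners (assumed layout).
    u:             normalized query point of length n.
    """
    values = np.asarray(corner_values, dtype=float)
    # start with the last axis and work towards the first one
    for coord in reversed(list(u)):
        # collapsing one axis halves the number of nodes: 2^k -> 2^(k-1)
        values = values[..., 0] * (1.0 - coord) + values[..., 1] * coord
    return float(values)

# corners of a unit cube sampled from the linear field f = x + 2y + 3z
corners = np.fromfunction(lambda i, j, k: i + 2 * j + 3 * k, (2, 2, 2))
value = interpolate_unit_hypercube(corners, (0.5, 0.25, 0.2))
print(value)
```

Because the sampled field is linear, the interpolated value agrees with the field evaluated directly at the query point.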
The third operating principle of ndpolator is initial dimensionality reduction. In real-life applications it frequently happens that some of the query point coordinates are aligned with the axes. For example, one of the axes might allow the variation of a second-order variable, but its value usually defaults to the value that is sampled across the grid. When this happens, the initial hypercube dimension can be reduced by 1 for each aligned axis. The extreme case where the query point coincides with a node means that hypercube dimensionality is reduced to 0, and there is no need for interpolation. For that reason, ndpolator flags each coordinate of the query point as "on-grid", "on-vertex", or "out-of-bounds". When "on-vertex," hypercube dimension can be immediately reduced. When that happens, the time dependence is reduced to
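The per-coordinate flagging might look like the sketch below. The function name and the absolute tolerance `tol` are assumptions made for illustration; ndpolator's own flagging logic and tolerance handling may differ.

```python
import numpy as np

def flag_coordinate(axis, value, tol=1e-6):
    """Classify one query coordinate against one axis.

    tol is a hypothetical absolute tolerance for matching a vertex.
    """
    axis = np.asarray(axis, dtype=float)
    if value < axis[0] - tol or value > axis[-1] + tol:
        return "out-of-bounds"
    if np.any(np.abs(axis - value) <= tol):
        return "on-vertex"
    return "on-grid"

axes = (
    np.linspace(1000, 5000, 5),
    np.linspace(1, 5, 5),
    np.linspace(0.01, 0.05, 5),
)
query_pt = (2500.0, 3.0, 0.06)
flags = [flag_coordinate(ax, v) for ax, v in zip(axes, query_pt)]
# every "on-vertex" coordinate removes one dimension from the hypercube
reduced_dim = len(axes) - flags.count("on-vertex")
print(flags, reduced_dim)
```

In this example the second coordinate sits exactly on a vertex, so only a 2-dimensional hypercube would remain for the other two axes.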
The fourth operating principle of ndpolator is dealing with incomplete hypercubes. If any of the hypercube corners are voids, we cannot interpolate. For that purpose, ndpolator keeps track of all fully defined
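One way to identify hypercubes that have no voids in their corners is sketched below. It assumes, for illustration only, that voids are marked with nan in a scalar-valued grid array; the function name is hypothetical and the bookkeeping in ndpolator itself may be organized differently.

```python
import itertools
import numpy as np

def fully_defined_hypercubes(grid):
    """Return a boolean mask of hypercubes whose 2^n corners are all defined.

    grid: n-dimensional array of function values, with nan marking voids
          (assumed convention). The mask has one entry per hypercube,
          i.e. shape (s1-1, s2-1, ..., sn-1).
    """
    defined = ~np.isnan(grid)
    mask = np.ones(tuple(s - 1 for s in grid.shape), dtype=bool)
    # visit every corner offset (0 or 1 along each axis)
    for offsets in itertools.product((0, 1), repeat=grid.ndim):
        corner = tuple(slice(o, s - 1 + o) for o, s in zip(offsets, grid.shape))
        mask &= defined[corner]
    return mask

grid2d = np.arange(9, dtype=float).reshape(3, 3)
grid2d[0, 0] = np.nan  # a single void in the corner of the grid
mask = fully_defined_hypercubes(grid2d)
print(mask)
```

Only the hypercube that touches the void is flagged as incomplete; the remaining three can be used for interpolation.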
The fifth operating principle of ndpolator is extrapolation. Ndpolator has three extrapolation methods: none, nearest and linear. When extrapolation method is set to none, the function value that is outside the range of axes is set to nan. For extrapolation method nearest, ndpolator stores a list of all nodes and assigns a function value in the node that is nearest to the query point. Lastly, if extrapolation method is set to linear, ndpolator linearly extrapolates from the nearest fully defined hypercube in a manner equivalent to dealing with incomplete hypercubes. The choice for extrapolation method depends on the multi-variate function that we are estimating; if it is highly non-linear, extrapolation should be avoided, so none and nearest might be appropriate; if it is largely linear or varies slowly, then a linear extrapolation method might be warranted. Ndpolator is a linear extrapolator, so it cannot adequately estimate non-linear multi-variate functions.
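The nearest method can be illustrated with a brute-force NumPy sketch. The function name, the nan convention for voids, and the choice to measure distance in axis-span-normalized coordinates are all assumptions made for this example; they are not confirmed details of ndpolator's implementation.

```python
import numpy as np

def nearest_node_value(axes, grid, query_pt):
    """Return the function value in the defined node nearest to query_pt.

    Distances are measured after dividing by the axis spans so that axes
    of different magnitudes are comparable (an assumption of this sketch).
    """
    spans = np.array([ax[-1] - ax[0] for ax in axes], dtype=float)
    # list all nodes, i.e. grid vertices that hold a value (non-nan)
    nodes = np.argwhere(~np.isnan(grid))
    coords = np.stack(
        [np.asarray(ax)[nodes[:, d]] for d, ax in enumerate(axes)], axis=1
    )
    dist = np.linalg.norm((coords - np.asarray(query_pt)) / spans, axis=1)
    return grid[tuple(nodes[np.argmin(dist)])]

ax1 = np.linspace(1000, 5000, 5)
ax2 = np.linspace(1, 5, 5)
grid = ax1[:, None] / 1000 + ax2[None, :]  # f = x/1000 + y on a 5x5 grid
nearest = nearest_node_value((ax1, ax2), grid, (5400.0, 2.2))
print(nearest)
```

The query point lies beyond the first axis, so the value is taken from the node at (5000, 2) rather than extrapolated.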
The question of grid completeness is quite impactful for performance; that is why the sixth operating principle of ndpolator is to distinguish between basic axes and associated axes. Axes that can have voids in their cartesian products are referred to as basic. For these axes, we need full ndpolator machinery to perform interpolation and extrapolation. On the other hand, a subset of axes may have all nodes in their cartesian products, i.e. they are guaranteed to be sampled in all vertices that basic axes are sampled in; these are referred to as associated axes. Given that their sampling is ascertained, interpolation and extrapolation can proceed without concerns for incomplete hypercubes -- that is, for as long as their basic hypercube counterparts (hypercubes spanned by basic axes) are complete. Each associated axis reduces the dimensionality of the hypercubes that need to be stored for extrapolation lookup, thus optimizing performance further.
The seventh and final operating principle concerns function value dimensionality. Most interpolators assume that the function value
While not explicitly a part of ndpolator's operating principles, ndpolator exposes two auxiliary functions, import_query_pts() and find_hypercubes(), that can be used to cache hypercubes. That way, a calling program can group query points that are enclosed by a single hypercube and perform bulk interpolation without the need to find the corresponding hypercube for each query point successively. While the indexing and the hypercube search are both binary, avoiding the lookup when possible further optimizes the runtime.
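The idea of grouping query points by their enclosing hypercube can be sketched in plain NumPy. The sketch does not call import_query_pts() or find_hypercubes(), whose signatures are not assumed here; the function name `group_by_hypercube` is hypothetical, and out-of-range points are simply clamped to the edge hypercube for illustration.

```python
from collections import defaultdict

import numpy as np

def group_by_hypercube(axes, query_pts):
    """Group query point rows by the index of their enclosing hypercube."""
    query_pts = np.asarray(query_pts, dtype=float)
    idx = np.empty(query_pts.shape, dtype=int)
    for d, ax in enumerate(axes):
        # binary search for the lower bracketing vertex along axis d
        lower = np.searchsorted(ax, query_pts[:, d], side="right") - 1
        # clamp so that out-of-range points map to the edge hypercube
        idx[:, d] = np.clip(lower, 0, len(ax) - 2)
    groups = defaultdict(list)
    for row, key in enumerate(map(tuple, idx.tolist())):
        groups[key].append(row)
    return dict(groups)

axes = (np.linspace(1000, 5000, 5), np.linspace(1, 5, 5))
pts = [(1500, 1.5), (1700, 1.2), (4500, 4.9), (5500, 0.5)]
groups = group_by_hypercube(axes, pts)
print(groups)
```

Each group can then be interpolated in bulk, with the hypercube lookup performed once per group rather than once per query point.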
Ndpolator is released under the GNU General Public License. The Application Programming Interface (API) is available for the underlying C library on gh-pages. The test suite and automated API building are incorporated into github's Continuous Integration (CI) infrastructure. Any and all feedback, particularly issue reporting and pull requests, is most welcome.
Financial support for this project by the National Science Foundation, grant #2306996, is gratefully acknowledged.