vtopt / DelaunaySparse

Interpolation via a Sparse Subset of the Delaunay Triangulation

Home Page: https://vtopt.github.io/DelaunaySparse

Extract original input points from DelaunaySparse output

gillette7 opened this issue

When I pass pts as an input to DelaunaySparse (via the Python bindings), it overwrites pts in place with the shifted and scaled version after the run. I need to look up information from the pre-scaled pts data set after running DelaunaySparse. Is there an easy way to do this, or do I have to make a deep copy of pts before passing it in? Thanks.

This is the expected behavior; we do not make any unnecessary copies of your data, since copying can introduce unwanted overhead. If you want to keep an unmodified version, you should make a copy yourself and pass the copy to DelaunaySparse.
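For example, a minimal sketch with NumPy (the actual call into the DelaunaySparse wrapper is only indicated by a comment, since it is not the point here):

import numpy as np

pts = np.random.random(size=(100, 3))  # 100 points in R^3
pts_work = pts.copy()                  # deep copy; this one gets rescaled in place
# ... pass pts_work to DelaunaySparse; pts keeps the original coordinates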

What you are observing is one of the fundamental differences between typical Python conventions (copy-friendly, value-like semantics) and Fortran (pass by reference) :) Either way, a copy would have to be made somewhere; the question is whether we should automatically make that copy for you behind the scenes and hide the details (even when you don't want a copy), or whether we should give you the option of doing it yourself. We chose the latter. NumPy arrays are effectively passed by reference as well, which is partly why NumPy tends to be faster than native Python.
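To see the reference semantics in action (a small self-contained example, unrelated to the DelaunaySparse API):

import numpy as np

def shift_in_place(a):
    # the function receives a reference to the caller's buffer, not a copy,
    # so this modification is visible to the caller afterwards
    a -= np.mean(a, axis=0)

x = np.arange(6, dtype=float).reshape(3, 2)
shift_in_place(x)
print(x)  # the caller's array has been shifted in place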

Makes sense. Would it be possible to (optionally) output the scale and shift factors? I think if I have data in R^n, this would be something like an n x n matrix and a length-n vector, which is probably much less memory than a copy of all the input data points. I'm by no means an expert on memory cost/management, and I don't need this urgently - just wondering whether it is possible down the road.

The delsparse code uses a RESCALE subroutine to normalize your data: it shifts all points by the center (the average of all points) and scales them by the maximum distance to that center, so that the farthest point lies at distance 1 from the origin (in the 2-norm).

You could easily compute these transformations yourself:

import numpy as np

# generate 100 random points in R^3 (row-major storage)
pts = np.random.random(size=(100, 3))
# compute the shift (center) and scale (radius) as done by delsparse
shift = np.mean(pts, axis=0)                          # c: barycenter of pts
scale = np.max(np.linalg.norm(pts - shift, axis=1))   # r: max distance to c

Or you could reverse-engineer the shift and scale factors by storing a pair of distinct points before calling the Delaunay code. I'd have to work out the math to see how to do that, but it would be a lot more computationally efficient than recomputing them from the full data set; a sketch is worked out below.
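Sketching that math (assuming the transform is exactly p' = (p - c) / r, as described below): for two distinct stored points p1, p2 with rescaled images q1, q2, we have q1 - q2 = (p1 - p2) / r, so r = ||p1 - p2|| / ||q1 - q2|| and c = p1 - r * q1. A hypothetical NumPy helper:

import numpy as np

def recover_shift_and_scale(p1, p2, q1, q2):
    # p1, p2: two distinct original points, saved before the call
    # q1, q2: the same two points after the in-place rescaling
    r = np.linalg.norm(p1 - p2) / np.linalg.norm(q1 - q2)  # scale factor
    c = p1 - r * q1                                        # shift vector
    return c, r

This only touches two points instead of the whole data set, hence the efficiency gain.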

It is actually just a single shift vector of length d (the point dimension) and a scalar scale factor r. It is not hard to return these values, but it would take me some time to adjust the interfaces and the Python wrapper... For now, you can compute them yourself before the call, as follows:

First compute the barycenter (center of gravity / average position) of pts; call this vector c \in R^d.

Next compute the maximum distance from each of your points to c: r = max_{p \in pts} ||p - c||_2.

The transformation that I applied was pts' = {(p - c) / r : p \in pts} (so the barycenter maps to zero and the point set has unit radius).

So, after calling DelaunaySparse, you could undo the transformation using:

pts = {p' * r + c : p' \in pts'}
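In NumPy, continuing the snippet from above (where shift is c and scale is r, computed before the call):

# DelaunaySparse has overwritten pts with (pts - shift) / scale;
# recover the original coordinates
pts = pts * scale + shift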

Perfect. Thanks to both of you for saving me the hassle of working this out on my own :)