immersive-web / depth-sensing

Specification: https://immersive-web.github.io/depth-sensing/ Explainer: https://github.com/immersive-web/depth-sensing/blob/main/explainer.md

Millimeters vs Meters

Maksims opened this issue · comments

What is the reasoning behind storing data as millimetres instead of meters?
WebXR APIs use 1 unit = 1 meter, as do most popular WebGL engines, yet this API uses 1 unit = 1 millimetre.

Raw data that we are obtaining from the underlying framework (ARCore) is in millimeters, but it is worth mentioning that XRDepthInformation.getDepth(x,y) returns data in meters. Requiring XRDepthInformation.data to contain data in meters would place a bit of a burden on the implementation to perform the conversion, which I wanted to avoid here.
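To make that burden concrete, here is a minimal sketch of the conversion an implementation would have to run on every frame if the buffer had to be in meters. It assumes the raw buffer holds one unsigned 16-bit millimeter value per pixel; the helper name convertMillimetersToMeters is hypothetical and not part of the API.

```typescript
// Hypothetical helper, not part of the depth-sensing API.
// Assumes one unsigned 16-bit millimeter value per pixel.
function convertMillimetersToMeters(raw: Uint16Array): Float32Array {
  const meters = new Float32Array(raw.length);
  for (let i = 0; i < raw.length; i++) {
    meters[i] = raw[i] * 0.001; // 1 mm = 0.001 m
  }
  return meters;
}

// Tiny made-up frame, for illustration only.
const rawFrame = new Uint16Array([0, 500, 1250, 8000]);
const metersFrame = convertMillimetersToMeters(rawFrame);
```

The loop touches every pixel and the output needs 4 bytes per sample instead of 2, which is the per-frame cost the comment above is trying to avoid.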

If it turns out that it is too restrictive to assume that data is in some specified unit (for example because other frameworks natively return floats w/ distance in meters instead of uints w/ distance in millimeters), we will likely need to expose the type of the underlying data and the multiplier that needs to be used to convert the value to meters.
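One way to picture "expose the type and the multiplier" is sketched below. Every name here (RawDepthBuffer, kind, values, metersPerUnit, depthInMetersAt) is hypothetical and chosen only for illustration; nothing in it is taken from the specification.

```typescript
// Hypothetical shape: none of these names come from the specification.
type RawDepthBuffer =
  | { kind: "uint16"; values: Uint16Array; metersPerUnit: number }
  | { kind: "float32"; values: Float32Array; metersPerUnit: number };

// One code path for the app, whatever the platform hands back.
function depthInMetersAt(buffer: RawDepthBuffer, index: number): number {
  if (index < 0 || index >= buffer.values.length) {
    throw new RangeError(`index ${index} is outside the depth buffer`);
  }
  return buffer.values[index] * buffer.metersPerUnit;
}

// A platform returning uints in millimeters.
const millimeterBuffer: RawDepthBuffer = {
  kind: "uint16",
  values: new Uint16Array([1500]),
  metersPerUnit: 0.001,
};

// A platform returning floats that are already in meters.
const meterBuffer: RawDepthBuffer = {
  kind: "float32",
  values: new Float32Array([1.5]),
  metersPerUnit: 1,
};
```

The app still has to know which typed array it is looking at, but the unit question is reduced to a single multiplication.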

I'm not sure uints for millimetres have ever been used in any (incubator / draft / release) WebXR standard.
And WebGL engines definitely do not use ints for spatial units.

It is worth mentioning that consistency even within one API is important: inconsistency can confuse developers, who will have to learn specifics instead of applying the intuitive assumption from their most common prior experience (1 unit = 1 meter).

I agree with the sentiment here (that is exactly why the getDepth() method returns distance in meters), but unfortunately I think it will be tricky to balance the risk of developer confusion against the performance implications of requiring user agents to convert the data if the underlying platform does not return the data type we need, in the units we need. We can try to mitigate the risk of confusing developers (detailed spec text, example code, libraries that abstract low-level concepts like this are all options), but an API that is impossible to implement in a performant way is more difficult to fix later, so we need to be extra careful here.

One more point here is that it looks like ARKit's data is returned in meters, & the docs seem to imply it'll have a float32 data type. Since both the underlying data type (unsigned integers vs floats) and the interpretation of values (millimeters vs meters) are different across those 2 platforms, it seems to me that the easiest way to surface this information to the apps for GPU consumption in an efficient way is to return it via a WebGLTexture (so the apps won't need to upload the data to GPU themselves). In addition to that, we will have to provide a scaling factor to convert from whatever units are returned into meters (e.g. s=0.001 for conversion from value in mm into m), and something that informs the apps about the shader they should use when accessing the data (e.g. in current impl. they need to access color and alpha components & unpack them into a single value, but different code may be necessary for different packing schemes). Note that this is just me spitballing some ideas, there may be a better approach or something that I'm missing here.
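A rough sketch of the unpack-then-scale step described above follows. The packing is an assumption made for illustration (a 16-bit raw value split into a low byte in the color channel and a high byte in the alpha channel); the actual packing scheme may differ. The GLSL function and all names in it are hypothetical, and the TypeScript function only emulates the same arithmetic on the CPU.

```typescript
// Assumed packing (illustrative only): a 16-bit raw value split across two
// 8-bit channels, low byte in color, high byte in alpha. A shader samples
// each channel as a normalized float in [0, 1].
const unpackDepthGlsl = `
  // Hypothetical helper; parameter names are not taken from any spec.
  float depthInMeters(sampler2D depthTexture, vec2 uv, float scaleToMeters) {
    vec2 channels = texture2D(depthTexture, uv).ra;
    return dot(channels, vec2(255.0, 256.0 * 255.0)) * scaleToMeters;
  }
`;

// CPU emulation of the same arithmetic.
function unpackDepthMeters(
  colorByte: number,
  alphaByte: number,
  scaleToMeters: number,
): number {
  const color = colorByte / 255; // normalized, as the shader would see it
  const alpha = alphaByte / 255;
  const rawValue = Math.round(color * 255 + alpha * 255 * 256);
  return rawValue * scaleToMeters;
}

// 1500 mm = 0x05DC: low byte 0xDC, high byte 0x05; s = 0.001 converts mm to m.
const unpackedMeters = unpackDepthMeters(0xdc, 0x05, 0.001);
```

A platform that delivers float32 meters would need a different shader and s = 1, which is why the app has to be told both the packing and the scale.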

Also, if the main use case of the depth data is occlusion, it may be better to use a more privacy-preserving API once it's available (it is currently in early stages of incubation).

Very reasonable thinking.

From an API user's point of view, the holy grail would be a WebGLTexture and a single code path for unpacking, with a single unit (no variation between platforms). It is worth considering that as more platforms become available, more variation will come, so users would have to handle more and more conversions, which would make for a bad API. That work should be shifted to the browser implementers.

So regardless of the path taken, the conversion will have to be implemented either by the API user or by the browser. Write once > use everywhere will probably require browsers to provide the texture in a specific format to the API user. That would be the best outcome.

/agenda to discuss the best way of surfacing the information about packing scheme to the apps (or, if possible, to come up with an API shape that will not need it)

Where unit ambiguity is an issue, it's often better to have a slightly more verbose API that resolves the ambiguity inline, e.g. getDepthMillis.

Closing, this should be addressed by the latest explainer - we expose a getDepthInMeters() method, as well as a rawValueToMeters attribute that should be used to convert the values into meters when accessing the depth buffer data directly.
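For direct buffer access, applying rawValueToMeters might look roughly like the sketch below. DepthBufferLike is a simplified stand-in written for this example, not the interface from the explainer; the width, height and data fields, the row-major layout and the Uint16Array element type are all assumptions.

```typescript
// Simplified stand-in for the depth information object. Only the
// rawValueToMeters name comes from the comment above; the rest is assumed.
interface DepthBufferLike {
  width: number;
  height: number;
  data: Uint16Array; // assumed: row-major raw values, one per pixel
  rawValueToMeters: number;
}

function depthInMetersAtPixel(
  depth: DepthBufferLike,
  column: number,
  row: number,
): number {
  if (column < 0 || column >= depth.width || row < 0 || row >= depth.height) {
    throw new RangeError(`pixel (${column}, ${row}) is outside the buffer`);
  }
  // Raw value times the provided factor gives meters.
  return depth.data[row * depth.width + column] * depth.rawValueToMeters;
}

// Made-up 2x2 buffer with raw values in millimeters.
const sampleDepth: DepthBufferLike = {
  width: 2,
  height: 2,
  data: new Uint16Array([1000, 2000, 3000, 4000]),
  rawValueToMeters: 0.001,
};
const secondRowFirstColumn = depthInMetersAtPixel(sampleDepth, 0, 1);
```

Apps that only need an occasional value can call getDepthInMeters() and skip the raw buffer entirely.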