apitrace / apitrace

Tools for tracing OpenGL, Direct3D, and other graphics APIs

Home Page: https://apitrace.github.io/

Bug recording instanced draw calls with user arrays

zmike opened this issue · comments

Both traces provided in https://gitlab.freedesktop.org/mesa/mesa/-/issues/9119 crash during vao data upload for a draw. It seems like somehow either the vertex data is being truncated (it's awfully suspicious that they all have the same data size?) or the attribute divisor is being lost.

I tried doing my own capture and it's the same. This definitely seems like an apitrace capture bug.

The problem is instancing.

4097470 glEnableVertexAttribArray(index = 0)
4097471 glVertexAttribDivisor(index = 0, divisor = 1)
4097472 glEnableVertexAttribArray(index = 1)
4097473 glVertexAttribDivisor(index = 1, divisor = 1)
4097474 glEnableVertexAttribArray(index = 2)
4097475 glVertexAttribDivisor(index = 2, divisor = 1)
4097476 glEnableVertexAttribArray(index = 3)
4097477 glVertexAttribDivisor(index = 3, divisor = 1)
4097478 glEnableVertexAttribArray(index = 4)
4097479 glVertexAttribDivisor(index = 4, divisor = 0)
4097480 glEnableVertexAttribArray(index = 7)
4097481 glVertexAttribDivisor(index = 7, divisor = 0)
4097482 glEnableVertexAttribArray(index = 8)
4097483 glVertexAttribDivisor(index = 8, divisor = 0)
4097484 glEnableVertexAttribArray(index = 9)
4097485 glVertexAttribDivisor(index = 9, divisor = 1)
4097486 glEnableVertexAttribArray(index = 10)
4097487 glVertexAttribDivisor(index = 10, divisor = 1)
4097488 glEnableVertexAttribArray(index = 11)
4097489 glVertexAttribDivisor(index = 11, divisor = 1)
4097490 glVertexAttribPointer(index = 0, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
4097491 glVertexAttribPointer(index = 1, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
4097492 glVertexAttribPointer(index = 2, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
4097493 glVertexAttribPointer(index = 3, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
4097494 glVertexAttribPointer(index = 4, size = 3, type = GL_FLOAT, normalized = GL_FALSE, stride = 136, pointer = blob(420)) // fake
4097495 glVertexAttribPointer(index = 7, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 16, pointer = blob(64)) // fake
4097496 glVertexAttribPointer(index = 8, size = 2, type = GL_FLOAT, normalized = GL_FALSE, stride = 8, pointer = blob(32)) // fake
4097497 glVertexAttribPointer(index = 9, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
4097498 glVertexAttribPointer(index = 10, size = 2, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(440)) // fake
4097499 glVertexAttribPointer(index = 11, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
4097500 glDrawElementsInstanced(mode = GL_TRIANGLES, count = 6, type = GL_UNSIGNED_INT, indices = blob(24), instancecount = 20)

The index buffer only has indices 0..3, so the tracer captures 4 elements per attribute. But the instance count is 20, which means that all attributes which have divisor == 1 will be indexed 20 times, not 4 times.
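The blob sizes in the dump can be checked by hand. The sketch below is only illustrative arithmetic, not apitrace code: `user_array_bytes` is a hypothetical helper, and it assumes tightly sized GL_FLOAT components (4 bytes each), as in the dump.

```python
GL_FLOAT_BYTES = 4  # sizeof(GLfloat)


def user_array_bytes(num_elements, size, stride, type_bytes=GL_FLOAT_BYTES):
    """Bytes that must be copied so that num_elements elements are readable.

    The last element only needs its own components, not a full stride.
    """
    if num_elements <= 0:
        return 0
    element_bytes = size * type_bytes
    # stride == 0 means "tightly packed" in OpenGL.
    effective_stride = stride if stride != 0 else element_bytes
    return (num_elements - 1) * effective_stride + element_bytes


# Attribute 0 from the dump: size = 4, stride = 144, divisor = 1.
captured = user_array_bytes(4, size=4, stride=144)   # sized from indices 0..3
required = user_array_bytes(20, size=4, stride=144)  # sized from instancecount = 20
print(captured, required)
```

The first value matches the blob(448) recorded in the trace; the second is what a 20-instance draw would actually read from that pointer.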

In short, one needs to modify the _trace_user_arrays() function generated by wrappers/gltrace.py so that it calls glGetVertexAttribiv(..., GL_VERTEX_ATTRIB_ARRAY_DIVISOR) and uses instancecount/divisor instead of count whenever divisor != 0.

Not hard, but messy, especially because calls to glGetVertexAttribiv(..., GL_VERTEX_ATTRIB_ARRAY_DIVISOR) need to be guarded by the appropriate GL version/extension checks.
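As a rough sketch of the rule described above (not the actual generated code): `elements_to_capture` is a hypothetical name, and rounding instancecount/divisor up is my assumption for divisors that do not divide the instance count evenly.

```python
def elements_to_capture(vertex_count, instancecount, divisor):
    """How many array elements a draw call reads from one attribute.

    divisor == 0: the attribute advances per vertex, so the vertex count
                  (max index + 1 for indexed draws) applies.
    divisor != 0: the attribute advances once every `divisor` instances,
                  so the instance count applies instead.
    """
    if divisor == 0:
        return vertex_count
    # Ceiling division: a partially consumed group still reads one element.
    return -(-instancecount // divisor)


# The draw from the dump: indices 0..3 (4 vertices), instancecount = 20.
print(elements_to_capture(4, 20, divisor=0))
print(elements_to_capture(4, 20, divisor=1))
```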

We should also have a test for this, along the lines of https://github.com/apitrace/apitrace-tests/blob/master/apps/gl/varray.cpp.

I don't think I'll have time to look into this in the near/medium term.

Thanks for the analysis. I'm looking into this now: when you say "use instancecount/divisor instead of count whenever divisor != 0", is this just in the // void APIENTRY glVertexAttribPointer(GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const GLvoid * pointer) section? Otherwise, I'm not sure exactly which vattrib should be queried for the divisor with the _glGetVertexAttribiv calls for the legacy attribs.

I drafted a fix in 1f857a2

It needs testing, and the issue described in the FIXME comment must be addressed, before it can be merged into master.

I haven't found a good test case. https://learnopengl.com/Advanced-OpenGL/Instancing / https://github.com/JoeyDeVries/LearnOpenGL/tree/master/src/4.advanced_opengl/10.1.instancing_quads describes the technique but uses VBOs, not user memory arrays.

BTW, _trace_user_arrays is overly complex. It's code generated by Python, but honestly, it would be easier to deal with it by avoiding the code generation and editing C code directly. That would be a nice cleanup.

This looks pretty similar to what I had whipped up locally. For my version check I had:

        print('    bool has_divisors = ((profile.es() && profile.major == 3) ||')
        print('                         (!profile.es() && (profile.major > 3 ||')
        print('                                            (profile.major == 3 && profile.minor >= 3))));')
        print()

which isn't an explicit extension check but seems close enough?

Sorry if I ended up duplicating your work. I thought it was easier and more expressive to quickly sketch a patch than to explain in English. (As I mentioned, the Python code generation makes this business way more complicated than it needs to be -- not in general, but in this particular instance.)

For the version check it's better to add an ARB_vertex_attrib_binding bit to glfeatures.{cpp,hpp}'s Feature class, set it to 1 when GL version >= 3.2 or GL_ARB_vertex_attrib_binding is present for desktop, and whatever is adequate for ES. Then use this flag from _trace_user_arrays.

This way the logic is concentrated in one place.

No, no, you're the expert, so it's nice to see that I was mostly on the right track.

Extension bit sounds pretty straightforward to me.

Actually, double checking the extension business, I'm not sure GL_ARB_vertex_attrib_binding is the right one...

I think it's actually GL_ARB_instanced_arrays? Or maybe a combo? Needs another look.

Yep, I made a copy and paste error somewhere. Everywhere I wrote ARB_vertex_attrib_binding it should have been ARB_instanced_arrays. This extension was made core in GL 3.3.

I don't know what the equivalent is in ES.
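A minimal sketch of what such a feature bit could compute, written as standalone Python rather than the real glfeatures code. `has_instanced_arrays` and its parameters are hypothetical. The desktop rule (core in GL 3.3, or GL_ARB_instanced_arrays) comes from the comment above; the ES branch is my assumption, based on ES 3.0 including instanced arrays in core and on the EXT/ANGLE/NV instanced-arrays extensions for ES 2.0.

```python
# Assumed ES 2.0 extensions that expose a vertex attribute divisor.
ES_INSTANCED_ARRAYS_EXTENSIONS = (
    "GL_EXT_instanced_arrays",
    "GL_ANGLE_instanced_arrays",
    "GL_NV_instanced_arrays",
)


def has_instanced_arrays(es, major, minor, extensions):
    """Return True when querying GL_VERTEX_ATTRIB_ARRAY_DIVISOR is safe."""
    version = (major, minor)
    if es:
        return version >= (3, 0) or any(
            ext in extensions for ext in ES_INSTANCED_ARRAYS_EXTENSIONS
        )
    return version >= (3, 3) or "GL_ARB_instanced_arrays" in extensions


print(has_instanced_arrays(False, 3, 2, set()))
print(has_instanced_arrays(False, 2, 1, {"GL_ARB_instanced_arrays"}))
```

Computing the flag once per context keeps the version and extension logic in one place, which is the point made earlier in the thread.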

BTW, the baseinstance should be added to instancecount whenever present.
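To see how baseinstance affects the extent that has to be captured, here is a small brute-force check. It assumes the indexing rule I understand the OpenGL specification to use for instanced attributes, element = floor(instance / divisor) + baseinstance; the function names are hypothetical. Under that rule the base instance is added after the division, which only differs from adding it to instancecount when divisor > 1.

```python
def max_instanced_element(instancecount, divisor, baseinstance):
    """Highest element index read, found by walking every instance."""
    return max(i // divisor + baseinstance for i in range(instancecount))


def instanced_elements(instancecount, divisor, baseinstance=0):
    """Closed form: number of elements that must be readable."""
    if instancecount <= 0:
        return 0
    return baseinstance + -(-instancecount // divisor)  # base + ceil(n / d)


# The closed form agrees with the brute-force walk for small inputs.
for n in range(1, 30):
    for d in range(1, 5):
        for base in range(0, 6):
            assert instanced_elements(n, d, base) == max_instanced_element(n, d, base) + 1
```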

I took a stab at it here https://github.com/zmike/apitrace/commits/userarrays

In testing, however, it doesn't seem to have any effect. I'm wondering why the extension check is useful, however, when these params are only available if the driver supports the required extensions (since they come from the extension functions)?

> I took a stab at it here https://github.com/zmike/apitrace/commits/userarrays

LGTM.

> In testing, however, it doesn't seem to have any effect.

Pity. A simple test case for this might make debugging easier.

Can you apitrace dump the relevant calls (like in #892 (comment))?

> I'm wondering why the extension check is useful, however, when these params are only available if the driver supports the required extensions (since they come from the extension functions)?

Right, and that is where the problem originates: if apitrace unconditionally calls glGetVertexAttribiv(..., GL_VERTEX_ATTRIB_ARRAY_DIVISOR) and the OpenGL implementation doesn't support the extension, the query itself raises a GL error. A subsequent glGetError() call from the application will then return something other than GL_NO_ERROR, which might confuse it. This is not hypothetical -- some apps definitely call glGetError() and bail out if it returns anything but GL_NO_ERROR. That's what I want to prevent: application-visible side effects that cause applications to change their behavior while tracing.
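The side effect can be illustrated with a toy model. `FakeContext` below is not a real GL binding, just a hypothetical stand-in that mimics the sticky error flag; the only real values are the enum constants. It assumes an implementation without instanced arrays rejects the unknown pname with GL_INVALID_ENUM.

```python
GL_NO_ERROR = 0
GL_INVALID_ENUM = 0x0500
GL_VERTEX_ATTRIB_ARRAY_DIVISOR = 0x88FE


class FakeContext:
    """Toy GL context with a sticky error flag (illustration only)."""

    def __init__(self, supports_divisor):
        self.supports_divisor = supports_divisor
        self.divisors = {}
        self._error = GL_NO_ERROR

    def get_vertex_attrib(self, index, pname):
        if pname == GL_VERTEX_ATTRIB_ARRAY_DIVISOR and not self.supports_divisor:
            # Unknown enum: record the error, exactly what the app will see later.
            if self._error == GL_NO_ERROR:
                self._error = GL_INVALID_ENUM
            return None
        return self.divisors.get(index, 0)

    def get_error(self):
        error, self._error = self._error, GL_NO_ERROR
        return error


def query_divisor(ctx, index, has_instanced_arrays):
    """Tracer-side query: skip the call entirely when it is not supported."""
    if not has_instanced_arrays:
        return 0  # no instancing possible, treat as per-vertex
    return ctx.get_vertex_attrib(index, GL_VERTEX_ATTRIB_ARRAY_DIVISOR)


unguarded = FakeContext(supports_divisor=False)
unguarded.get_vertex_attrib(0, GL_VERTEX_ATTRIB_ARRAY_DIVISOR)
leaked_error = unguarded.get_error()  # what the application would observe

guarded = FakeContext(supports_divisor=False)
divisor = query_divisor(guarded, 0, has_instanced_arrays=False)
clean_error = guarded.get_error()
print(hex(leaked_error), divisor, clean_error)
```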

Ah true, I momentarily forgot we were doing the GL_VERTEX_ATTRIB_ARRAY_DIVISOR part.

Here's the dump:

2076135 glEnableVertexAttribArray(index = 0)
2076136 glVertexAttribDivisor(index = 0, divisor = 1)
2076137 glEnableVertexAttribArray(index = 1)
2076138 glVertexAttribDivisor(index = 1, divisor = 1)
2076139 glEnableVertexAttribArray(index = 2)
2076140 glVertexAttribDivisor(index = 2, divisor = 1)
2076141 glEnableVertexAttribArray(index = 3)
2076142 glVertexAttribDivisor(index = 3, divisor = 1)
2076143 glEnableVertexAttribArray(index = 4)
2076144 glVertexAttribDivisor(index = 4, divisor = 0)
2076145 glEnableVertexAttribArray(index = 7)
2076146 glVertexAttribDivisor(index = 7, divisor = 0)
2076147 glEnableVertexAttribArray(index = 8)
2076148 glVertexAttribDivisor(index = 8, divisor = 0)
2076149 glEnableVertexAttribArray(index = 9)
2076150 glVertexAttribDivisor(index = 9, divisor = 1)
2076151 glEnableVertexAttribArray(index = 10)
2076152 glVertexAttribDivisor(index = 10, divisor = 1)
2076153 glEnableVertexAttribArray(index = 11)
2076154 glVertexAttribDivisor(index = 11, divisor = 1)
2076155 glVertexAttribPointer(index = 0, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
2076156 glVertexAttribPointer(index = 1, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
2076157 glVertexAttribPointer(index = 2, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
2076158 glVertexAttribPointer(index = 3, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
2076159 glVertexAttribPointer(index = 4, size = 3, type = GL_FLOAT, normalized = GL_FALSE, stride = 136, pointer = blob(420)) // fake
2076160 glVertexAttribPointer(index = 7, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 16, pointer = blob(64)) // fake
2076161 glVertexAttribPointer(index = 8, size = 2, type = GL_FLOAT, normalized = GL_FALSE, stride = 8, pointer = blob(32)) // fake
2076162 glVertexAttribPointer(index = 9, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
2076163 glVertexAttribPointer(index = 10, size = 2, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(440)) // fake
2076164 glVertexAttribPointer(index = 11, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
2076165 glDrawElementsInstanced(mode = GL_TRIANGLES, count = 6, type = GL_UNSIGNED_INT, indices = blob(24), instancecount = 12)

Thanks for the dump. It looks like the patch didn't do anything: all attributes are assuming count, none is using instancecount.
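One way to confirm this from a dump is to recompute the size each instanced attribute should have. The script below is a hedged sketch: it only understands the exact line shapes shown in this thread (GL_FLOAT attributes, glVertexAttribDivisor before the pointer calls, no base instance), and `undersized_attribs` is a hypothetical helper, not part of apitrace.

```python
import re

DIVISOR_RE = re.compile(r"glVertexAttribDivisor\(index = (\d+), divisor = (\d+)\)")
POINTER_RE = re.compile(
    r"glVertexAttribPointer\(index = (\d+), size = (\d+), type = GL_FLOAT, .*"
    r"stride = (\d+), pointer = blob\((\d+)\)\)"
)
DRAW_RE = re.compile(r"instancecount = (\d+)\)")


def undersized_attribs(dump_text):
    """Return (index, captured_bytes, needed_bytes) for too-small instanced blobs."""
    divisors, blobs, short = {}, {}, []
    for line in dump_text.splitlines():
        if m := DIVISOR_RE.search(line):
            divisors[int(m[1])] = int(m[2])
        elif m := POINTER_RE.search(line):
            index, size, stride, blob = map(int, m.groups())
            blobs[index] = (size, stride, blob)
        elif m := DRAW_RE.search(line):
            instancecount = int(m[1])
            for index, (size, stride, blob) in sorted(blobs.items()):
                divisor = divisors.get(index, 0)
                if divisor == 0:
                    continue  # per-vertex attribute, sized by count as before
                elements = -(-instancecount // divisor)
                element_bytes = size * 4  # GL_FLOAT components
                needed = (elements - 1) * (stride or element_bytes) + element_bytes
                if blob < needed:
                    short.append((index, blob, needed))
    return short


SAMPLE = """\
2076136 glVertexAttribDivisor(index = 0, divisor = 1)
2076144 glVertexAttribDivisor(index = 4, divisor = 0)
2076152 glVertexAttribDivisor(index = 10, divisor = 1)
2076155 glVertexAttribPointer(index = 0, size = 4, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(448)) // fake
2076159 glVertexAttribPointer(index = 4, size = 3, type = GL_FLOAT, normalized = GL_FALSE, stride = 136, pointer = blob(420)) // fake
2076163 glVertexAttribPointer(index = 10, size = 2, type = GL_FLOAT, normalized = GL_FALSE, stride = 144, pointer = blob(440)) // fake
2076165 glDrawElementsInstanced(mode = GL_TRIANGLES, count = 6, type = GL_UNSIGNED_INT, indices = blob(24), instancecount = 12)
"""

for index, captured, needed in undersized_attribs(SAMPLE):
    print(f"attrib {index}: captured {captured} bytes, needs {needed}")
```

Attributes with divisor == 0 (index 4 here) are skipped, since sizing them from the vertex count is already correct.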

Two possibilities:

  1. our patch is buggy, and glGetVertexAttribiv(..., GL_VERTEX_ATTRIB_ARRAY_DIVISOR) is never being called
  2. our patch is right, and Mesa's glGetVertexAttribiv(..., GL_VERTEX_ATTRIB_ARRAY_DIVISOR) is giving bogus results.

An os::log statement added right after glGetVertexAttribiv(GL_VERTEX_ATTRIB_ARRAY_DIVISOR) should tell which is happening.

Looking again at the dump, there are no fake glVertexAttribDivisor calls emitted, so it is as if there was no change at all!

You're sure you're using the patched glxtrace.so?

Pretty sure? I'll double check. There's a lot of prefixes on my test machine here.

Too many prefixes, apparently. It works great with this branch.

@jrfonseca I put up an MR in case you missed it.