kaveh808 / kons-9

Common Lisp 3D Graphics Project

Optimize OpenGL Drawing

kaveh808 opened this issue · comments

Use vertex arrays and the like to speed up the current naive drawing code in opengl.lisp.
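For context, a minimal sketch of the direction being proposed, assuming cl-opengl and an active GL context; `draw-points-with-vbo` and its flat `verts` list are hypothetical names for illustration:

```lisp
;; Sketch only: replaces gl:begin/gl:vertex immediate-mode loops with a
;; VBO upload and a single draw call. Assumes cl-opengl and a live context.
(defun draw-points-with-vbo (verts)
  "VERTS is a flat list of x y z coordinates."
  (let* ((n (length verts))
         (arr (gl:alloc-gl-array :float n))
         (vbo (first (gl:gen-buffers 1))))
    ;; Copy the CL data into a foreign gl-array.
    (loop for v in verts
          for i from 0
          do (setf (gl:glaref arr i) (float v 1.0)))
    (gl:bind-buffer :array-buffer vbo)
    (gl:buffer-data :array-buffer :static-draw arr)
    (gl:free-gl-array arr)
    ;; Describe the vertex layout once, then draw everything in one call.
    (gl:enable-vertex-attrib-array 0)
    (gl:vertex-attrib-pointer 0 3 :float nil 0 (cffi:null-pointer))
    (gl:draw-arrays :points 0 (/ n 3))
    (gl:delete-buffers (list vbo))))
```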

It's been over 5 years since the last release of OpenGL, so there probably isn't any reason to target anything but the latest version. But since the writing is on the wall, perhaps some thought should be put into how to abstract over both OpenGL and Vulkan, though I'm thinking that might need somebody familiar with Vulkan.
If not, then there's the choice of adopting an existing abstraction for modern GL, or writing yet another one.

Regarding the need to keep our eyes on the next graphics API, as @JMC-design was saying: piet-gpu is a good project to follow. They have a 2D/font focus, but they are pushing the envelope on doing as much of the compute for a UI as possible on the GPU:
https://github.com/linebender/piet-gpu
project vision:
https://github.com/linebender/piet-gpu/blob/main/doc/vision.md
Raph Levien also has some fantastic articles about doing graphics/compute on modern GPUs and GPU APIs:
https://github.com/linebender/piet-gpu/blob/main/doc/blogs.md

We might also want to set a bar for minimum GPU memory. I guess that's something that needs to be tracked; such a weird concept.
I ran across Piet when looking for ideas for a rich-text sort of API. I'm not sold on specifying ranges, though it is nice that it leaves the text unmodified. I'm still leaning towards something I can read from or write to a stream, so a list of objects and lists that change attributes.

Yet another OpenGL abstraction for Lisp:
https://github.com/jl2/simple-gl

I am very keen to maximize use of the GPU as well as SIMD and multiple cores. I really want our system to be able to handle production-level datasets with the same (or better) speed as commercial packages.

How we architect this (improved OpenGL interface, Vulkan, compute on GPU) is something we should discuss.

If we do have a Vulkan enthusiast, a first step could be to implement the equivalent of the code in opengl.lisp.

Also, one of my goals is to develop a cross-platform GUI toolkit. Currently we're building it on OpenGL, using the text engine by @awolven and font rasterizer by @JMC-design .

So I've just drawn my first triangle using vertex arrays, and here are some of my initial thoughts.
I'm assuming we'd like to fill buffers by just sending a list of points? What I've done for a test is fill up a CL array, grab the vector-sap, and use that to fill buffers. With points we have to pack them. Do we pack into a CL array, pin it and use it, or pack directly into a foreign array and then free it or keep it around?
Does any packing we do into CL arrays have any effect on packing into SIMD packs?
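One hedged option that sidesteps the pin-versus-foreign question is the static-vectors library: the data lives at a stable foreign address but is still an ordinary CL array for packing. A sketch, with `pack-points` as a hypothetical name:

```lisp
;; Sketch: pack a list of 3D points into a static-vector, which is
;; simultaneously a (simple-array single-float) and non-moving foreign memory.
(defun pack-points (points)
  "POINTS is a list of (x y z) lists. Returns the buffer and its pointer."
  (let ((buf (static-vectors:make-static-vector
              (* 3 (length points)) :element-type 'single-float)))
    (loop for (x y z) in points
          for i from 0 by 3
          do (setf (aref buf i)       (float x 1.0)
                   (aref buf (+ i 1)) (float y 1.0)
                   (aref buf (+ i 2)) (float z 1.0)))
    ;; The pointer can be handed to gl:buffer-data and friends directly,
    ;; with no pinning needed and no per-implementation SAP behavior.
    (values buf (static-vectors:static-vector-pointer buf))))
```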

Writing GLSL in a string in a Lisp buffer is a nightmare of formatting. In the long run it doesn't matter what a person uses to get a string for a shader program, but maybe there should be a default shader DSL, or a formatting convention to make code and examples easier to read?
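Short of a full shader DSL, even a tiny line-joining helper keeps GLSL readable in a Lisp buffer. A sketch (the `glsl` helper is hypothetical, not an existing library function):

```lisp
;; Sketch: build a shader source string from one line per argument,
;; so the GLSL stays indented and diffable inside Lisp code.
(defun glsl (&rest lines)
  "Join LINES with newlines into a single shader source string."
  (format nil "~{~a~%~}" lines))

(defvar *vertex-shader-source*
  (glsl "#version 330 core"
        "layout (location = 0) in vec3 pos;"
        "void main () {"
        "  gl_Position = vec4(pos, 1.0);"
        "}"))
```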

It seems like it might be nice to encapsulate these buffers into structs that can be passed around easily. But then you have to build a bunch of functions to use those structs, and years later you have CEPL... or something similar. I wonder if anybody has made a comparison of the different layers on top of GL?

I'm not even sure if SBCL system-area pointers work the same way on Windows or macOS. So maybe packing directly into foreign memory is required? And definitely so if there are any plans to support another implementation.
If anybody is interested, this is the code I used to test: https://plaster.tymoon.eu/view/3408#3408 (just replace the surface:update with whatever your window needs to swap buffers).

I tried, but it reads like c and I don't see any lispy abstraction. The only thing I see is direct writing of individual bytes to foreign memory.
I'm not bright enough to understand other languages.

These are good questions, and there are a lot of moving parts in how we encode geometry: ease of editing in CL, optimized OpenGL display, SIMD, threading.

One possibility I have been mulling over is whether we should keep a low-level C representation that can act like an old-school display list for our geometry classes. We would need to sync the CL point arrays with these C-type vectors after modeling operations, and they would be optimized for OpenGL and such.

Or we could have C-level structs for the internal geometry, which we access and modify from GL. That might make CL editing a bit slower, but could result in faster rendering.
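A hedged sketch of the first option: each geometry object keeps a foreign-side cache that modeling operations mark dirty, and that gets refilled lazily before display. Class and slot names here are hypothetical, not kons-9's actual classes:

```lisp
;; Sketch: CL-side points remain the editing representation; a gl-array
;; acts as the display-list-like cache, re-synced only when dirty.
(defclass cached-geometry ()
  ((points   :initarg :points :accessor points)   ; vector of #(x y z)
   (gl-cache :initform nil    :accessor gl-cache)
   (dirty-p  :initform t      :accessor dirty-p)))

(defun sync-gl-cache (geom)
  "Refill GEOM's foreign cache from its CL points if modeling touched them."
  (when (dirty-p geom)
    (let* ((pts (points geom))
           (n (* 3 (length pts))))
      (when (gl-cache geom)
        (gl:free-gl-array (gl-cache geom)))
      (setf (gl-cache geom) (gl:alloc-gl-array :float n))
      (loop for p across pts
            for i from 0 by 3
            do (dotimes (k 3)
                 (setf (gl:glaref (gl-cache geom) (+ i k))
                       (float (aref p k) 1.0))))
      (setf (dirty-p geom) nil)))
  (gl-cache geom))
```

Modeling operations would only `(setf (dirty-p geom) t)`, so editing speed is unaffected and the sync cost is paid once per change rather than once per frame.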

I am very keen to maximize use of the GPU as well as SIMD and multiple cores. I really want our system to be able to handle production-level datasets with the same (or better) speed as commercial packages.

Does it include distributed computing as a goal? :-)

Down the road, why not? :)

Down the road, why not? :)

Because there would be a ~30MB SBCL runtime per node? I really wish there were something like MirageOS (which uses OCaml) for Common Lisp or Scheme.

good eats

I'm a bit full from their 130-page slide deck on optimization. It looks like OpenGL 4.2+ only, which caused a stomach rumble. Sometimes I wonder, "Why can't we just implement OpenGL in pure Common Lisp and be done with it?"

I think the approach is still interesting. Today I'm going to try to test whether it makes any difference packing arrays from different types of points into CL arrays that are pinned and sent, as well as into foreign arrays that are sent.
In my head it doesn't seem like there'd be much difference.
Besides, un/packing structured bits to be sent is on my todo list; I'm calling it "pipeline", for use with a new CLX and Wayland.
The thing with 4.2 is that 4.1 might have the same features, just as extensions. Whether it's like that on Mac I don't know. That, or maybe MGL isn't hard to install/use? I have no Mac to test that.

I think the approach is still interesting.

I agree, especially given the potential performance improvement. (I don't like vinegar on my salad, but wouldn't suggest other people shouldn't enjoy it, if you can tolerate one more food joke.) Thank you for posting the link and doing the testing.

I don't have a (capable enough) Mac to try it out on either, but if you do have success I wonder if it would help for you to post a simplified gist somewhere so someone who does could try it out.

Trying to come up with a good test for display as well. But so far, with just 333,333 points there's no time difference in packing CL arrays from either origin vectors or 3d-vectors structs. Packing from vectors uses slightly less CPU, but I probably need more points, since this is all taking ~0.004 seconds (0.020 using generic functions).
Submitting CL arrays to GL by pinning them and passing the pointer is, well, just passing a pointer. I guess I should probably throw in some static-vectors stuff.

So here's some basic testing. If you make smaller arrays, then origin's lead widens; whether it's worth the trade-off of not being able to dispatch on...
But the surprising thing is the foreign array being slower. If we can depend on just using SBCL to send pointers, then I'm not sure what the benefit is.

https://plaster.tymoon.eu/view/3413#3413
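For anyone reproducing the comparison, the SBCL-only pinning path under discussion looks roughly like this (a sketch; `upload-pinned` is a hypothetical name, and it assumes a single-float vector plus an already-bound GL buffer):

```lisp
;; Sketch: SBCL-specific. Pin the CL array so the GC cannot move it,
;; take its system-area pointer, and hand that straight to glBufferData.
(defun upload-pinned (cl-array byte-count)
  "CL-ARRAY is a (simple-array single-float); a VBO must be bound."
  (sb-sys:with-pinned-objects (cl-array)
    (%gl:buffer-data :array-buffer byte-count
                     (sb-sys:vector-sap cl-array)
                     :static-draw)))
```

Note that the pointer is only valid inside the `with-pinned-objects` body, which is exactly why this approach ties us to SBCL; a foreign or static-vector buffer has no such scoping restriction.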

Nice work. Is the cost of sending sbcl pointers and ffi arrays to OpenGL (and GPUs) the same?

On a slight tangent, should we bite the bullet and go with double-float as our default? Or is the performance hit a serious one?

I can't see why it would be different, as they're both just pointers to memory, unless being in SBCL's memory space somehow affects it. That's why I think an actual drawing test might elucidate further, at least in terms of packing/repacking something over and over.

I don't know if I've been reading outdated stuff, but what I've seen is that lots of OpenGL drivers will just convert doubles to single-float as their internal format. Support for doubles in GP compute is relatively new, requires versions above 4.1, and in some cases a newer card. I've seen figures of half to a third of the performance of singles.
For anything like CAD I'd think a fixed-point format would probably be better.

Perhaps someone who doesn't necessarily have vulkan experience could volunteer.

I volunteer to make an attempt this month. What do I need to know to start off in the right direction? (Either in absolute terms or based on the tiny start I made in #109 a ways back.)

I'm interested in trying to write this. I will try to build on what @JMC-design has proposed and the text-rendering engine @awolven has written.

It would probably make sense to reuse parts of the code of the text-rendering engine. In order to do so I would have a lot of questions, since there are a lot of things whose purpose I don't understand. It seems like a pretty advanced implementation to me, one which takes a lot of nitty-gritty details of OpenGL into consideration; am I right?

Anyway, I'll start by proposing something, and hopefully we can improve on it incrementally with your feedback.

unless you live in a cold cabin and need your PC to double as a toaster oven.

I used to render movies on my dual-G4 Mac only in the winter in Colorado, because it drew nearly 1500W, like a hair dryer (which would have been quieter).

in the long run one will want to support retained mode paradigms

Retained mode caching in OpenGL-based scene graphs usually used "display lists". What method exists to do that now?
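In modern GL, the closest analogue to a display list is a vertex array object plus a static VBO: record the buffer and layout state once, then replay it with a bind and a draw call. A sketch, assuming cl-opengl and vertex data already packed into a gl-array; the function names are hypothetical:

```lisp
;; Sketch: build once (the "display list"), draw many times.
(defun make-retained-mesh (gl-array vertex-count)
  "Record buffer and layout state into a VAO; returns (vao count)."
  (let ((vao (gl:gen-vertex-array))
        (vbo (first (gl:gen-buffers 1))))
    (gl:bind-vertex-array vao)
    (gl:bind-buffer :array-buffer vbo)
    (gl:buffer-data :array-buffer :static-draw gl-array)
    (gl:enable-vertex-attrib-array 0)
    (gl:vertex-attrib-pointer 0 3 :float nil 0 (cffi:null-pointer))
    (gl:bind-vertex-array 0)
    (list vao vertex-count)))

(defun draw-retained-mesh (mesh)
  "Replay the recorded state: one bind, one draw call."
  (destructuring-bind (vao count) mesh
    (gl:bind-vertex-array vao)
    (gl:draw-arrays :triangles 0 count)
    (gl:bind-vertex-array 0)))
```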

I see. I could also join the effort of porting kons-9 to krma then, if this makes more sense. I'm mostly interested in having a rendering engine I can understand and modify on the fly. If krma can fulfill this role, I'm in.

About the modularity of krma: how would you do things like offscreen rendering and multiple passes? How would you create and load custom pipelines? Having some simple examples would be nice.

Kaveh rejected the vulkan branch and continued to make changes to the main branch until the vulkan branch bit rotted.

I have the feeling anything I say here is going to get me in trouble with someone. Adieu.

Could krma evolve to become something like CEPL for Vulkan? Because in the end that's what I am looking for: a CL interface to a graphics API. Not just the bindings, of course, but an interface that makes programming OpenGL or Vulkan in CL more natural.

CEPL

+1

Adieu to this topic.