fendevel / Guide-to-Modern-OpenGL-Functions

A guide to using modern OpenGL functions.

A Guide to Modern OpenGL Functions

What this is:

  • A guide on how to apply modern OpenGL functionality.

What this is not:

  • A guide on modern OpenGL rendering techniques.

When I say modern I'm talking DSA modern, not VAO modern, because that's old modern or "middle" GL (however I will be covering some of it). DSA has been core since OpenGL 4.5 and is exposed on older contexts through the ARB_direct_state_access extension, so you can check whether you support it yourself with something like GLEW's glewIsSupported("GL_ARB_direct_state_access") or by checking your API version.
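As a rough sketch of that check: DSA is part of core OpenGL from 4.5 onwards and is otherwise available through ARB_direct_state_access. The helper below is hypothetical (supports_dsa is not a GL or GLEW function) and assumes you have already queried the context version and the extension yourself.

```cpp
// Sketch: decide whether the DSA entry points can be used on the current context.
// Assumed inputs: major/minor from glGetIntegerv(GL_MAJOR_VERSION / GL_MINOR_VERSION),
// has_arb_dsa from an extension query such as glewIsSupported("GL_ARB_direct_state_access").
constexpr bool supports_dsa(int major, int minor, bool has_arb_dsa)
{
	// Core from OpenGL 4.5 onwards; older contexts need the extension.
	return (major > 4) || (major == 4 && minor >= 5) || has_arb_dsa;
}
```

Keeping the decision separate from the queries makes it easy to call once at startup and fall back to the bind-to-edit path when it returns false.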

DSA (Direct State Access)

With DSA we, in theory, can keep our bind count outside of drawing operations at zero. Great right? Sure, but if you were to research how to use all the new DSA functions you'd have a hard time finding anywhere where it's all explained, which is what this guide is all about.

DSA Naming Convention

The wiki page does a fine job comparing the DSA naming convention to the traditional one so I stole their table:

OpenGL Object Type           Context Object Name    DSA Object Name
Texture Object               Tex                    Texture
Framebuffer Object           Framebuffer            NamedFramebuffer
Buffer Object                Buffer                 NamedBuffer
Transform Feedback Object    TransformFeedback      TransformFeedback
Vertex Array Object          N/A                    VertexArray
Sampler Object               N/A                    Sampler
Query Object                 N/A                    Query
Program Object               N/A                    Program

glTexture


  • The texture related calls aren't hard to figure out so let's jump right in.

glCreateTextures

void glCreateTextures(GLenum target, GLsizei n, GLuint *textures);

Unlike glGenTextures, glCreateTextures will create the handle and initialize the object, which is why the GLenum target parameter is listed: the internal initialization depends on knowing the type.

So this:

glGenTextures(1, &name);
glBindTexture(GL_TEXTURE_2D, name);

DSA-ified becomes:

glCreateTextures(GL_TEXTURE_2D, 1, &name);

glTextureParameter

void glTextureParameteri(GLuint texture, GLenum pname, GLint param);

There isn't much to say about this family of functions; they're used exactly the same but take in the texture name rather than the texture target.

glTextureParameteri(name, GL_TEXTURE_MIN_FILTER, GL_NEAREST);

glTextureStorage

The glTextureStorage and glTextureSubImage families work the exact same way.

Time for the big comparison:

glGenTextures(1, &name);
glBindTexture(GL_TEXTURE_2D, name);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, width, height);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

Becomes:

glCreateTextures(GL_TEXTURE_2D, 1, &name);

glTextureParameteri(name, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
glTextureParameteri(name, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
glTextureParameteri(name, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTextureParameteri(name, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glTextureStorage2D(name, 1, GL_RGBA8, width, height);
glTextureSubImage2D(name, 0, 0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

glBindTextureUnit

Defeats the need for:

glActiveTexture(GL_TEXTURE0 + 3);
glBindTexture(GL_TEXTURE_2D, name);

And replaces it with a simple:

glBindTextureUnit(3, name);

Generating Mip Maps

glGenerateTextureMipmap takes in the texture name instead of the texture target.

void glGenerateTextureMipmap(GLuint texture);
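glGenerateTextureMipmap can only fill levels that the storage call allocated, and the snippets above allocate a single level. Assuming you want a full chain, the level count to pass to glTextureStorage2D could be computed as in this sketch; mip_level_count is a hypothetical helper, not a GL function.

```cpp
// Number of levels in a full mipmap chain for a width x height texture:
// floor(log2(max(width, height))) + 1, computed with integer shifts.
constexpr int mip_level_count(int width, int height)
{
	int largest = width > height ? width : height;
	int levels = 1; // level 0 always exists
	while (largest > 1)
	{
		largest >>= 1; // each level halves the larger dimension
		++levels;
	}
	return levels;
}
```

With this, a 512x512 texture would be allocated with glTextureStorage2D(name, mip_level_count(512, 512), GL_RGBA8, 512, 512), uploaded to level 0, and then passed to glGenerateTextureMipmap.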
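Uploading Cube Maps

I should briefly point out that in order to upload cube map textures you need to use glTextureSubImage3D, with the face index passed as the z offset.

// All six faces share the same dimensions, so the first bitmap sizes the storage.
glTextureStorage2D(name, 1, GL_RGBA8, bitmaps[0].width, bitmaps[0].height);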

for (size_t face = 0; face < 6; ++face)
{
	auto const& bitmap = bitmaps[face];
	glTextureSubImage3D(name, 0, 0, 0, face, bitmap.width, bitmap.height, 1, bitmap.format, GL_UNSIGNED_BYTE, bitmap.pixels);
}

Where is glTextureImage?

If you look at the OpenGL function listing you will see a lack of glTextureImage, and here's why:

Having to build up a valid texture object piecewise as you would with glTexImage left plenty of room for mistakes and required the driver to do validation as late as possible, and so when it came time to specify a new set of texture functions they took the opportunity to address this rough spot. The result was glTexStorage.

Storage provides a way to create complete textures with checks done on-call, which means less room for error. It solves most if not all problems brought on by mutable textures.

tl;dr "Immutable textures are a more robust approach to handle textures"

  • However, be mindful: allocating immutable textures requires physical video memory to be available upfront rather than letting the driver decide when and where the data goes, which makes it very possible to unintentionally exceed your card's capacity.
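To get a feel for how much memory an immutable allocation asks for, here is a back-of-the-envelope sketch. estimate_texture_bytes is a hypothetical helper; it assumes an uncompressed format with a fixed number of bytes per texel (4 for GL_RGBA8) and ignores any padding or alignment the driver may add, so treat the result as a lower bound.

```cpp
#include <cstdint>

// Sum the size of every mip level of a 2D texture, halving each dimension
// per level and never going below 1 texel.
constexpr std::uint64_t estimate_texture_bytes(std::uint32_t width, std::uint32_t height,
                                               std::uint32_t levels, std::uint32_t bytes_per_texel)
{
	std::uint64_t total = 0;
	for (std::uint32_t level = 0; level < levels; ++level)
	{
		total += static_cast<std::uint64_t>(width) * height * bytes_per_texel;
		width = width > 1 ? width / 2 : 1;
		height = height > 1 ? height / 2 : 1;
	}
	return total;
}
```

A single-level 512x512 GL_RGBA8 texture comes out at 1 MiB; multiply by the layer count for array textures.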

Sources: ARB_texture_storage, "What does glTexStorage do?", "What's the DSA version of glTexImage2D?"

glFramebuffer


glCreateFramebuffers

glCreateFramebuffers is used exactly like glGenFramebuffers but initializes the object for you.

Everything else is pretty much the same but takes in the framebuffer handle instead of the target.

glCreateFramebuffers(1, &fbo);

glNamedFramebufferTexture(fbo, GL_COLOR_ATTACHMENT0, tex, 0);
glNamedFramebufferTexture(fbo, GL_DEPTH_ATTACHMENT, depthTex, 0);

if(glCheckNamedFramebufferStatus(fbo, GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
	std::cerr << "framebuffer error\n";

glBlitNamedFramebuffer

The difference here is that we no longer need to bind the two framebuffers and specify which is which through the GL_READ_FRAMEBUFFER and GL_DRAW_FRAMEBUFFER enums.

glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo_src);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fbo_dst);

glBlitFramebuffer(src_x, src_y, src_w, src_h, dst_x, dst_y, dst_w, dst_h, GL_COLOR_BUFFER_BIT, GL_LINEAR);

Becomes

glBlitNamedFramebuffer(fbo_src, fbo_dst, src_x, src_y, src_w, src_h, dst_x, dst_y, dst_w, dst_h, GL_COLOR_BUFFER_BIT, GL_LINEAR);
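One thing to watch for: both blit functions take corner coordinates (srcX0, srcY0, srcX1, srcY1, then the same for the destination), so the src_w/src_h style names above only equal a width and height when the rectangle starts at the origin. A small sketch of converting an x/y/width/height rectangle into the corner form follows; blit_rect and to_blit_rect are hypothetical names.

```cpp
// Corner form expected by glBlitFramebuffer / glBlitNamedFramebuffer.
struct blit_rect
{
	int x0, y0; // lower-left corner
	int x1, y1; // upper-right corner (exclusive)
};

// Convert an origin + size rectangle into corner form.
constexpr blit_rect to_blit_rect(int x, int y, int width, int height)
{
	return blit_rect{ x, y, x + width, y + height };
}
```

The four members of each rectangle are then passed in order as the source or destination arguments of the blit call.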
glClearNamedFramebuffer

There are two ways to go about clearing a framebuffer:

The more familiar way

glBindFramebuffer(GL_FRAMEBUFFER, fb);
glClearColor(r, g, b, a);
glClearDepth(d);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

and the more versatile per-attachment way

glBindFramebuffer(GL_FRAMEBUFFER, fb);
glClearBufferfv(GL_COLOR, col_buff_index, &rgba);
glClearBufferfv(GL_DEPTH, 0, &d);

col_buff_index is the attachment index, so it would be equivalent to GL_DRAW_BUFFER0 + col_buff_index, and the draw buffer index for depth is always 0.

As you can see, with glClearBuffer we can clear the texels of any attachment to some value. Both methods are similar enough that you could reimplement the functions of method 1 using those of method 2.

Despite the name it has nothing to do with buffer objects, and this gets cleared up with the DSA version: glClearNamedFramebuffer.

So the DSA version looks like this:

glClearNamedFramebufferfv(fb, GL_COLOR, col_buff_index, &rgba);
glClearNamedFramebufferfv(fb, GL_DEPTH, 0, &d);

fb can be 0 if you're clearing the default framebuffer.

glBuffer


None of the DSA glBuffer functions ask for the buffer target; it is only required to be specified whilst drawing.

glCreateBuffers

glCreateBuffers is used exactly like its traditional equivalent and automatically initializes the object.

glNamedBufferData

glNamedBufferData is just like glBufferData but instead of requiring the buffer target it takes in the buffer handle itself.

glVertexAttribFormat & glBindVertexBuffer

If you aren't familiar with the application of glVertexAttribPointer it is used like so:

struct vertex { vec3 pos, nrm; vec2 tex; };

glBindVertexArray(vao);
glBindBuffer(GL_ARRAY_BUFFER, vbo);

glEnableVertexAttribArray(0);
glEnableVertexAttribArray(1);
glEnableVertexAttribArray(2);

glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(vertex), (void*)(offsetof(vertex, pos)));
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, sizeof(vertex), (void*)(offsetof(vertex, nrm)));
glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, sizeof(vertex), (void*)(offsetof(vertex, tex)));

glVertexAttribFormat isn't much different; the main thing with it is that it's one half of a two-parter with glVertexAttribBinding.

In order to get the same effect as the previous snippet we first need to make a call to glBindVertexBuffer. Despite Bind being in the name it isn't the same kind of bind as, for instance, glBindTexture: it attaches the buffer to a vertex buffer binding point of the VAO.

Here's how they're both put into action:

struct vertex { vec3 pos, nrm; vec2 tex; };

glBindVertexArray(vao);
glBindVertexBuffer(0, vbo, 0, sizeof(vertex));

glEnableVertexAttribArray(0);
glEnableVertexAttribArray(1);
glEnableVertexAttribArray(2);

glVertexAttribFormat(0, 3, GL_FLOAT, GL_FALSE, offsetof(vertex, pos));
glVertexAttribFormat(1, 3, GL_FLOAT, GL_FALSE, offsetof(vertex, nrm));
glVertexAttribFormat(2, 2, GL_FLOAT, GL_FALSE, offsetof(vertex, tex));

glVertexAttribBinding(0, 0);
glVertexAttribBinding(1, 0);
glVertexAttribBinding(2, 0);

Although this is the newer way of going about it, it isn't fully DSA as we still need to make that VAO bind call. To go all the way we need to transform glEnableVertexAttribArray, glVertexAttribFormat, glVertexAttribBinding, and glBindVertexBuffer into glEnableVertexArrayAttrib, glVertexArrayAttribFormat, glVertexArrayAttribBinding, and glVertexArrayVertexBuffer.

glVertexArrayVertexBuffer(vao, 0, vbo, 0, sizeof(vertex));

glEnableVertexArrayAttrib(vao, 0);
glEnableVertexArrayAttrib(vao, 1);
glEnableVertexArrayAttrib(vao, 2);

glVertexArrayAttribFormat(vao, 0, 3, GL_FLOAT, GL_FALSE, offsetof(vertex, pos));
glVertexArrayAttribFormat(vao, 1, 3, GL_FLOAT, GL_FALSE, offsetof(vertex, nrm));
glVertexArrayAttribFormat(vao, 2, 2, GL_FLOAT, GL_FALSE, offsetof(vertex, tex));

glVertexArrayAttribBinding(vao, 0, 0);
glVertexArrayAttribBinding(vao, 1, 0);
glVertexArrayAttribBinding(vao, 2, 0);

The version that takes in the VAO for binding the VBO, glVertexArrayVertexBuffer, has an equivalent for the IBO: glVertexArrayElementBuffer.

All together this is how uploading an indexed model with only DSA should look:

glCreateBuffers(1, &vbo);	
glNamedBufferStorage(vbo, sizeof(vertex)*vertex_count, vertices, GL_DYNAMIC_STORAGE_BIT);

glCreateBuffers(1, &ibo);
glNamedBufferStorage(ibo, sizeof(uint32_t)*index_count, indices, GL_DYNAMIC_STORAGE_BIT);

glCreateVertexArrays(1, &vao);

glVertexArrayVertexBuffer(vao, 0, vbo, 0, sizeof(vertex));
glVertexArrayElementBuffer(vao, ibo);

glEnableVertexArrayAttrib(vao, 0);
glEnableVertexArrayAttrib(vao, 1);
glEnableVertexArrayAttrib(vao, 2);

glVertexArrayAttribFormat(vao, 0, 3, GL_FLOAT, GL_FALSE, offsetof(vertex, pos));
glVertexArrayAttribFormat(vao, 1, 3, GL_FLOAT, GL_FALSE, offsetof(vertex, nrm));
glVertexArrayAttribFormat(vao, 2, 2, GL_FLOAT, GL_FALSE, offsetof(vertex, tex));

glVertexArrayAttribBinding(vao, 0, 0);
glVertexArrayAttribBinding(vao, 1, 0);
glVertexArrayAttribBinding(vao, 2, 0);

Detailed Messages with Debug Output

KHR_debug has been in core since version 4.3 and it's a big step up from how we used to do error polling.

With Debug Output we can receive meaningful messages on the state of the GL through a callback function that we'll be providing.

All it takes to get running are two calls: glEnable & glDebugMessageCallback.

glEnable(GL_DEBUG_OUTPUT);
glDebugMessageCallback(message_callback, nullptr);

Your callback must have the signature void callback(GLenum src, GLenum type, GLuint id, GLenum severity, GLsizei length, GLchar const* msg, void const* user_param).

Here's how I have mine defined:

void message_callback(GLenum source, GLenum type, GLuint id, GLenum severity, GLsizei length, GLchar const* message, void const* user_param)
{
	auto const src_str = [source]() {
		switch (source)
		{
		case GL_DEBUG_SOURCE_API: return "API";
		case GL_DEBUG_SOURCE_WINDOW_SYSTEM: return "WINDOW SYSTEM";
		case GL_DEBUG_SOURCE_SHADER_COMPILER: return "SHADER COMPILER";
		case GL_DEBUG_SOURCE_THIRD_PARTY: return "THIRD PARTY";
		case GL_DEBUG_SOURCE_APPLICATION: return "APPLICATION";
		case GL_DEBUG_SOURCE_OTHER: return "OTHER";
		default: return "UNKNOWN"; // never fall off the end of the lambda
		}
	}();

	auto const type_str = [type]() {
		switch (type)
		{
		case GL_DEBUG_TYPE_ERROR: return "ERROR";
		case GL_DEBUG_TYPE_DEPRECATED_BEHAVIOR: return "DEPRECATED_BEHAVIOR";
		case GL_DEBUG_TYPE_UNDEFINED_BEHAVIOR: return "UNDEFINED_BEHAVIOR";
		case GL_DEBUG_TYPE_PORTABILITY: return "PORTABILITY";
		case GL_DEBUG_TYPE_PERFORMANCE: return "PERFORMANCE";
		case GL_DEBUG_TYPE_MARKER: return "MARKER";
		case GL_DEBUG_TYPE_OTHER: return "OTHER";
		default: return "UNKNOWN"; // e.g. push/pop group messages
		}
	}();

	auto const severity_str = [severity]() {
		switch (severity) {
		case GL_DEBUG_SEVERITY_NOTIFICATION: return "NOTIFICATION";
		case GL_DEBUG_SEVERITY_LOW: return "LOW";
		case GL_DEBUG_SEVERITY_MEDIUM: return "MEDIUM";
		case GL_DEBUG_SEVERITY_HIGH: return "HIGH";
		default: return "UNKNOWN";
		}
	}();
	std::cout << src_str << ", " << type_str << ", " << severity_str << ", " << id << ": " << message << '\n';
}

There will be times when you want to filter your messages, maybe you're interested in anything but notifications. OpenGL has a function for this: glDebugMessageControl.

Here's how we use it to disable notifications:

glDebugMessageControl(GL_DONT_CARE, GL_DONT_CARE, GL_DEBUG_SEVERITY_NOTIFICATION, 0, nullptr, GL_FALSE);

We can also have messages fire synchronously, meaning the callback is invoked on the same thread as the context and from within the OpenGL call that triggered it. This guarantees function call order, so if we add a breakpoint inside our callback we can walk up the call stack and locate the origin of the error.

All it takes is another call to glEnable with the value of GL_DEBUG_OUTPUT_SYNCHRONOUS, so you end up with this:

glEnable(GL_DEBUG_OUTPUT);
glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
glDebugMessageCallback(message_callback, nullptr);

Farewell, glGetError.

Storing Index and Vertex Data Under Single Buffer

Most material on OpenGL that touches on indexed drawing will separate vertex and index data between buffers; this is because the ARB_vertex_buffer_object spec strongly recommends doing so. The reasoning is that different GL implementations may have different memory type requirements, so having the index data in its own buffer allows the driver to decide the optimal storage strategy.

This was useful when there were several ways to attach GPUs to the main system. Technically there still are, but AGP was completely phased out by PCIe about a decade ago and regular PCI ports aren't really used for this anymore save a few cases.

The overhead that comes with managing an additional buffer for indexed geometry isn't as justifiable a trade-off as it used to be.

We store the vertices and indices at known byte offsets and pass the information to OpenGL. For the vertices we do this with glVertexArrayVertexBuffer's offset parameter, and indices through glDrawElements's indices parameter.

GLint alignment = GL_NONE;
glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &alignment);

GLuint vao 	= GL_NONE;
GLuint buffer 	= GL_NONE;

auto const ind_len = GLsizei(ind_count * sizeof(element_t));
auto const vrt_len = GLsizei(vrt_count * sizeof(vertex));

auto const ind_len_aligned = align(ind_len, alignment);
auto const vrt_len_aligned = align(vrt_len, alignment);

auto const ind_offset = vrt_len_aligned;
auto const vrt_offset = 0;

glCreateBuffers(1, &buffer);
glNamedBufferStorage(buffer, ind_len_aligned + vrt_len_aligned, nullptr, GL_DYNAMIC_STORAGE_BIT);

glNamedBufferSubData(buffer, ind_offset, ind_len, ind_data);
glNamedBufferSubData(buffer, vrt_offset, vrt_len, vrt_data);

glCreateVertexArrays(1, &vao);
glVertexArrayVertexBuffer(vao, 0, buffer, vrt_offset, sizeof(vertex));
glVertexArrayElementBuffer(vao, buffer);

//continue with setup

And then when it's time to render:

glDrawElements(GL_TRIANGLES, ind_count, indices_format, (void *) ind_offset);

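If you're curious why the pointer cast is necessary: with client-side arrays (no buffer bound to GL_ELEMENT_ARRAY_BUFFER) the indices parameter is a pointer to your index data in host memory, but when the VAO has an element buffer attached it is interpreted as a byte offset into that buffer's data store.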
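The setup snippet above calls an align helper that isn't shown. One possible sketch, assuming it is meant to round a byte length up to the next multiple of the alignment, is given here; the 64-bit integer type is my choice so that large buffers don't overflow.

```cpp
#include <cstdint>

// Round `value` up to the next multiple of `alignment`.
// Assumes value >= 0 and alignment > 0; works for any positive alignment,
// not just powers of two.
constexpr std::int64_t align(std::int64_t value, std::int64_t alignment)
{
	return ((value + alignment - 1) / alignment) * alignment;
}
```

With an alignment of 256, a 300-byte vertex block would be padded to 512 bytes, which then becomes the byte offset of the index data.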

Ideal Way Of Retrieving All Uniform Names

There is material out there that teaches beginners to retrieve uniform information by manually parsing the shader source strings. Please don't do this.

Here is how it should be done:

struct uniform_info
{ 
	GLint location;
	GLsizei count;
};

GLint uniform_count = 0;
glGetProgramiv(program_name, GL_ACTIVE_UNIFORMS, &uniform_count);

if (uniform_count != 0)
{
	GLint 	max_name_len = 0;
	GLsizei length = 0;
	GLsizei count = 0;
	GLenum 	type = GL_NONE;
	glGetProgramiv(program_name, GL_ACTIVE_UNIFORM_MAX_LENGTH, &max_name_len);
	
	auto uniform_name = std::make_unique<char[]>(max_name_len);

	std::unordered_map<std::string, uniform_info> uniforms;

	for (GLint i = 0; i < uniform_count; ++i)
	{
		glGetActiveUniform(program_name, i, max_name_len, &length, &count, &type, uniform_name.get());

		uniform_info info = {};
		info.location = glGetUniformLocation(program_name, uniform_name.get());
		info.count = count;

		uniforms.emplace(std::make_pair(std::string(uniform_name.get(), length), info));
	}
}

Note that the GLsizei size parameter (stored as count above) refers to the number of locations the uniform takes up, with mat3, vec4, float, etc. being 1 and arrays having it be the number of elements. The locations are arranged in a way that allows you to do array_location + element_number to find the location of an element, so if you wanted to write to element 5 it would be done like this:

glProgramUniformXX(program_name, uniforms["my_array[0]"].location + 5, value);

or if you want to modify the whole array:

glProgramUniformXXv(program_name, uniforms["my_array[0]"].location, uniforms["my_array[0]"].count, my_array);
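The array_location + element_number arithmetic is easy to get wrong at the edges, so here is a small bounds-checked sketch of it. The struct mirrors the guide's uniform_info with plain ints standing in for GLint/GLsizei, element_location is a hypothetical helper, and it is only meant for arrays of basic types.

```cpp
struct uniform_info
{
	int location; // GLint in the guide, -1 when the uniform is inactive
	int count;    // GLsizei in the guide, number of array elements
};

// Location of element `index` inside a uniform array, or -1 when the
// uniform is inactive or the index is out of range.
constexpr int element_location(uniform_info info, int index)
{
	return (info.location >= 0 && index >= 0 && index < info.count)
		? info.location + index
		: -1;
}
```

Returning -1 is convenient because the glUniform and glProgramUniform functions silently ignore a location of -1.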

Ideally UBOs would be reserved for collections of data larger than 16K, as for smaller amounts they may be slower than packing the data into vec4s and using glProgramUniform4f.

With this you can store the uniform datatype and check it within your uniform update functions.

Texture Atlases vs Arrays

Array textures are a great way of managing collections of textures of the same size and format. They allow for using a set of textures without having to bind between them.

They turn out to be good as an alternative to atlases as long as some criteria are met, that being all the sub-textures, or swatches, fit under the same dimensions and levels.

The advantage of using this over an atlas is that each layer is treated as a separate texture in terms of wrapping and mipmapping.

Array textures come with three targets: GL_TEXTURE_1D_ARRAY, GL_TEXTURE_2D_ARRAY, and GL_TEXTURE_CUBE_MAP_ARRAY.

2D array textures and 3D textures are similar but semantically different; the differences lie in where the mipmap levels are and how layers are filtered.

There is no built-in filtering between the layers of a 2D array, whereas the Z axis of a 3D texture does have filtering available. The same goes for 1D arrays versus 2D textures.

2d array

|layer 0 	|
|	level 1	|
|	level 2	|
|layer 1 	|
|	level 1	|
|	level 2	|
...

3d texture

|z off 0 	|
|z off 1 	|
|z off 2 	|
...
|	level 1 |
|	level 2 |
...

To allocate a 2D texture array we do this:

GLuint texarray = 0;
GLsizei width = 512, height = 512, layers = 3;
glCreateTextures(GL_TEXTURE_2D_ARRAY, 1, &texarray);
glTextureStorage3D(texarray, 1, GL_RGBA8, width, height, layers);

glTextureStorage3D doubles as the allocator for 2D array textures, which I imagine is confusing at first, but there's a pattern: the last dimension parameter acts as the layer count, so if you were to allocate a 1D texture array you would use glTextureStorage2D with height as the layer capacity.

Anyway, uploading to individual layers is very straightforward:

glTextureSubImage3D(texarray, mipmap_level, offset.x, offset.y, layer, width, height, 1, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

It's super duper simple.

The most notable difference between arrays and atlases in terms of implementation lies in the shader.

To sample a texture array in a shader you need a specialized sampler type called samplerXXArray. We will also need a uniform to store the layer id.

#version 450 core

layout (location = 0) out vec4 color;
layout (location = 0) in vec2 tex0;

uniform sampler2DArray texarray;
uniform uint diffuse_layer;

float layer2coord(uint capacity, uint layer)
{
	return max(0.0, min(float(capacity - 1), floor(float(layer) + 0.5)));
}

void main()
{
	color = texture(texarray, vec3(tex0, layer2coord(3, diffuse_layer)));
}

Ideally you should calculate the layer coordinate outside of the shader.
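A CPU-side version of layer2coord might look like the sketch below. Because the layer is already an integer, the floor(layer + 0.5) rounding is a no-op and only the clamp to the last layer remains. layer_to_coord is a hypothetical name, and returning 0 for an empty array is my own assumption.

```cpp
// Clamp an integer layer index to the valid range [0, capacity - 1] and
// return it as the float coordinate the sampler expects.
constexpr float layer_to_coord(unsigned capacity, unsigned layer)
{
	return capacity == 0
		? 0.0f // assumption: treat an empty array as layer 0
		: static_cast<float>(layer < capacity ? layer : capacity - 1);
}
```

The result can be uploaded as a plain float uniform so the fragment shader no longer needs the helper function.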

You can take this way further and set up a little UBO/SSBO system of arrays containing layer id and texture array id pairs and update which layer id is used with regular uniforms.

Also, I advise against using UBOs and SSBOs for per-object/per-draw data without a plan; otherwise you will end up with everything not working as you'd like, because the command queue has no involvement during the reads and writes.

As a bonus let me tell you an easy way to populate a texture array with parts of an atlas, in our case a basic rpg tileset.

Modern OpenGL comes with two generic memory copy functions: glCopyImageSubData and glCopyBufferSubData. Here we'll be dealing with glCopyImageSubData, this function allows us to copy sections of a source image to a region of a destination image. We're going to take advantage of its offset and size parameters so that we can copy tiles from every location and paste them in the appropriate layers within our texture array.

Image files are loaded with nothings' stb_image header library.

Here it is:

GLsizei image_w, image_h, c, tile_w = 16, tile_h = 16;
stbi_uc* pixels = stbi_load(".\\textures\\tiles_packed.png", &image_w, &image_h, &c, STBI_rgb_alpha);
GLuint tileset;
GLsizei
	tiles_x = image_w / tile_w,
	tiles_y = image_h / tile_h,
	tile_count = tiles_x * tiles_y;

glCreateTextures(GL_TEXTURE_2D_ARRAY, 1, &tileset);
glTextureStorage3D(tileset, 1, GL_RGBA8, tile_w, tile_h, tile_count);

{
	GLuint temp_tex = 0;
	glCreateTextures(GL_TEXTURE_2D, 1, &temp_tex);
	glTextureStorage2D(temp_tex, 1, GL_RGBA8, image_w, image_h);
	glTextureSubImage2D(temp_tex, 0, 0, 0, image_w, image_h, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

	for (GLsizei i = 0; i < tile_count; ++i)
	{
		GLint x = (i % tiles_x) * tile_w, y = (i / tiles_x) * tile_h;
		glCopyImageSubData(temp_tex, GL_TEXTURE_2D, 0, x, y, 0, tileset, GL_TEXTURE_2D_ARRAY, 0, 0, 0, i, tile_w, tile_h, 1);
	}
	glDeleteTextures(1, &temp_tex);
}

stbi_image_free(pixels);
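The loop above assumes the atlas dimensions are exact multiples of the tile size; otherwise the integer division drops the partial tiles along the right and top edges. The sketch below pulls the index-to-origin math into a helper and adds that check. tile_origin, tile_to_origin and divides_evenly are hypothetical names.

```cpp
// Texel origin of a tile inside the atlas.
struct tile_origin
{
	int x, y;
};

// Same arithmetic as the copy loop: tiles are numbered row by row.
constexpr tile_origin tile_to_origin(int index, int tiles_x, int tile_w, int tile_h)
{
	return tile_origin{ (index % tiles_x) * tile_w, (index / tiles_x) * tile_h };
}

// True when the atlas splits into whole tiles with nothing left over.
constexpr bool divides_evenly(int image_w, int image_h, int tile_w, int tile_h)
{
	return tile_w > 0 && tile_h > 0 && image_w % tile_w == 0 && image_h % tile_h == 0;
}
```

Checking divides_evenly right after stbi_load is a cheap way to catch a tileset that was exported with the wrong tile size.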

Texture Views & Aliases

Texture views allow us to share a section of a texture's storage with an object of a different texture target and/or format. Share as in there's no copying of the texture data: any changes you make to a view's data are visible to all views that share the storage, along with the original texture object.

Views are mostly indistinguishable from regular texture objects so you can use them as if they are.

The original storage is only freed once all references to it are deleted; if you are familiar with C++'s std::shared_ptr it's very similar.

We can only make views if we use a target and format which are compatible with our original texture; you can read the format tables on the wiki. The original texture must also have immutable storage, meaning it was allocated with one of the glTextureStorage functions.

Making the view itself is simple, it's only two function calls: glGenTextures & glTextureView.

Despite this being about modern OpenGL, glGenTextures has an important role here: we need only an available texture name and nothing more. This needs to be a completely empty, uninitialized object, and because this function only handles the generation of a valid handle it's perfect for this.

glTextureView is how we'll create the view.

If we were to have a typical 2d texture array and needed a view of layer 5 in isolation this is how it would look:

glGenTextures(1, &view_name);
glTextureView(view_name, GL_TEXTURE_2D, src_name, internal_format, min_level, level_count, 5, 1);

With this you can bind layer 5 alone as a GL_TEXTURE_2D texture.

This is the exact same when dealing with cube maps: the layer parameters correspond to the cube faces, and for cube map arrays the layer is cubemap_layer * 6 + face.
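The cubemap_layer * 6 + face formula can be written down as a small helper. This is only a sketch: cube_face and cube_array_layer are hypothetical names, and the face order assumed here follows the GL_TEXTURE_CUBE_MAP_POSITIVE_X + i enum order (+X, -X, +Y, -Y, +Z, -Z).

```cpp
// Faces in the same order as GL_TEXTURE_CUBE_MAP_POSITIVE_X + i.
enum class cube_face : int
{
	pos_x = 0, neg_x, pos_y, neg_y, pos_z, neg_z
};

// Layer-face index of one face of one cube map inside a cube map array.
constexpr int cube_array_layer(int cubemap_layer, cube_face face)
{
	return cubemap_layer * 6 + static_cast<int>(face);
}
```

The result is what would go into the minlayer parameter of glTextureView (with a layer count of 1) to view a single face as a GL_TEXTURE_2D.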

Texture views can be of other views as well, so there could be a texture array, and a view of a section of that array, and another view of a specific layer within that array view. The parameters are relative to the properties of the source.

The fact that we can specify which mipmaps we want in the view means that we can have views which are just of those specific mipmap levels, so for example you could make texture views of the Nth mipmap level of a bunch of textures and use only those for expensive texture-dependent lighting calculations.

ARB_texture_view

Setting up Mix & Match Shaders with Program Pipelines

  • Nvidia drivers have spotty performance as they lean towards monolithic shader programs, so it may be better suited for non-performance-critical applications.

Program Pipeline objects allow us to change shader stages on the fly without having to relink them.

To create and set up a simple program pipeline without any debugging looks like this:

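// Keep the strings alive: calling c_str() on a temporary would leave dangling pointers.
// load_file is assumed to return a std::string.
std::string const
	vs_string = load_file(".\\main_shader.vs"),
	fs_string = load_file(".\\main_shader.fs");

const char*
	vs_source = vs_string.c_str(),
	fs_source = fs_string.c_str();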
	
GLuint 
	vs = glCreateShaderProgramv(GL_VERTEX_SHADER, 1, &vs_source),
	fs = glCreateShaderProgramv(GL_FRAGMENT_SHADER, 1, &fs_source),
	pr = 0;
		
glCreateProgramPipelines(1, &pr);
glUseProgramStages(pr, GL_VERTEX_SHADER_BIT, vs);
glUseProgramStages(pr, GL_FRAGMENT_SHADER_BIT, fs);

glBindProgramPipeline(pr);

glCreateProgramPipelines generates the handle and initializes the object, glCreateShaderProgramv generates, initializes, compiles, and links a shader program using the sources given, and glUseProgramStages attaches the program's stage(s) to the pipeline object. glBindProgramPipeline as you can tell binds the pipeline to the context.

Because our shaders are now looser and flexible we need to get stricter with our input and output variables. Either we declare the input and output in the same order with the same names or we make their locations explicitly match through the location qualifier.

I greatly suggest the latter option for non-blocks, this will allow us to set up a well-defined interface while also being flexible with the naming and ordering. Interface blocks also need to match members.

As collateral for needing a stricter interface we also need to declare the built-in input and output blocks we wish to use for every stage.

The built-in block interfaces are defined as (from the wiki):

Vertex:

out gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
};

Tessellation Control:

out gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
} gl_out[];

Tessellation Evaluation:

out gl_PerVertex {
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
};

Geometry:

out gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
};

An extremely basic vertex shader enabled for use in a pipeline object looks like this:

#version 450

out gl_PerVertex { vec4 gl_Position; };

layout (location = 0) in vec3 pos;
layout (location = 1) in vec3 col;

layout (location = 0) out v_out
{
    vec3 col;
} v_out;

void main()
{
    v_out.col = col;
    gl_Position = vec4(pos, 1.0);
}

Faster Reads and Writes with Persistent Mapping

With persistent mapping we can get a pointer to a region of memory that OpenGL will be using as a sort of intermediate buffer zone. This allows us to make reads and writes to this area and lets the driver decide when to use the contents.

First we need the right flags for both buffer storage creation and the mapping itself:

constexpr GLbitfield 
	mapping_flags = GL_MAP_WRITE_BIT | GL_MAP_READ_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT,
	storage_flags = GL_DYNAMIC_STORAGE_BIT | mapping_flags;

GL_MAP_COHERENT_BIT: This flag ensures writes will be seen by the server when done from the client and vice versa.

GL_MAP_PERSISTENT_BIT: This tells our driver you wish to keep the mapping through subsequent OpenGL operations.

GL_MAP_READ_BIT: Tells OpenGL we wish to read from the buffer.

GL_MAP_WRITE_BIT: Lets OpenGL know we're gonna write to it; if you don't specify this anything could happen.

If we don't use these flags for the storage creation, GL will reject our mapping request with scorn. What's worse is that we absolutely won't know unless we're doing some form of error checking.

Setting up our immutable storage is straightforward:

glCreateBuffers(1, &name);
glNamedBufferStorage(name, size, nullptr, storage_flags);

Whatever we put in the const void* data parameter is arbitrary and marking it as nullptr specifies we wish not to copy any data into it.

Here is how we get that pointer we're after:

void* ptr = glMapNamedBufferRange(name, offset, size, mapping_flags);

I recommend mapping it as infrequently as possible because the process of mapping the buffer isn't particularly fast. In most cases you only need to do it just the once.

Make sure to unmap the buffer before deleting it:

glUnmapNamedBuffer(name);
glDeleteBuffers(1, &name);

If C++20 is available you can drop it into a std::span.
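If C++20 isn't available, a tiny typed view can stand in for std::span. The sketch below is a hypothetical mapped_view class, not a GL or standard library type; it assumes the pointer came from glMapNamedBufferRange, that the byte size matches the mapped length, and that the view is not used after the buffer is unmapped.

```cpp
#include <cassert>
#include <cstddef>

// Minimal non-owning typed view over a mapped buffer range.
template <typename T>
class mapped_view
{
public:
	mapped_view(void* ptr, std::size_t size_bytes)
		: data_(static_cast<T*>(ptr)), count_(size_bytes / sizeof(T))
	{
		// glMapNamedBufferRange returns a null pointer on failure.
		assert(ptr != nullptr);
	}

	std::size_t size() const { return count_; }

	T& operator[](std::size_t i) const
	{
		assert(i < count_); // catch writes past the mapped range in debug builds
		return data_[i];
	}

	T* begin() const { return data_; }
	T* end() const { return data_ + count_; }

private:
	T* data_;           // start of the mapped range
	std::size_t count_; // number of whole T elements that fit
};
```

Usage would be along the lines of mapped_view<vertex> vertices(ptr, size); followed by plain indexing or a range-based for loop over the mapped memory.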

More information

License: Creative Commons Zero v1.0 Universal