GodOfMemes / HelloVulkan

πŸŒ‹πŸ––πŸ½ PBR, IBL, Clustered Forward Shading, Bindless Textures, Shadow Mapping, and more!

🌋Hello Engine🖖🏽

A 3D rendering engine built from scratch using Vulkan API and C++.


Features

  • Clustered forward shading for efficient light culling.
  • Physically-Based Rendering (PBR) with the Cook-Torrance microfacet model.
  • Image-Based Lighting (IBL) pipelines that generate:
    • A cubemap from an equirectangular HDR image.
    • Specular and diffuse cubemaps.
    • BRDF lookup table.
  • Bindless:
    • A single indirect draw call per render pass.
    • Descriptor indexing that allows all textures in the scene to be bound just once at the start of the frame.
    • Buffer device address for direct shader access to buffers without the need to create descriptors.
  • Compute-based Frustum Culling.
  • Compute-based Skinning for skeletal animation.
  • Shadow maps with Poisson Disk or PCF.
  • glTF mesh/texture support.
  • Multisample anti-aliasing (MSAA).
  • Simple raytracing pipeline with basic intersection testing.
  • Tonemap postprocessing.
  • Automatic runtime compilation from GLSL to SPIR-V using glslang.
  • Lightweight abstraction layer on top of Vulkan for faster development.
  • Additional features: skybox, infinite grid, line rendering, and ImGui / ImGuizmo.

Engine Overview

The engine leverages several modern GPU features to optimize rendering performance. First, bindless textures are achieved through descriptor indexing. All scene textures are stored in an unbounded array, so texture descriptors are bound only once at the start of a frame.

Next, the engine takes advantage of the indirect draw API, so the CPU issues only a single indirect draw command per render pass. Draw calls are sorted by material type, which makes it possible to have a separate render pass for each material. In each render pass, the GPU processes only a batch of draw calls for objects sharing the same material. This improves efficiency because shader branching on material type can be avoided.

Finally, the engine pushes the concept of "bindless" even further by utilizing buffer device addresses. Instead of creating descriptors, the engine passes device addresses that act as pointers, giving shaders direct access to buffers.
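The material batching behind the indirect draw step described above can be sketched on the CPU as follows. This is a minimal illustration, not the engine's actual code: `MeshDraw`, `MaterialBatch`, and `BuildBatches` are hypothetical names, and the command struct only mirrors the field layout of `VkDrawIndexedIndirectCommand` so the sketch compiles without the Vulkan headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the field layout of VkDrawIndexedIndirectCommand (assumption:
// defined locally so the sketch does not depend on the Vulkan headers).
struct DrawIndexedIndirectCommand {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
};
static_assert(sizeof(DrawIndexedIndirectCommand) == 20, "unexpected padding");

// Hypothetical per-mesh record: a material type plus its draw command.
struct MeshDraw {
    uint32_t materialType;
    DrawIndexedIndirectCommand command;
};

// A contiguous range of commands in the indirect buffer that share a material.
struct MaterialBatch {
    uint32_t materialType;
    uint32_t firstCommand;
    uint32_t commandCount;
};

// Sorts draws by material type, writes the commands in that order, and
// returns one batch (command range) per material type.
std::vector<MaterialBatch> BuildBatches(
    std::vector<MeshDraw>& draws,
    std::vector<DrawIndexedIndirectCommand>& outCommands)
{
    std::stable_sort(draws.begin(), draws.end(),
        [](const MeshDraw& a, const MeshDraw& b) {
            return a.materialType < b.materialType;
        });

    std::vector<MaterialBatch> batches;
    outCommands.clear();
    for (const MeshDraw& d : draws) {
        if (batches.empty() || batches.back().materialType != d.materialType) {
            // A new material type starts a new batch at the current offset.
            batches.push_back({d.materialType,
                               static_cast<uint32_t>(outCommands.size()), 0u});
        }
        batches.back().commandCount++;
        outCommands.push_back(d.command);
    }
    assert(outCommands.size() == draws.size());
    return batches;
}
```

With this layout, each batch could map to one `vkCmdDrawIndexedIndirect` call whose byte offset is `firstCommand * sizeof(DrawIndexedIndirectCommand)` and whose draw count is `commandCount`.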

The images below showcase the implementations of PBR, IBL, and PCF shadow mapping.

bindless_shadow_mapping_1 bindless_shadow_mapping_2


The video below is another example of realistic rendering of the damaged helmet demonstrating PBR and IBL techniques.

vulkan_helmet.mp4

Clustered Forward Shading

The technique consists of two steps, both executed in compute shaders. The first step subdivides the view frustum into AABB clusters. The second step is light culling, which determines which lights intersect each cluster. This removes lights that are too far away to affect a fragment, so the final fragment shader iterates over fewer lights.
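The two steps can be sketched on the CPU as shown below. This is an illustrative approximation rather than the engine's compute shaders: `ClusterGrid`, `ClusterBounds`, and `CullLights` are hypothetical names, the exponential depth slicing is an assumption of this sketch, and view space is assumed to look down the negative Z axis.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };
struct PointLight { Vec3 position; float radius; };  // position in view space

// Hypothetical grid description for a symmetric perspective frustum.
struct ClusterGrid {
    int nx, ny, nz;
    float tanHalfFovY, aspect, zNear, zFar;
};

// Step 1: view-space AABB enclosing cluster (cx, cy, cz).
// Depth slices are spaced exponentially (an assumption of this sketch).
AABB ClusterBounds(const ClusterGrid& g, int cx, int cy, int cz)
{
    assert(g.zNear > 0.0f && g.zFar > g.zNear);
    const float ratio = g.zFar / g.zNear;
    const float dNear = g.zNear * std::pow(ratio, float(cz) / float(g.nz));
    const float dFar  = g.zNear * std::pow(ratio, float(cz + 1) / float(g.nz));

    // Tile extents in normalized device coordinates [-1, 1].
    const float x0 = -1.0f + 2.0f * float(cx) / float(g.nx);
    const float x1 = -1.0f + 2.0f * float(cx + 1) / float(g.nx);
    const float y0 = -1.0f + 2.0f * float(cy) / float(g.ny);
    const float y1 = -1.0f + 2.0f * float(cy + 1) / float(g.ny);

    // Frustum half-extent per unit of depth.
    const float hy = g.tanHalfFovY;
    const float hx = hy * g.aspect;

    AABB box;
    box.min = { std::min(x0 * hx * dNear, x0 * hx * dFar),
                std::min(y0 * hy * dNear, y0 * hy * dFar),
                -dFar };
    box.max = { std::max(x1 * hx * dNear, x1 * hx * dFar),
                std::max(y1 * hy * dNear, y1 * hy * dFar),
                -dNear };
    return box;
}

// Sphere-vs-AABB test: compare the squared distance from the light center
// to the closest point of the box against the squared radius.
bool LightIntersects(const PointLight& l, const AABB& b)
{
    const float qx = std::max(b.min.x, std::min(l.position.x, b.max.x));
    const float qy = std::max(b.min.y, std::min(l.position.y, b.max.y));
    const float qz = std::max(b.min.z, std::min(l.position.z, b.max.z));
    const float dx = l.position.x - qx;
    const float dy = l.position.y - qy;
    const float dz = l.position.z - qz;
    return dx * dx + dy * dy + dz * dz <= l.radius * l.radius;
}

// Step 2: per-cluster light index lists.
// Cluster index = cx + nx * (cy + ny * cz).
std::vector<std::vector<uint32_t>> CullLights(
    const ClusterGrid& g, const std::vector<PointLight>& lights)
{
    std::vector<std::vector<uint32_t>> lists;
    for (int cz = 0; cz < g.nz; ++cz)
        for (int cy = 0; cy < g.ny; ++cy)
            for (int cx = 0; cx < g.nx; ++cx) {
                const AABB box = ClusterBounds(g, cx, cy, cz);
                std::vector<uint32_t> ids;
                for (uint32_t i = 0; i < lights.size(); ++i)
                    if (LightIntersects(lights[i], box)) ids.push_back(i);
                lists.push_back(ids);
            }
    return lists;
}
```

In a GPU version, each cluster would typically be handled by its own compute invocation, and the fragment shader would look up its cluster from the pixel position and view depth.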

Preliminary testing on a 3070M graphics card shows the technique can render a PBR Sponza scene at 2560x1600 resolution with over 1000 dynamic lights at 60-100 FPS. If too many lights end up inside the view frustum, especially when zooming out, the frame rate may drop, but it remains much faster than naive forward shading.

vulkan_cluster_forward.mp4

Compute-Based Frustum Culling

Since the engine uses indirect draw, frustum culling can be done entirely in a compute shader by modifying draw calls within an indirect buffer. If an object's AABB falls outside the camera frustum, the compute shader deactivates the draw call for that object. Consequently, the CPU is unaware of the number of objects actually drawn. Measured with the Tracy profiler, an intersection test with 10,000 AABBs takes less than 25 microseconds (0.025 milliseconds).
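A CPU sketch of the culling logic is shown below, where each loop iteration stands in for one compute shader invocation. The names `IsInsideFrustum` and `CullDraws` are hypothetical, and deactivating a draw by setting `instanceCount` to zero is an assumption of this sketch; the engine may use a different mechanism.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Plane with an inward-facing normal: p is inside when dot(n, p) + d >= 0.
struct Plane { Vec3 n; float d; };

// Mirrors the field layout of VkDrawIndexedIndirectCommand (defined locally
// so the sketch does not depend on the Vulkan headers).
struct DrawIndexedIndirectCommand {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
};

// Conservative AABB-vs-frustum test: the box is rejected only if it lies
// entirely on the outer side of at least one plane.
bool IsInsideFrustum(const AABB& box, const std::array<Plane, 6>& frustum)
{
    for (const Plane& p : frustum) {
        // Corner of the box that lies farthest along the plane normal.
        const Vec3 v = { p.n.x >= 0.0f ? box.max.x : box.min.x,
                         p.n.y >= 0.0f ? box.max.y : box.min.y,
                         p.n.z >= 0.0f ? box.max.z : box.min.z };
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f)
            return false;
    }
    return true;
}

// One iteration per object, analogous to one compute invocation per object.
void CullDraws(const std::array<Plane, 6>& frustum,
               const std::vector<AABB>& boxes,
               std::vector<DrawIndexedIndirectCommand>& commands)
{
    assert(boxes.size() == commands.size());
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        // Assumption: a draw with zero instances is skipped by the GPU.
        commands[i].instanceCount = IsInsideFrustum(boxes[i], frustum) ? 1u : 0u;
    }
}
```

Because the commands are rewritten in place, the number of draws submitted by the CPU stays constant while the GPU decides which ones produce geometry.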

The left image below shows a rendering of all objects inside the frustum. The right image shows visualizations of AABBs as translucent boxes and the frustum drawn as orange lines.

frustum_culling

Compute-Based Skinning

The compute-based skinning approach is much simpler than traditional vertex shader skinning because the skinning computation is done only once, in a compute shader at the beginning of the frame. The resulting skinned vertices are stored in a buffer and reused by subsequent render passes such as shadow mapping and lighting. Consequently, there is no need to modify existing pipelines and no extra shader permutations.
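The skinning pass can be illustrated with a CPU sketch of linear blend skinning, where each loop iteration corresponds to one compute invocation per vertex. The types and function names are hypothetical, and the use of four bone influences per vertex with row-major matrices is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };  // row-major: m[row][column]

// Helper that builds a translation matrix (identity when all zero).
Mat4 Translation(float tx, float ty, float tz)
{
    Mat4 r = {{{1.0f, 0.0f, 0.0f, tx},
               {0.0f, 1.0f, 0.0f, ty},
               {0.0f, 0.0f, 1.0f, tz},
               {0.0f, 0.0f, 0.0f, 1.0f}}};
    return r;
}

// Transforms a point (w = 1) by an affine matrix.
Vec3 TransformPoint(const Mat4& M, const Vec3& p)
{
    return { M.m[0][0] * p.x + M.m[0][1] * p.y + M.m[0][2] * p.z + M.m[0][3],
             M.m[1][0] * p.x + M.m[1][1] * p.y + M.m[1][2] * p.z + M.m[1][3],
             M.m[2][0] * p.x + M.m[2][1] * p.y + M.m[2][2] * p.z + M.m[2][3] };
}

// Hypothetical input vertex: bind-pose position plus four bone influences.
struct SkinVertex {
    Vec3 position;
    uint32_t bones[4];
    float weights[4];
};

// Linear blend skinning: skinned = sum_i weight_i * (boneMatrix_i * position).
// The output buffer can then be shared by every later render pass.
std::vector<Vec3> SkinVertices(const std::vector<SkinVertex>& vertices,
                               const std::vector<Mat4>& boneMatrices)
{
    std::vector<Vec3> skinned;
    skinned.reserve(vertices.size());
    for (const SkinVertex& v : vertices) {
        Vec3 sum = {0.0f, 0.0f, 0.0f};
        for (int i = 0; i < 4; ++i) {
            if (v.weights[i] == 0.0f) continue;  // unused influence slot
            assert(v.bones[i] < boneMatrices.size());
            const Vec3 p = TransformPoint(boneMatrices[v.bones[i]], v.position);
            sum.x += v.weights[i] * p.x;
            sum.y += v.weights[i] * p.y;
            sum.z += v.weights[i] * p.z;
        }
        skinned.push_back(sum);
    }
    return skinned;
}
```

For example, a vertex at (1, 0, 0) weighted equally between an identity bone and a bone translated by (2, 0, 0) ends up halfway between the two transformed positions.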

compute_skinning.mp4


Cascade Shadow Maps

The left image below is a rendering that uses four cascade shadow maps, resulting in sharper shadows. The right image showcases the individual cascades with color coding. Poisson disk sampling helps reduce projective aliasing artifacts, but excessive blurring can make the seams between cascades more noticeable.

cascade_shadow_mapping
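The README does not state how the cascade boundaries are chosen. As a hedged illustration, the sketch below uses the commonly used "practical" split scheme, which blends logarithmic and uniform splits with a factor `lambda`, and then picks a cascade from a fragment's view depth. The function names and the `lambda` parameter are assumptions of this sketch, not the engine's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the far distance of each cascade.
// lambda = 0 gives uniform splits, lambda = 1 gives logarithmic splits.
std::vector<float> ComputeCascadeSplits(int cascadeCount, float zNear,
                                        float zFar, float lambda)
{
    assert(cascadeCount > 0 && zNear > 0.0f && zFar > zNear);
    std::vector<float> splits(static_cast<std::size_t>(cascadeCount));
    for (int i = 1; i <= cascadeCount; ++i) {
        const float p = float(i) / float(cascadeCount);
        const float logSplit = zNear * std::pow(zFar / zNear, p);
        const float uniSplit = zNear + (zFar - zNear) * p;
        splits[static_cast<std::size_t>(i - 1)] =
            lambda * logSplit + (1.0f - lambda) * uniSplit;
    }
    return splits;
}

// Picks the first cascade whose far distance covers the given view depth.
// Depths beyond the last split fall back to the last cascade.
int SelectCascade(const std::vector<float>& splits, float viewDepth)
{
    assert(!splits.empty());
    for (std::size_t i = 0; i < splits.size(); ++i) {
        if (viewDepth <= splits[i]) return static_cast<int>(i);
    }
    return static_cast<int>(splits.size()) - 1;
}
```

A larger `lambda` concentrates shadow map resolution near the camera, which is usually where sharper shadows matter most.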

Hardware-Accelerated Raytracing

The engine also features a raytracing pipeline. The process begins by building Bottom Level Acceleration Structures (BLAS) containing multiple geometries, followed by creating Top Level Acceleration Structures (TLAS). For each pixel on the screen, a ray is cast and intersected with the acceleration structures to determine the final color.

hardware_raytracing
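On the GPU, the acceleration structures and ray-triangle tests are handled by the hardware and driver. To illustrate what a per-pixel ray query resolves to, here is a brute-force CPU sketch using the Möller-Trumbore ray-triangle test over a flat triangle list. It replaces the BLAS/TLAS traversal with a linear loop, and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3 Sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 Cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Moller-Trumbore test. On a hit, writes the ray parameter t (distance along
// dir in units of its length) and returns true.
bool IntersectTriangle(const Vec3& origin, const Vec3& dir,
                       const Vec3& v0, const Vec3& v1, const Vec3& v2, float& t)
{
    const float kEpsilon = 1e-7f;
    const Vec3 e1 = Sub(v1, v0);
    const Vec3 e2 = Sub(v2, v0);
    const Vec3 p = Cross(dir, e2);
    const float det = Dot(e1, p);
    if (std::fabs(det) < kEpsilon) return false;   // ray parallel to triangle
    const float invDet = 1.0f / det;
    const Vec3 s = Sub(origin, v0);
    const float u = Dot(s, p) * invDet;            // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    const Vec3 q = Cross(s, e1);
    const float v = Dot(dir, q) * invDet;          // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    t = Dot(e2, q) * invDet;
    return t > kEpsilon;                           // reject hits behind the origin
}

struct Hit { bool found; float t; uint32_t triangle; };

// Linear stand-in for acceleration structure traversal: every three
// consecutive vertices form one triangle, and the closest hit wins.
Hit TraceNearest(const Vec3& origin, const Vec3& dir,
                 const std::vector<Vec3>& vertices)
{
    assert(vertices.size() % 3 == 0);
    Hit best = {false, 0.0f, 0u};
    for (std::size_t i = 0; i + 2 < vertices.size(); i += 3) {
        float t = 0.0f;
        if (IntersectTriangle(origin, dir, vertices[i], vertices[i + 1],
                              vertices[i + 2], t) &&
            (!best.found || t < best.t)) {
            best.found = true;
            best.t = t;
            best.triangle = static_cast<uint32_t>(i / 3);
        }
    }
    return best;
}
```

An acceleration structure exists precisely to avoid this linear loop: it lets a ray skip most triangles by testing bounding volumes first.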


Build


Credit

About

πŸŒ‹πŸ––πŸ½ PBR, IBL, Clustered Forward Shading, Bindless Textures, Shadow Mapping, and more!

License: MIT License


Languages

C++ 86.3%, GLSL 11.4%, C 1.3%, CMake 0.9%