Unlimited Light Sources with Clustered Forward Shading

CMAugust
Posts: 168
Joined: 10 Jan 2016, 00:13

Post by CMAugust » 04 Mar 2019, 11:24

Sometimes, the environments in vanilla Morrowind feature too many lights for OpenMW to handle gracefully. If you ever encounter lights that seem to blink on or off depending on where you're standing, this is probably the reason why. A limitation of gl_Lights is that only 8 lights can affect any given object. This may sound familiar to users of the Skyrim Creation Kit, where the same limitation exists. The typical solution for game developers has been to use Deferred Shading, which lets them have as many lights as they want. Unfortunately, Deferred Shading has a number of downsides, and this is especially true for OpenMW as Morrowind and its assets were developed with Forward Shading in mind.

Enter Clustered Forward Shading, an evolution of Tiled (Forward+) Shading that allows for an unlimited number of lights while still using the forward rendering path. Extended further, this clustering technique can enable hundreds of omnidirectional dynamic shadow casting lights in real time. The most prominent sightings of Clustered Shading in AAA games have been DOOM (2016), Just Cause 3 and Detroit: Become Human.

There has been the occasional mention of Clustered Shading here over the years (mostly by Chris, it must be said) but it didn't lead to any substantial discussion, and no topic specifically about it has been made. I'm opening this thread to provoke interest and discussion, and to provide as many useful references to Clustered Shading as I can find - videos, slides, and even source code available under permissive licenses.


YouTube: Real-time many-light management and shadows with clustered shading (SIGGRAPH 2015 Courses) (slides available here)
The big one! Several back-to-back courses with principal developers Ola Olsson, Markus Billeter and Emil Persson.

Research Paper: More Efficient Virtual Shadow Maps for Many Lights (Alternative link)

YouTube: Game Dev Brisbane 2016 - Managing many lights in real time with clustered shading by Ola Olsson (slides available via his website)
As above, but Ola Olsson's material only. Includes Shadow presentation.

SIGGRAPH 2014 presentation (slides): University of Zurich Archive - Efficient Real-Time Shading with Many Lights by Ola Olsson, Markus Billeter and Emil Persson


GitLab Repository: Clustered Forward Shading Demo by Ola Olsson

Graphic Analysis Blog: DOOM (2016) - Graphics Study by Adrian Courrèges

SIGGRAPH DOOM (2016) Presentation (slides): The Devil is in the Details: idTech 666

Avalanche Studios Presentation for Just Cause 3 (slides): Practical Clustered Shading by Emil Persson

New from 2018
GDC Vault (slides): Cluster Forward Rendering and Anti-Aliasing in 'Detroit: Become Human' by Ronan Marchalot

New from 2018
GitHub repository (MIT licence): Hybrid (Clustered Forward/Deferred) Rendering Engine by Angel Ortiz

YouTube: Volume Tiled Forward Shading by Jeremiah van Oosten
Demonstration of the various shading methods in action with profiler, including a custom method based on Tiled/Clustered Shading.
Last edited by CMAugust on 05 Mar 2019, 14:51, edited 1 time in total.

afritz1
Posts: 41
Joined: 05 Sep 2016, 01:18

Re: Unlimited Light Sources with Clustered Forward Shading

Post by afritz1 » 04 Mar 2019, 17:11

I'd be interested in looking into implementing some kind of clustered forward or forward+ rendering for my Arena engine, since there's a few places in Arena where there's more than 5 or 6 lights touching the same space and it'd probably be slow to handle that the naive way. It might help a lot with performance when the player throws a lot of spells at the same time, too.

Maybe Arena doesn't need this kind of lighting though, because with its shading model, the maximum magnitude of light at a point is clamped to 1.0, and once it reaches that value, it stops sampling lights for that pixel. So the worst case scenario would be if N lights were all touching a point such that the individual contributions were 1/N.
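The early-out described above can be sketched in a few lines. This is only an illustration of the idea, not the Arena engine's code: the function name `accumulate_light` and the list-of-floats input are assumptions made for the example.

```python
def accumulate_light(contributions, limit=1.0):
    """Sum per-light contributions for one pixel, stopping at the clamp.

    `contributions` is a hypothetical list of per-light magnitudes.
    Returns (clamped_total, lights_sampled).
    """
    total = 0.0
    sampled = 0
    for c in contributions:
        total += c
        sampled += 1
        if total >= limit:
            # The pixel is already fully lit, so remaining lights are skipped.
            return limit, sampled
    return total, sampled


# Worst case from the post: N lights contributing 1/N each (here N = 8),
# so every light has to be sampled before the clamp is reached.
worst = accumulate_light([0.125] * 8)   # (1.0, 8)

# Friendlier case: the first light saturates the pixel on its own.
best = accumulate_light([1.0, 0.5, 0.5])   # (1.0, 1)
```

The order in which lights are visited matters here: sampling the strongest lights first would reach the clamp sooner, though sorting has its own cost.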

I only have an elementary understanding of clustered forward rendering (split the frustum up into variable-sized chunks), and it'd need to be reasonably easy for me to learn over a few weeks. I would like my software renderer to run fast despite having several lights on-screen and I think this might be a good way to go, although I still have a lot to learn.
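For anyone else starting from the same place, here is a rough sketch of what "splitting the frustum into variable-sized chunks" can look like. It assumes a screen-space tile grid combined with logarithmic depth slicing, which is one scheme described in the clustered shading material; the function name `cluster_index`, the grid dimensions and the parameter names are all made up for the example.

```python
import math


def cluster_index(px, py, view_depth, *, width, height, near, far,
                  tiles_x=16, tiles_y=8, slices_z=24):
    """Map a pixel and its positive view-space depth to (x, y, z) cluster coords.

    Grid dimensions are arbitrary illustration values, not taken from any engine.
    Requires 0 < near < far.
    """
    # Screen-space tile, clamped so the last pixel lands in the last tile.
    tx = min(int(px * tiles_x / width), tiles_x - 1)
    ty = min(int(py * tiles_y / height), tiles_y - 1)

    # Logarithmic slicing: every slice spans the same depth *ratio*, so
    # slices are thin near the camera and thick far away.
    depth = min(max(view_depth, near), far)
    tz = int(math.log(depth / near) / math.log(far / near) * slices_z)
    tz = min(tz, slices_z - 1)
    return tx, ty, tz


# Centre of a 1920x1080 screen, 3 units away, with 10 depth slices
# spread between near = 1 and far = 1024.
centre = cluster_index(960, 540, 3.0, width=1920, height=1080,
                       near=1.0, far=1024.0, slices_z=10)
```

Each light would then be added to the list of every cluster its volume overlaps, and a pixel only loops over the list belonging to its own cluster.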

AnyOldName3
Posts: 1477
Joined: 26 Nov 2015, 03:25

Re: Unlimited Light Sources with Clustered Forward Shading

Post by AnyOldName3 » 04 Mar 2019, 18:20

There's a threshold number of lights below which it's still better to use classical forward rendering than something more complicated, and I'd be surprised if Arena ever got that high. I don't think lots of the schemes for optimising rendering on GPUs are even applicable to a software renderer - you don't have to worry about taking a different amount of time to render a pixel than its neighbours, for a start.
AnyOldName3, Master of Shadows

afritz1
Posts: 41
Joined: 05 Sep 2016, 01:18

Re: Unlimited Light Sources with Clustered Forward Shading

Post by afritz1 » 04 Mar 2019, 18:59

You're probably right, I guess I was just curious if it was even something worth looking into. It seems pretty complex after reading some of the Detroit: Become Human GDC slides (although that's the AAA way of doing it -- of course it's going to seem complex. I was looking more for the "simplest method" to get that algorithmic benefit). https://www.gdcvault.com/play/1025420/C ... g-and-Anti

I'm not sure how clustered forward or deferred rendering would help though since most of Arena's lights are big. From what I'm reading, deferred rendering is best when you have a lot of small lights. I'm not aware of a more practical way to optimize lighting yet; i.e., if I have a triangle in space with some lights around it, I'm just going to check which ones are visible to the front face of the triangle, and then do a for loop over them.

I wonder how Arena even renders lights in the original engine. It appears to split the screen up into tiles and each tile has some progressively-refined resolution depending on how much the light changes over the tile, I guess?

Maybe it would be worth doing things the original way I was thinking: loop over every nearby light and return early if the light magnitude ever reaches 1.0. That might be good enough. I recently upgraded to a 4k monitor and I'm now a bit more concerned about the price of per-pixel shading in my software renderer. That's one reason why I joined this discussion :P

MajinCry
Posts: 40
Joined: 15 Oct 2014, 21:02

Re: Unlimited Light Sources with Clustered Forward Shading

Post by MajinCry » 04 Mar 2019, 20:32

Deferred Shading is not as specific in use as just for when there are many small lights. With DS, draw calls concerning lights are only submitted for as many objects that are being affected by a given light. So if you have five lights in the scene, of 300 unlit objects with 30 objects being affected by all 5, we end up with (5 x 30 = ) 150 added draw calls, for a total of 450.

But with forward rendering, that same scene would issue (330 * 5 = ) 1650 draw calls. Which is extremely bad. It might be passable on a Direct3D 11 renderer with a Ryzen or Intel xLake CPU, but on OpenGL, even with a hot CPU? Yikes.

Forward rendering is really only used for games that don't have lights, and if they do, the scenes they're present in are extremely barebones. As a comparison, Ocarina of Time uses forward rendering, but Wind Waker uses deferred rendering.

You can also look at Oblivion's utterly terrible performance. Ever wondered why your framerate just plummets when it's raining, and a guard walks towards you with his torch's light fading in? Because the moment that light came into the scene, the entire scene was redrawn. Even if it only affects a few objects, the forward renderer doesn't care, and will re-render everything.

wareya
Posts: 335
Joined: 09 May 2015, 13:07

Re: Unlimited Light Sources with Clustered Forward Shading

Post by wareya » 04 Mar 2019, 20:55

I'd like to emphasize that comparing modern rendering architectural patterns to anything that an N64 game did is a very bad idea. If you're talking about the 3DS remake though then you might be right, I don't know.

AnyOldName3
Posts: 1477
Joined: 26 Nov 2015, 03:25

Re: Unlimited Light Sources with Clustered Forward Shading

Post by AnyOldName3 » 04 Mar 2019, 21:07

MajinCry wrote:
04 Mar 2019, 20:32
Deferred Shading is not as specific in use as just for when there are many small lights. With DS, draw calls concerning lights are only submitted for as many objects that are being affected by a given light. So if you have five lights in the scene, of 300 unlit objects with 30 objects being affected by all 5, we end up with (5 x 30 = ) 150 added draw calls, for a total of 450.

But with forward rendering, that same scene would issue (330 * 5 = ) 1650 draw calls. Which is extremely bad. It might be passable on a Direct3D 11 renderer with a Ryzen or Intel xLake CPU, but on OpenGL, even with a hot CPU? Yikes.

Forward rendering is really only used for games that don't have lights, and if they do, the scenes they're present in are extremely barebones. As a comparison, Ocarina of Time uses forward rendering, but Wind Waker uses deferred rendering.

You can also look at Oblivion's utterly terrible performance. Ever wondered why your framerate just plummets when it's raining, and a guard walks towards you with his torch's light fading in? Because the moment that light came into the scene, the entire scene was redrawn. Even if it only affects a few objects, the forward renderer doesn't care, and will re-render everything.
Nearly every word of this is wrong, which is quite a feat.

Forward rendering sets up the lights (either by assigning state to the fixed function lights or by setting up uniforms) then draws things (with one draw call per object) and potentially changes which lights are active from object to object. The problem is not with draw calls, but with having to iterate over every active light in the shader, which takes a lot of time for no reason if most of the lights aren't near the object being drawn.
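To make the cost concrete, here is a toy version of that per-fragment loop. It is only a sketch: the `(position, radius, intensity)` light tuples and the linear falloff are assumptions for the example, not how any particular renderer stores or attenuates lights.

```python
import math


def shade_forward(frag_pos, lights):
    """Classical forward shading of one fragment: visit every active light.

    `lights` is a list of (position, radius, intensity) tuples (hypothetical
    layout). Returns (brightness, lights_visited, lights_contributing).
    """
    brightness = 0.0
    contributing = 0
    for pos, radius, intensity in lights:
        dist = math.dist(frag_pos, pos)
        if dist >= radius:
            # Out of range: the iteration was spent for nothing.
            continue
        # Linear falloff, chosen only to keep the example short.
        brightness += intensity * (1.0 - dist / radius)
        contributing += 1
    return brightness, len(lights), contributing


# One nearby light and seven distant ones: all eight are iterated,
# but only one of them changes the result.
lights = [((0.0, 0.0, 0.0), 10.0, 1.0)] + [((1000.0, 0.0, 0.0), 5.0, 1.0)] * 7
result = shade_forward((0.0, 0.0, 0.0), lights)   # (1.0, 8, 1)
```

Tiled and clustered approaches attack exactly this gap between lights visited and lights contributing, by handing each fragment a short pre-culled list instead of the full set.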

Deferred rendering draws the objects (with one draw call each) but doesn't colour them in immediately. Instead, it stores details of what the material in each pixel is like, and then uses that data to actually colour in the pixel later. When you're colouring in, you no longer care about which objects are which, because you basically have a single object that covers the whole framebuffer. You do the colouring in by drawing a volume representing where each light affects, and for the pixels this gets drawn to, the shader adds the contribution of the light to the final framebuffer.
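The two passes can be mimicked on a one-dimensional "screen". This is a deliberately simplified sketch: the tuple layouts, the function names `geometry_pass` and `lighting_pass`, and the choice to give each light a pixel range rather than a real 3D volume are all assumptions made for the example.

```python
def geometry_pass(objects, width):
    """Pass 1: record per-pixel surface data instead of a final colour.

    Each object is (first_pixel, last_pixel, depth, albedo); the nearest
    surface wins, as with a depth test.
    """
    gbuffer = [None] * width
    for first, last, depth, albedo in objects:
        for x in range(first, last + 1):
            if gbuffer[x] is None or depth < gbuffer[x][0]:
                gbuffer[x] = (depth, albedo)
    return gbuffer


def lighting_pass(gbuffer, lights):
    """Pass 2: each light touches only the pixels its volume covers.

    Each light is (first_pixel, last_pixel, intensity). Which object a
    pixel came from no longer matters at this point.
    """
    frame = [0.0] * len(gbuffer)
    for first, last, intensity in lights:
        for x in range(max(first, 0), min(last, len(gbuffer) - 1) + 1):
            if gbuffer[x] is not None:
                frame[x] += gbuffer[x][1] * intensity
    return frame


objects = [(0, 3, 5.0, 0.5), (2, 5, 2.0, 1.0)]   # second object is nearer
lights = [(1, 2, 1.0), (2, 6, 0.5)]
frame = lighting_pass(geometry_pass(objects, 8), lights)
# frame == [0.0, 0.5, 1.5, 0.5, 0.5, 0.5, 0.0, 0.0]
```

Note that the object count shows up only in the first pass and the light count only in the second, which is the decoupling the description above is getting at.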

Neither really aims to cut back on draw calls, but both aim to make the draw calls that are issued as simple and quick as possible.
AnyOldName3, Master of Shadows

CMAugust
Posts: 168
Joined: 10 Jan 2016, 00:13

Re: Unlimited Light Sources with Clustered Forward Shading

Post by CMAugust » 04 Mar 2019, 23:03

Furthermore, there is a short demonstration in the OP showcasing the performance differential between classical/tiled/clustered forward methods - and more are provided in the lengthier videos and slides. I'll link more to this topic as I find them.

Sagacity
Posts: 8
Joined: 05 Mar 2019, 12:58

Re: Unlimited Light Sources with Clustered Forward Shading

Post by Sagacity » 05 Mar 2019, 13:39

AnyOldName3 wrote:
Nearly every word of this is wrong, which is quite a feat.
To add to this, there are many high-performing games that use forward rendering, or some variant thereof. Doom 2016 specifically uses what OP is suggesting OpenMW use. It certainly runs well and has lots of lights. That doesn't necessarily mean it is a good application for Morrowind, but the suggestion that Forward Rendering is what makes/made Oblivion run slow is...well it's ignorant to be frank.

With all of that said, both Forward and Deferred shading suffer from performance decreases directly associated with the number of lights in a scene. Deferred lighting attempts to solve this scaling issue by calculating lights all in one fell swoop. This is useful in that it makes the performance of a light calculation independent of the number of lights in the scene.

It has its own issues, however. Transparency and shadows essentially nullify all advantages deferred shading has over forward shading. Transparency, chiefly, forces you to use forward shading as a fall-back for when deferred shading simply will not work. Similarly, you must do extra lighting passes when computing shadows, and the associated complexity is equal to, if not larger than, that of forward shading. You also don't have to store as much framebuffer data when using a forward shading technique.

Techniques such as clustered forward shading basically entirely solve forward shading's issues without incurring the disadvantages deferred incurs. This means that you can basically have your cake and eat it too.

But really, in the end, the most important thing to note here is that Morrowind, and really any Elder Scrolls game, is nowhere near reaching the limits of any of these shading techniques. The only game that does is Fallout 4. Forward Shading only really begins to crumble on modern hardware when going near or above hundreds of lights in a scene, Deferred at thousands of lights, and Clustered at even more.

In my opinion Clustered Forward would be the best for OpenMW as it does not incur the high fixed cost that we see in deferred rendering, and also avoids excess VRAM usage. It also allows a large number of lights, far more than any future Bethesda RPG will conceivably need, and works on DX10 hardware.

Ultimately, AnyOldName3 is the graphics guru, and he is more intimately in tune with the engine's graphical back-end than I am, so I trust him to be making the correct decisions. Or at least, as close as possible. At the end of the day, this being an open source project means that us making a mistake now does not screw us over forever.

Also, draw calls really aren't an issue in any of the games OpenMW targets. It's really a matter of shader complexity, or CPU bottleneck. We're simply not drawing enough independent crap on screen for draw calls to become an issue. Modern graphics APIs permit millions of draw calls. To put that in perspective, Fallout 4 rarely goes above 100,000 draw calls. That's assuming you're sitting in a settlement with tons and tons of shadow casting lights.

CMAugust
Posts: 168
Joined: 10 Jan 2016, 00:13

Re: Unlimited Light Sources with Clustered Forward Shading

Post by CMAugust » 05 Mar 2019, 14:48

Adding the TVCG research paper More Efficient Virtual Shadow Maps for Many Lights, which was mentioned in the SIGGRAPH 2015 course, to the OP.
(Alternative link)

"In this paper, we explore the use of hardware-supported virtual cube-map shadows to efficiently implement high-quality shadows from hundreds of light sources in real time and within a bounded memory footprint ... Our solution supports real-time performance with hundreds of lights in fully dynamic high-detail scenes."
