Better grass support

Feedback on past, current, and future development.
lysol
Posts: 1513
Joined: 26 Mar 2013, 01:48
Location: Sweden

Re: Better grass support

Post by lysol »

We know about VulkanSceneGraph. The main developer, Robert Osfield, has stated that you can't just change OSG to VSG overnight. It will take quite some work to get OpenMW's graphics running on VSG instead of OSG.

But we can all dream. I get the feeling though that most of the CPU issues can be solved with a good batching system, since it is Morrowind's assets that create all the draw calls (which are handled by the CPU). So hopefully we can get better CPU performance out of OSG as well.
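
To illustrate what a batching system buys you, here is a minimal sketch. It is not how OpenMW or OSG actually does it; the function name, the tuple layout, and the scene contents are all made up for the example. The idea is only that instances sharing a mesh and material can be submitted together instead of one draw call each.

```python
from collections import defaultdict

def batch_by_state(instances):
    """Group (mesh, material, transform) tuples so that each distinct
    (mesh, material) pair is submitted once with all of its transforms.
    Hypothetical helper, not an OSG or OpenMW API."""
    batches = defaultdict(list)
    for mesh, material, transform in instances:
        batches[(mesh, material)].append(transform)
    return dict(batches)

# Hypothetical scene: 1000 grass tufts and 200 rocks, one object each.
scene = [("grass", "grass_mat", i) for i in range(1000)]
scene += [("rock", "rock_mat", i) for i in range(200)]

batches = batch_by_state(scene)
print(len(scene), "unbatched draw calls ->", len(batches), "batched draw calls")
```

In a real engine the grouping key would also have to account for textures, shaders, and culling, so the reduction would be less dramatic than in this toy scene.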
FiftyTifty
Posts: 63
Joined: 15 Oct 2014, 21:02

Re: Better grass support

Post by FiftyTifty »

Time4Tea wrote: 09 Mar 2020, 12:51 Vulkan looks great. I get the impression OpenMW is quite CPU-limited, so anything that allows more extensive use of CPU parallelization would seem to be a good step. It looks like there is a 'sister' project to OSG - VulkanSceneGraph - currently being worked on. Wouldn't it be nice if the API was almost exactly compatible with OSG and all you had to do was run a script to replace the string 'osg' with 'vsg'? ;)
Bethesda's games have always been draw call limited. If you've ever seen your performance skyrocket when looking at the ground, that's because frustum culling has stopped many of the objects from being rendered. You can see this by using ProcessHacker and looking at the driver thread in use by the game's process.

Vulkan is many times more performant, and is embarrassingly parallel. It's of the same codebase as Direct3D 12, both of them being vendor agnostic variations of AMD's Mantle API. With a Vulkan renderer, and a CPU with more than a couple cores, it would be interesting to see how performance varies and what the bottlenecks then become.
AnyOldName3
Posts: 2676
Joined: 26 Nov 2015, 03:25

Re: Better grass support

Post by AnyOldName3 »

Performance improving when you look down is an effect visible in most games. Everything has some degree of culling, and unless you're drawing something really simple compared to the rest of what your game is doing, drawing less is going to make it faster. This also doesn't mean that you're draw call limited. There's a fixed minimum cost per draw call, plus a fixed minimum cost per n triangles, plus a cost per pixel that gets covered by those triangles that depends on what kinds of shading you're doing. If you've got a lot of triangles in each draw call, or you have lots of overdraw (where the same pixel gets coloured in multiple times because multiple triangles all become the temporary current-closest), or your shaders are really complicated, you're going to see performance improve when you draw fewer things even if the actual cost of submitting the draw calls is a tiny fraction of your frametime. Specifically with later Elder Scrolls games, they don't make the most efficient use of draw calls that they could, but they don't make the most efficient use of any resource that they could except their customers' wallets, and that's probably an accident. There's a lot more low-hanging optimisation fruit available.
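
As a rough sketch of the cost model above (a per-draw-call term, a per-triangle term, and a per-pixel term), something like the following shows how a frame can get much faster when you look down even though draw calls are a small share of the total. Every constant and scene number here is invented for illustration, and the model simply adds the terms, whereas real CPU and GPU work overlaps.

```python
def frame_cost_ms(draw_calls, triangles, pixels_shaded,
                  per_call_ms=0.002, per_1k_triangles_ms=0.002,
                  per_1m_pixels_ms=1.0):
    """Return the three cost terms and their total, in milliseconds.
    All per-unit costs are made-up illustrative constants."""
    calls = draw_calls * per_call_ms
    geometry = (triangles / 1000) * per_1k_triangles_ms
    shading = (pixels_shaded / 1_000_000) * per_1m_pixels_ms
    return {"calls": calls, "geometry": geometry, "shading": shading,
            "total": calls + geometry + shading}

# Looking at the horizon: many objects and a lot of overdraw.
horizon = frame_cost_ms(draw_calls=500, triangles=2_000_000,
                        pixels_shaded=8_000_000)
# Looking at the ground: culling removes most objects and most overdraw.
ground = frame_cost_ms(draw_calls=100, triangles=400_000,
                       pixels_shaded=2_500_000)

for name, cost in (("horizon", horizon), ("ground", ground)):
    share = cost["calls"] / cost["total"]
    print(f"{name}: {cost['total']:.1f} ms, draw calls are {share:.0%} of it")
```

With these invented numbers the frame gets several times cheaper when looking down, yet draw call submission was never the dominant term.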

I'd also object to calling Vulkan embarrassingly parallel. That's a specific term that means a specific thing (roughly that doubling the number of cores available will halve the time something takes, up to a ridiculous number of cores), and it doesn't apply here. In rendering, there'll always be things that need to happen sequentially, like drawing shadow maps before drawing shadows, or the main frame before post-processing, or having to actually finish the frame before displaying it, and some of these steps can't even happen in parallel with one another. That's not a feature of embarrassingly parallel problems. Vulkan might not do anything to stop you submitting work from a bazillion different cores at once, unlike older APIs, but it doesn't magically mean you actually can either.
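
One common way to put numbers on this is Amdahl's law, which isn't named above but describes the same effect: if some fraction of the work has to run sequentially, adding cores stops helping fairly quickly. The serial fractions below are hypothetical, not measurements of any renderer.

```python
def amdahl_speedup(serial_fraction, cores):
    """Speedup over one core when serial_fraction of the work
    cannot be parallelised (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for cores in (1, 2, 4, 8, 64, 1024):
    embarrassing = amdahl_speedup(0.0, cores)   # no serial part at all
    with_serial = amdahl_speedup(0.2, cores)    # assume 20% must run in order
    print(f"{cores:5d} cores: {embarrassing:7.1f}x vs {with_serial:4.2f}x")
```

With no serial part, doubling the cores doubles the speedup. With an assumed 20% serial part, the speedup can never reach 5x no matter how many cores you add.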

Also, I don't like the 'word' "performant". Any time you can say it, you can also say "fast", except when the phrase is "more performant", in which case you can say "faster". This is clearer, uses fewer syllables, and is less likely to make you sound like you're trying to show off. This isn't a technical correction to your post, though. It's just a pet peeve of mine.
FiftyTifty
Posts: 63
Joined: 15 Oct 2014, 21:02

Re: Better grass support

Post by FiftyTifty »

AnyOldName3 wrote: 10 Mar 2020, 21:54 Performance improving when you look down is an effect visible in most games. Everything has some degree of culling, and unless you're drawing something really simple compared to the rest of what your game is doing, drawing less is going to make it faster. This also doesn't mean that you're draw call limited. There's a fixed minimum cost per draw call, plus a fixed minimum cost per n triangles, plus a cost per pixel that gets covered by those triangles that depends on what kinds of shading you're doing. If you've got a lot of triangles in each draw call, or you have lots of overdraw (where the same pixel gets coloured in multiple times because multiple triangles all become the temporary current-closest), or your shaders are really complicated, you're going to see performance improve when you draw fewer things even if the actual cost of submitting the draw calls is a tiny fraction of your frametime. Specifically with later Elder Scrolls games, they don't make the most efficient use of draw calls that they could, but they don't make the most efficient use of any resource that they could except their customers' wallets, and that's probably an accident. There's a lot more low-hanging optimisation fruit available.

I'd also object to calling Vulkan embarrassingly parallel. That's a specific term that means a specific thing (roughly that doubling the number of cores available will halve the time something takes up to a ridiculous number of cores), and it doesn't apply here. In rendering, there'll always be things that need to happen sequentially, like drawing shadow maps before drawing shadows, or the main frame before post-processing, or having to actually finish the frame before displaying it, and some of these steps can't even happen in parallel with another. That's not a feature of embarrassingly parallel problems. Vulkan might not do anything to stop you submitting work from a bazillion different cores at once, unlike older APIs, but it doesn't magically mean you actually can either.

Also, I don't like the 'word' performant. Any time you can say it, you can also say fast, except when the phrase is more performant, in which case you can say faster. This is clearer, uses fewer syllables, and is less likely to make you sound like you're trying to show off. This isn't a technical correction to your post, though. It's just a pet peeve of mine.
It's easy enough to tell if you're draw call limited, especially with lower polygon budgets. Chuck in a beefy GPU, and see how the framerate fares. Additionally, check the driver thread with ProcessHacker.

Vulkan, being a vendor-agnostic implementation of Mantle, will use all the cores you can throw at it. See: https://www.anandtech.com/show/7371/und ... pi-for-gcn & http://www.redgamingtech.com/investigat ... -analysis/

Embarrassingly parallel is probs a bit hyperbolic, but the point still stands. The Mantle family of APIs will eat up all the cores you throw at them, dethroning draw calls as the biggest bottleneck.
AnyOldName3
Posts: 2676
Joined: 26 Nov 2015, 03:25

Re: Better grass support

Post by AnyOldName3 »

My points are that:
  • Things can be limited by things other than draw calls, even when the symptoms are very similar. If you put in a much faster GPU and find things are no faster, you could also be limited by anything else, such as physics on the CPU. The driver using a lot of CPU time isn't always an indicator of draw calls being the limiting factor, either - it's less of a problem these days than it used to be, but sometimes certain features of a graphics API are implemented in software by the driver instead of by hardware, or (more likely these days) you've tried doing something that forces a pipeline flush and the CPU is polling the GPU to ask when it's done so it can get a result back.
  • Vulkan helps, but isn't a magic bullet that fixes everything:
    • Removing one of the restrictions stopping something using more cores doesn't mean you've removed all the restrictions stopping something using more cores. "Use all the cores you can throw at it" is a good phrase, because often you can't throw more cores at it, at least not without redesigning things.
    • If you still make a bazillion draw calls, even if they're split over a handful of cores and there's a reduced amount of state validation making each one cheaper, you've still got a significant fraction of a bazillion draw calls on each thread, and there's still a cost for each one, so it's still slower than drawing the same thing with fewer draw calls.
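
The last point can be sketched with some back-of-the-envelope arithmetic. The per-call costs, thread counts, and draw call counts below are invented for illustration, and the model assumes submission splits perfectly evenly across threads, which is the best case.

```python
import math

def submission_time_ms(draw_calls, threads, per_call_ms):
    """Wall-clock time to submit draw_calls split evenly over threads.
    Assumes perfect scaling, which real engines rarely get."""
    calls_per_thread = math.ceil(draw_calls / threads)
    return calls_per_thread * per_call_ms

# Hypothetical numbers: 100,000 draw calls at 0.010 ms each on one thread.
single = submission_time_ms(100_000, threads=1, per_call_ms=0.010)
# Same calls over 8 threads, each call half as expensive (less validation).
threaded = submission_time_ms(100_000, threads=8, per_call_ms=0.005)
# Batched down to 2,000 calls, still on one thread at the old per-call cost.
batched = submission_time_ms(2_000, threads=1, per_call_ms=0.010)

print(f"single-threaded: {single:.1f} ms")
print(f"8 threads, cheaper calls: {threaded:.1f} ms")
print(f"batched, single-threaded: {batched:.1f} ms")
```

Under these assumptions the threaded version is a big improvement, but drawing the same thing with fewer calls still wins, and the two approaches aren't mutually exclusive.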
Sagacity
Posts: 31
Joined: 05 Mar 2019, 12:58

Re: Better grass support

Post by Sagacity »

AnyOldName3 wrote: 10 Mar 2020, 21:54 In rendering, there'll always be things that need to happen sequentially, like drawing shadow maps before drawing shadows, or the main frame before post-processing, or having to actually finish the frame before displaying it, and some of these steps can't even happen in parallel with another.
To be fair here, with the increasing popularity of ray tracers and path tracers as full or near-full replacements for traditional rasterizers, this may not be so true in the future. Ray tracing really is an embarrassingly parallel rendering process, even given its sequential aspects. Throwing more cores at it yields almost linear performance gains. Ironically, this is another property of ray tracing that I believe will push it to general adoption, aside from its realism. Its performance is far closer to linear with the number of shadow-casting light sources than rasterization's is.

Rasterization is great since its performance is linear in the number of triangles drawn, and generally the number of triangles you were drawing was the issue when we first started rendering things. But now we have a different issue: we're spending basically no time on general geometric and texturing passes, but almost all of our time on lighting and shading passes. Ray tracing makes the lighting pass more linear, and simplifies the shading passes (since fewer tricks are required to simulate reality).
Chris
Posts: 1626
Joined: 04 Sep 2011, 08:33

Re: Better grass support

Post by Chris »

Sagacity wrote: 04 Apr 2020, 00:41 To be fair here, with the increasing popularity of raytracing and path tracers as full or near-full replacements to traditional raster bases, this may not be so true in the future. Ray tracing really is an embarrassingly parallel rendering process, even given it's sequential aspects. Throwing more cores at it yields almost linearly associated performance gains.
That's because more of the rendering work is per-pixel, and each pixel can be processed in parallel. The trade-off is that the base performance is lower since each pixel can't share work (which would normally be preprocessed in an earlier pass), and the performance impact of increasing resolution is much higher since each added pixel needs to do all the same work as the others. And unless the increase in cores, core speed, and memory speed outpaces the increase in pixels, you'll be looking at a net loss. As it is, GPUs can process a pass's pixels and vertices in parallel with typical rasterization*. Similarly, a raw ray-traced image is pretty rough and "noisy", needing heavy post-process work to clean it up for viewing.

* Though some apps use shaders and techniques that require sequential operation with pixels, which obviously limits parallelism. But that's on the app, not the hardware.
Sagacity
Posts: 31
Joined: 05 Mar 2019, 12:58

Re: Better grass support

Post by Sagacity »

Chris wrote: 04 Apr 2020, 03:30
Sagacity wrote: 04 Apr 2020, 00:41 To be fair here, with the increasing popularity of raytracing and path tracers as full or near-full replacements to traditional raster bases, this may not be so true in the future. Ray tracing really is an embarrassingly parallel rendering process, even given it's sequential aspects. Throwing more cores at it yields almost linearly associated performance gains.
That's because more of the rendering work is per-pixel, and each pixel can be processed in parallel. The trade off is that the base performance is lower since each pixel can't share work (which would normally be preprocessed with an earlier pass), and the performance impact of increasing resolution is much higher since each added pixel needs to do all the same work as the others. And unless the increase in cores and core speed and memory speed outpaces the increase in pixels, you'll be looking at a net loss. As it is, GPUs can process a pass's pixels and vertices in parallel with typical rasterization*. Similarly, a raw ray-traced image is pretty rough and "noisy", needing heavy post-process work to clean it up for viewing.

* Though some apps use shaders and techniques that require sequential operation with pixels, which obviously limits parallelism. But that's on the app, not the hardware.
Things like sparse sampling, spatial reprojection, and other smart sampling methods make the cost less per-pixel, but the point is that your ability to multithread the render engine is directly proportional to how many pixels you have on screen plus how many ray intersections you wish to calculate... which is just an embarrassingly large number.

To give a good example, if you want to do bidirectional path tracing you can calculate a ray for every pixel times the number of lights plus one, all in parallel. At 1080p you'd have to have over 4 million cores to have more cores than possible threads. Even with plain path tracing you would need over 2 million.
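
The arithmetic behind those figures can be checked with a short sketch. It assumes a single light source, which the figure of "over 4 million" implies but which isn't stated outright, and the function name is made up for the example.

```python
def independent_rays(width, height, lights, bidirectional):
    """Number of rays that could in principle be traced in parallel:
    one per pixel for plain path tracing, or one per pixel times
    (lights + 1) for the bidirectional case described above."""
    pixels = width * height
    return pixels * (lights + 1) if bidirectional else pixels

# 1080p with one light (the single light is an assumption).
print("bidirectional:", independent_rays(1920, 1080, lights=1, bidirectional=True))
print("path tracing: ", independent_rays(1920, 1080, lights=1, bidirectional=False))
```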

Of course, various other factors change how parallel this process is. Things like denoisers can and will serialize the process. But the point is made, I believe, that ray tracing is stupidly parallel, and as we add complexity to the scene, the cost associated with ray tracing becomes more and more in line with the cost associated with rasterization.

While the base cost of ray tracing is high, the associated cost per effect is stupidly low. Rasterization is really good at providing a low cost for simpler scenes, but as lights and effects are added, the performance costs stack up quickly. The biggest two are shadows and reflections, but things like SAO and multiple viewports are another issue. Ray tracing can also use portals instead of multiple framebuffers when rendering multiple viewports, and it can handle non-Euclidean geometry. An interesting example would be a non-Euclidean house; this may also facilitate preloading cells and loading them in without much added graphical load.
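
A toy model of that trade-off, with a high base cost and low per-light cost for ray tracing against a low base cost and high per-light cost for rasterization, might look like the sketch below. The millisecond figures are invented purely to show the shape of the argument; they are not benchmarks of any renderer.

```python
def raster_cost_ms(lights, base_ms=2.0, per_light_ms=1.5):
    """Hypothetical rasterization cost: cheap base, expensive per light."""
    return base_ms + per_light_ms * lights

def raytrace_cost_ms(lights, base_ms=20.0, per_light_ms=0.3):
    """Hypothetical ray tracing cost: expensive base, cheap per light."""
    return base_ms + per_light_ms * lights

def crossover_lights(raster_base, raster_per_light, rt_base, rt_per_light):
    """Light count at which the two linear cost models are equal,
    or None if rasterization never becomes the more expensive one."""
    if raster_per_light <= rt_per_light:
        return None
    return (rt_base - raster_base) / (raster_per_light - rt_per_light)

for lights in (1, 4, 16, 32):
    print(f"{lights:2d} lights: raster {raster_cost_ms(lights):5.1f} ms, "
          f"ray traced {raytrace_cost_ms(lights):5.1f} ms")
print("break-even at about", crossover_lights(2.0, 1.5, 20.0, 0.3), "lights")
```

Where the break-even point actually falls depends entirely on the real constants, which is the part being debated here.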
Chris
Posts: 1626
Joined: 04 Sep 2011, 08:33

Re: Better grass support

Post by Chris »

Sagacity wrote: 10 Apr 2020, 21:18 But the point is made, I believe, that ray tracing is stupidly parallel, and as we add complexity to the scene, the cost associated with ray tracing becomes more and more inline with the cost associated with rasterization.
While technically true, it's an uphill battle. Ray tracing has the disadvantage of higher per-pixel costs (spatially and temporally), so normal rasterization is going to have an edge as pixel count and frame rate increase. Similarly, because a fair amount of rasterization's cost is in effect pre-pass and post-process work, it's easier to scale down to weaker hardware by disabling effects, using simpler implementations, or updating intermediate buffers less often, while ray tracing scales down mainly by reducing resolution and frame rate. And it's not as if normal rasterization won't also benefit from an increase in cores, so at least for games, I don't see the cost:benefit ratio being very favorable.
Sagacity
Posts: 31
Joined: 05 Mar 2019, 12:58

Re: Better grass support

Post by Sagacity »

Chris wrote: 11 Apr 2020, 17:26
Sagacity wrote: 10 Apr 2020, 21:18 But the point is made, I believe, that ray tracing is stupidly parallel, and as we add complexity to the scene, the cost associated with ray tracing becomes more and more inline with the cost associated with rasterization.
While technically true, it's an uphill battle. Raytracing has the disadvantage of higher per-pixel costs (spatially and temporally), so normal rasterization is going to have an edge while pixel count and frame rate increases. Similarly, due to a fair amount of rasterization's costs being in effect pre-pass and post-process work, it's easier to scale down to weaker hardware by disabling effects, using simpler implementations, or updating intermediate buffers less often, while raytracing scales down mainly by reducing resolution and frame rate. And it's not as if normal rasterization won't also benefit from an increase in cores either, so at least for games, I don't see the cost:benefit ratio being very favorable.
I disagree greatly. There comes a point where, in order to get the desired effects, ray tracing becomes more efficient. This point is what I am discussing. It is just over the horizon, and it is why hardware-accelerated ray tracing is becoming more and more relevant, either with purpose-built hardware or with GPGPU compute. The fact of the matter is, as you ask for more physically accurate reflections and shadows, you quickly approach the point at which ray tracing becomes more efficient than rasterization, especially if the geometric complexity of the scene is high. Scaling down will certainly be relevant for the near future, but there will come a time when scaling down is just pointless, or the developers make an intentional choice not to scale down further. Having good scaling with resolution is also a boon to ray tracing, since it means you can run specific effects at reduced resolution to save performance, or even dynamically adjust resolution, for a very direct effect on performance. It's hard to optimize rasterized games, but with ray tracing, the equation is rather simple. We can also show that it's perfectly possible to do realtime ray tracing right now, and the graphical leap can be astounding, especially when things like GI and shadow casters get mixed in together.