CPU and Single Core Implications

scrawl
Posts: 2085
Joined: 18 Feb 2012, 11:51

Re: CPU and Single Core Implications

Post by scrawl » 04 Dec 2017, 22:44

I think you may have just outed yourself there.
Right, I agree; handling high-speed travel was definitely on my mind for the cell preloading, and from what I know, it works pretty well. I guess the point I was trying to make is that high speed combined with the broken 'prediction time' setting caused the loading issue, whereas just high speed or just a bad config on its own would probably have been OK.
Viewing distance doesn't really cause many issues as long as cell load is 1.
Yes, that's what I expect.
I still have it set to 413455 for when I was loading 50 grids. Getting decent fps everywhere and a good view. I think when distant statics happen this will be really nice.
Cool. Thanks for reporting back!

Xenuria
Posts: 25
Joined: 26 Feb 2017, 22:35

Re: CPU and Single Core Implications

Post by Xenuria » 04 Dec 2017, 22:51

If all the cells can be loaded at once without the engine crashing, what's to say the world can't just be frozen?

What if you could tell the engine to just STOP, don't do anything that isn't already in memory. Then I would be able to do 60fps flybys of the continent with everything loaded at once. Maybe stop is the wrong word. Notice in the video that even with everything loaded, AI turned off and scripts turned off, it still seems to be chugging. It would be cool if you could freeze everything and just walk around in a world so still and silent.

Chris
Posts: 1299
Joined: 04 Sep 2011, 08:33

Re: CPU and Single Core Implications

Post by Chris » 04 Dec 2017, 23:13

Xenuria wrote:
04 Dec 2017, 22:17
The whole point of Morrowind was that you became a literal god and could jump continents, pickpocket from orbit, command entire towns with magic, etc. It's written into the main quest and is a core part of the story.
Eh, no. Just because the game was buggy and easily exploitable doesn't mean that's the point of the game. That may be how you play, but veterans of a game often play differently than intended because they're so used to the mechanics (Morrowind/OpenMW isn't unique in this). It's also stated nowhere in the game that you become a god; in fact, the whole point of the MQ was for you to bring down the false gods because Azura didn't like upstarts, and Vivec purposely never taught you the final step to achieve apotheosis with the heart since you needed to destroy/banish it to stop Dagoth Ur.
What if you could tell the engine to just STOP, don't do anything that isn't already in memory.
Because it's not just a matter of being in memory. The engine also has to tell the GPU what to draw, and the more it's told to draw, the more work the CPU needs to do (regardless of any AI or scripts). It's a well-known problem in OpenGL and D3D that to draw something, the driver and GPU need to do state validation because something could be set improperly (even if nothing is), and the state validation isn't cheap, so the more you draw, the more unnecessary validation it needs to do. Morrowind's assets weren't designed to be efficient with the number of draw calls because it wasn't intended for many things to be visible at once. With Vulkan and D3D12 (and OpenGL4 with extensions) it's possible to reduce the amount of state validation by minimizing the number of necessary changes, preloading state onto the GPU, and doing "indirect draws" while poking GPU memory directly, but that requires designing the renderer around those capabilities.
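To make the "minimizing the number of necessary changes" part concrete, here is a toy sketch in Python. It is not OpenMW or driver code: the `DrawCall` tuple, the object names, and the idea that a state is just a (shader, texture) pair are all assumptions for illustration. It only counts how often the state differs from the previous draw, first in submission order and then after sorting by state.

```python
from collections import namedtuple

# Hypothetical draw-call record; a real renderer tracks far more state.
DrawCall = namedtuple("DrawCall", ["mesh", "shader", "texture"])

def count_state_changes(calls):
    """Count how often the (shader, texture) state differs from the previous draw."""
    changes = 0
    previous = None
    for call in calls:
        state = (call.shader, call.texture)
        if state != previous:
            changes += 1
            previous = state
    return changes

# Made-up scene: objects submitted in an order that alternates textures.
calls = [
    DrawCall("barrel_01", "lit", "wood"),
    DrawCall("wall_01", "lit", "stone"),
    DrawCall("barrel_02", "lit", "wood"),
    DrawCall("wall_02", "lit", "stone"),
    DrawCall("crate_01", "lit", "wood"),
]

unsorted_changes = count_state_changes(calls)

# Sorting by state groups draws that share a shader and texture.
sorted_calls = sorted(calls, key=lambda c: (c.shader, c.texture))
sorted_changes = count_state_changes(sorted_calls)

print(unsorted_changes, sorted_changes)
```

Note that sorting leaves the number of draw calls unchanged; it only cuts down on state switches between them. Reducing the draw-call count itself needs batching or indirect draws, which is the harder part described above.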

CMAugust
Posts: 65
Joined: 10 Jan 2016, 00:13

Re: CPU and Single Core Implications

Post by CMAugust » 05 Dec 2017, 04:12

scrawl wrote:
04 Dec 2017, 17:38
Probably not from me, or at least I can't think of any ATM; apart from Vulkan, which would be a very big undertaking. There was some talk of a Vulkan-based successor to OpenSceneGraph, but not much seems to have happened on that front yet.
Really? Some of the ideas in your OpenMW Trello sounded good. Then again, I wouldn't know which would actually constitute a significant performance increase. That, and they haven't been revisited since 2015...
Lagahan wrote:
04 Dec 2017, 20:55
Distant terrain is also very new to the OSG renderer and nowhere near optimized so just steer clear of it for now.
I hadn't heard this. So there's still plenty of room for improvement? I've noticed increased (or at least exacerbated) stuttering using it so further optimisations would be very welcome.

psi29a
Posts: 3452
Joined: 29 Sep 2011, 10:13
Github profile: https://github.com/psi29a/

Re: CPU and Single Core Implications

Post by psi29a » 05 Dec 2017, 09:23

As I mentioned earlier (and Lagahan dismissed it), there is still work to do on the physics side of things. Physics is single-threaded, and when too many collisions are happening it can cause your FPS to plummet because of how the physics simulation is linked to what is rendered. This is very hard to get right to begin with, and adding threads isn't going to solve the problem; it is just going to make maintenance a nightmare. That is why OpenMW is focused on being feature complete first and worrying about performance later.

We'll get there. :)

CMAugust
Posts: 65
Joined: 10 Jan 2016, 00:13

Re: CPU and Single Core Implications

Post by CMAugust » 05 Dec 2017, 13:31

@psi29a, maybe I just don't know enough about how the profiler or game logic works, but I'm not so sure that's the case anymore - at least in vanilla gameplay. I decided to do some testing, perched in the air above Seyda Neen, Ebonheart and Sadrith Mora. I took screenshots looking up, looking down, and to remove physics from the equation (or partly at least), paused and unpaused. The last of these cities is the worst of the bunch when it comes to physics in-game. Here it is:

Sadrith Mora, looking down
Sadrith Mora, looking up
Sadrith Mora, looking down (game paused)
Sadrith Mora, looking up (game paused)

In my testing across the three regions, draw calls take way more time than physics in nearly all cases. If draw calls actually were affected by whatever physics is doing (and I may just be misinterpreting your wording here), shouldn't we see an obvious drop in draw-call time when the game is paused?

Compare that to what was happening before the collision improvements, which is exactly what you'd expect to see with a physics bottleneck. The best I can do to smash physics these days is run into this poorly optimized mesh at high speed, which isn't ideal, but shows how much better things have got on the physics front.

This, on the other hand, is the new king of bad performance, and extended view distance with distant terrain just makes it worse. I hope there are still a few tricks left in the bag for OpenMW to use without needing a ton of extra work.

psi29a
Posts: 3452
Joined: 29 Sep 2011, 10:13
Github profile: https://github.com/psi29a/

Re: CPU and Single Core Implications

Post by psi29a » 05 Dec 2017, 13:47

Yes, there are many other areas that could use some love. Adding distant statics is going to add yet more load to the scene graph. Shadows are being worked on as well, and that is going to add additional load on the scene graph. These are all GPU-related things.

However, for me, these things are eye candy and can be turned on/off. You can't turn physics (CPU-bound, single-threaded) off and still play the game with any kind of satisfaction. It is so integral to OpenMW that, if this thread gets overloaded with simulations, FPS will die regardless of your viewing distance or how powerful your GPU is. Even something as simple as jumping with forward momentum up a hill will cause interpolation between frames that can cause a harsh jump between 30 and 60 fps. You might not see this; you might have a more powerful CPU whose single-threaded performance is superior to my 2nd-gen i7. I pity people with AMD CPUs. That doesn't mean the issue isn't there.

To my eyes, this particular sub-system is its Achilles heel.
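For anyone wondering how physics cost can feed back into frame rate, here is a generic fixed-timestep sketch in Python. It is not taken from OpenMW's code; the 1/60 s step, the function name, and the accumulator pattern are assumptions chosen only to illustrate the idea. The leftover fraction `alpha` is what a renderer would use to interpolate between the last two physics states.

```python
PHYSICS_DT = 1.0 / 60.0  # assumed fixed physics step in seconds (illustrative value)

def physics_steps_for_frame(frame_time, accumulator, dt=PHYSICS_DT):
    """Return (steps, new_accumulator, alpha) for one rendered frame.

    steps: how many fixed physics steps have to run this frame.
    alpha: leftover fraction of a step, usable for interpolating the rendered pose.
    """
    accumulator += frame_time
    steps = 0
    while accumulator >= dt:
        accumulator -= dt
        steps += 1
    alpha = accumulator / dt
    return steps, accumulator, alpha

# Made-up frame times: the longer the frame, the more physics steps it has to catch up on.
accumulator = 0.0
for frame_time in (1 / 60, 1 / 30, 1 / 15):
    steps, accumulator, alpha = physics_steps_for_frame(frame_time, accumulator)
    print(f"frame {frame_time * 1000:.1f} ms -> {steps} physics step(s)")
```

Under this scheme a slow frame has to run more physics steps on the same single thread, so an expensive collision scene can make the next frame slower still. Whether OpenMW behaves exactly like this is not something this sketch claims.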

CMAugust
Posts: 65
Joined: 10 Jan 2016, 00:13

Re: CPU and Single Core Implications

Post by CMAugust » 05 Dec 2017, 14:53

Well, you can't just throw a bigger GPU at draw calls, can you? As Lagahan's video demonstrated, not even a GTX 1080 Ti can keep OpenMW above 60 fps in today's problem areas with only a modest distance increase. But I accept your point that physics is still a problem for weak CPUs; I've only done testing with my i5 2500k clocked at stock 3.3 GHz.

scrawl
Posts: 2085
Joined: 18 Feb 2012, 11:51

Re: CPU and Single Core Implications

Post by scrawl » 05 Dec 2017, 15:20

When a game is released, its assets tend to be made so that CPU and GPU are stressed equally. If you have more room on the CPU, you add more different assets; if you have more room on the GPU, you add more vertices, shaders, larger textures, etc.

But as the game ages, advances in new GPUs tend to be much faster than in new CPUs, so it's only normal that older games will be limited by CPU.
Really? Some of the ideas in your OpenMW Trello sounded good. Then again, I wouldn't know which would actually constitute a significant performance increase. That, and they haven't been revisited since 2015...
Let's see:
'Multithreaded skinning', 'Particle rewrite' and 'Balanced scene graph' would have helped the Cull thread, if that was the bottleneck. Or we could improve the camera threading so that water reflections can be culled in another thread. Again, only helps if culling is a bottleneck.
'Shader lighting optimizations' would help the GPU thread, if it was the bottleneck, and if shaders are even enabled.

But from what I'm seeing in this thread, the Draw thread seems to be the bottleneck for most people, and I don't have many ideas for that.
It's possible that opting for Vertex Array Objects instead of display lists (to be released with OSG 3.6) would help with AMD GPUs (I know it doesn't help for Nvidia; display lists are still king there).
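The "only helps if X is the bottleneck" reasoning above can be shown with a toy model in Python. The stage names follow the threads mentioned in this post, but the millisecond figures are invented, and the assumption that all stages overlap perfectly (so frame time equals the slowest stage) is an idealization, not a measurement of OpenMW.

```python
def bottleneck(stage_ms):
    """Return (name, milliseconds) of the slowest pipeline stage."""
    name = max(stage_ms, key=stage_ms.get)
    return name, stage_ms[name]

def fps(stage_ms):
    """Frame rate if the stages overlap perfectly: limited by the slowest one."""
    _, slowest = bottleneck(stage_ms)
    return 1000.0 / slowest

# Hypothetical per-frame timings in milliseconds.
baseline = {"update": 3.0, "cull": 5.0, "draw": 20.0, "gpu": 8.0}

# Halving a stage that is not the bottleneck changes nothing.
faster_cull = dict(baseline, cull=2.5)

# Halving the bottleneck stage is what moves the frame rate.
faster_draw = dict(baseline, draw=10.0)

print(bottleneck(baseline), fps(baseline), fps(faster_cull), fps(faster_draw))
```

In this model, speeding up Cull leaves the frame rate where it was, while the same relative gain on Draw doubles it, which is why the Draw thread matters most if it really is the slowest stage.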

Really, we'd need some form of batching to properly improve performance with OpenGL, only that's a massive PITA for Morrowind because static/non-static objects aren't designated in any way, there are many lights that can move at any time, objects can be disabled by scripts, etc. It's possible to do this; it's just a lot of work, and there would probably be some additional stuttering and lighting glitches, as in the Ogre3D version. Couple this with the fact that such work will be obsolete when Vulkan comes around and you can see why I'm not keen on trying.
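A minimal Python sketch of what batching buys, and why the missing static/non-static designation hurts. The `static` flag, the object names, and the materials here are all hypothetical; Morrowind's data has no such flag, which is exactly the problem. Objects that might move or be disabled by a script cannot safely be merged, so they stay as individual draws.

```python
from collections import defaultdict

def build_batches(objects):
    """Group static objects by material; leave everything else unbatched."""
    batches = defaultdict(list)
    unbatched = []
    for obj in objects:
        if obj["static"]:
            batches[obj["material"]].append(obj["name"])
        else:
            unbatched.append(obj["name"])
    return dict(batches), unbatched

def draw_call_count(batches, unbatched):
    """One draw per merged batch plus one per object left on its own."""
    return len(batches) + len(unbatched)

# Made-up scene; the 'static' flag is an assumption the real game data lacks.
scene = [
    {"name": "wall_01", "material": "stone", "static": True},
    {"name": "wall_02", "material": "stone", "static": True},
    {"name": "wall_03", "material": "stone", "static": True},
    {"name": "wall_04", "material": "stone", "static": True},
    {"name": "crate_01", "material": "wood", "static": True},
    {"name": "crate_02", "material": "wood", "static": True},
    {"name": "scripted_door", "material": "wood", "static": False},
]

batches, unbatched = build_batches(scene)
print(len(scene), "draws unbatched ->", draw_call_count(batches, unbatched), "draws batched")
```

Every object that has to be treated as possibly dynamic falls into `unbatched`, so without reliable static information the savings shrink, and rebuilding batches when something changes is one plausible source of extra stutter.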

MajinCry
Posts: 39
Joined: 15 Oct 2014, 21:02

Re: CPU and Single Core Implications

Post by MajinCry » 05 Dec 2017, 16:36

scrawl wrote:
05 Dec 2017, 15:20

But from what I'm seeing in this thread, the Draw thread seems to be the bottleneck for most people, and I don't have many ideas for that.
It's possible that opting for Vertex Array Objects instead of display lists (to be released with OSG 3.6) would help with AMD GPUs (I know it doesn't help for Nvidia; display lists are still king there).

Really, we'd need some form of batching to properly improve performance with OpenGL, only that's a massive PITA for Morrowind because static/non-static objects aren't designated in any way, there are many lights that can move at any time, objects can be disabled by scripts, etc. It's possible to do this; it's just a lot of work, and there would probably be some additional stuttering and lighting glitches, as in the Ogre3D version. Couple this with the fact that such work will be obsolete when Vulkan comes around and you can see why I'm not keen on trying.
That's only if you use the bog-standard batching methods. Traditional static batching has a very narrow use case, and traditional dynamic batching is incredibly wasteful (repeatedly batching every mesh every frame). The creator of ENBSeries explained how to create a more usable, robust, and performant batching system over on this thread: http://enbseries.enbdev.com/forum/viewt ... 55e#p69747
Pack meshes into a few big vertex buffers and add data to each mesh, inside the buffer or in an external stream, that identifies the individual mesh, for example one float value. Use that value to index into any data, for example transformation matrices, which are selected in the shader similarly to bone matrices. This works fine with dynamic objects; all you need to do is remove deleted meshes from, or add new ones to, the large buffers. It can be done on hardware too via other tricks (don't ask that please).
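A rough CPU-side sketch of the quoted idea in Python, under heavy simplification. The function names are made up, `vertex_shader` is an ordinary Python function standing in for real shader code, and a plain translation offset stands in for a full transformation matrix. The point is only that each vertex carries a mesh index, and per-object data is looked up through that index.

```python
def pack_meshes(meshes):
    """Pack several meshes into one vertex list; each vertex carries its mesh index."""
    buffer = []
    for mesh_id, vertices in enumerate(meshes):
        for (x, y, z) in vertices:
            buffer.append((x, y, z, mesh_id))
    return buffer

def vertex_shader(vertex, transforms):
    """Stand-in for a shader: pick this mesh's transform by index and apply it."""
    x, y, z, mesh_id = vertex
    tx, ty, tz = transforms[mesh_id]  # translation only, as a simplification
    return (x + tx, y + ty, z + tz)

def draw(buffer, transforms):
    """One 'draw' over the whole packed buffer."""
    return [vertex_shader(v, transforms) for v in buffer]

# Two tiny made-up meshes and one transform per mesh.
meshes = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0)]]
transforms = [(10, 0, 0), (0, 5, 0)]

buffer = pack_meshes(meshes)
first_frame = draw(buffer, transforms)

# Moving the second object only touches its transform entry, not the big buffer.
transforms[1] = (0, 6, 0)
second_frame = draw(buffer, transforms)

print(first_frame, second_frame)
```

Because a moving object only changes its entry in `transforms`, the packed buffer does not need to be rebuilt every frame, which is the difference from the traditional dynamic batching criticized above.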
CPU - Phenom 965 BE @ 3.4GHz UV'd @ 1.2875V
GPU - 7850 2GB GDDR5.
RAM - 4x4GB DDR3 1333MHz @ 7-7-7-21
Mobo - ASROCK AM3 M3N78D
Soundcard - Creative Soundblaster X-Fi Titanium Fatal1ty Pro
OS - Win. 7 64bit.
