Optimisation

Everything about development and the OpenMW source code.
Post by lgromanowski » 13 Aug 2011, 21:22

gus wrote: To be honest, I had considered a very simple optimization: put everything in a cell (except NPCs) into a single staticGeometry, so there would be one static geometry per cell. This should be fairly simple to do and should provide a good performance boost.

But maybe you have other ideas.
Zini wrote: I gave you a couple of options in the other thread. I don't know if they are viable. I think this feature needs a lot of experimentation. Implement different variants and measure (preferably measure in multiple locations).
gus wrote: stupid question: how to display FPS?
Zini wrote: Run openmw with the --showfps switch or add showfps to the openmw.cfg file.
gus wrote: The problem is more complicated than I thought. Just changing to staticGeometry had no effect at all. So I took a GPU profiler, and it seems that a few objects (1 or 2 in interior cells, 5 or 6 in exteriors), not necessarily complicated ones, take most of the GPU time...
Zini wrote:
Just changing to staticGeometry had no effect at all.
That probably means, that something is wrong here. Reducing the number of batches should increase the performance notably.

I can only guess here, but from the OGRE static geometry documentation:
Warning: this class only works with indexed triangle lists at the moment, do not pass it triangle strips, fans or lines / points, or unindexed geometry.
Do we know how the nif files are transformed into OGRE mesh data? Do they match the criteria given above?
gus wrote:
Do we know how the nif files are transformed into OGRE mesh data? Do they match the criteria given above?
Unless I'm mistaken, they do.
That probably means, that something is wrong here. Reducing the number of batches should increase the performance notably.
I am not that sure.
In fact, one or two objects take 99% of the rendering time, and there is nothing special about them, such as a huge number of vertices or a heavy shader. I think I should ask on the Ogre forum what could be causing it, because with or without batching, this is definitely not normal behavior.
Zini wrote: In Morrowind NPCs consume a large part of the rendering power, because they have a lot more faces than most other meshes. But it certainly shouldn't be 99%. Can you somehow identify which objects are causing this problem?
gus wrote: Yup. I can't see the name, but I can see the mesh. It's not NPCs. It seems to be random objects, unrelated to their complexity. Sometimes they have 6000 vertices, other times 40...

To be sure, I tried setting a very low LOD. It did not change the FPS, even though there were roughly half as many vertices.
Zini wrote: Sounds like a bug in the NIF -> mesh converter.

Edit: Or maybe not. After reading your last posting, I think I might have misunderstood what you wanted to say.
gus wrote:
Sounds like a bug in the NIF -> mesh converter.

Edit: Or maybe not. After reading your last posting, I think I might have misunderstood what you wanted to say.
Maybe it is^^ it might be some buffer allocated in the wrong memory or something.
gus wrote: I managed to get a 2x speed-up in Vivec. The cause was that a different material was created for each mesh, and staticGeometry can only batch geometry that shares a material.

Also, I located one "buggy" mesh. You were right, it's part of an NPC. But there must be a bug in the creation of the mesh, because it has only 140 vertices and it takes most of the GPU time (7 ms to process, which is huge!)

Edit: is there an option to disable NPC rendering?
Zini wrote: Not really. But you could simply comment out the body of the Npc::insertObj function (in apps/openmw/mwclass/npc.cpp).
gus wrote: I tried to just load NIF objects outside of OpenMW. There is no strange slow down in this case, so the slow down doesn't come from the NIFLoader code. There must be something else in OpenMW slowing everything down. I don't know what yet.
Zini wrote: Well, the obvious culprit would be Npc::insertObj in apps/openmw/mwclass.

This is a pretty huge function and the only place where NPCs are different from other objects. To be honest I have no idea how exactly this function is working.
gus wrote: Well, it isn't. I just commented out the whole function (nasty xD), but FPS is as low as usual.
Zini wrote: So the meshes that cause the slowdown actually aren't NPCs? Because without this function nothing should be rendered for the NPC references.
gus wrote: It seems so. In fact, I made a dirty hack so that every mesh loaded is the same .nif in order to simplify profiling, but it might have been something related to the skeleton or something like that.
I'm trying to disable as many things as possible to eliminate possible causes of the slowdown. Right now, I have also disabled lights, sky and fog.
Peppe wrote: Doesn't Visual C++ have a profiler?
Zini wrote: Well, we are talking about the GPU here, right? Unless I am completely mistaken, the MSVC profiler only handles the CPU.
gus wrote: I use AMD GPU profiler ;)

Good news: I *think* I found the problem. Removing the GUI really improved performance, and oddly (I don't see any reason...) it removed the "bug" (i.e. some objects taking way too much time to render). I will see how I can improve the performance of the GUI.

Also, the functions executeLocalScript and globalScript take quite some time. Would it be possible to execute them every 100 ms, for example, instead of every frame?
Zini wrote: They must execute every frame or some more advanced plugins will break (Redemption will for sure). Actually, global scripts are throttled by MW when the framerate starts to drop, but not locals.
I would also like to add, that I haven't done anything about optimizing the script system yet, so there is still some room for improvement. But also, script execution is usually one of the most CPU-intense tasks in a game.

I don't think scripting performance improvements will be of much use here anyway, since the GPU is the bottleneck, not the CPU.
gus wrote:
I don't think scripting performance improvements will be of much use here anyway, since the GPU is the bottleneck, not the CPU.
Well, I commented out the executeLocale/global script, and I got a rather good FPS boost, even though the app is GPU bound. But it's true that if we can only slightly increase performance, it's of no use.
Zini wrote: I wouldn't worry too much about that for now. But just to make sure, you are measuring while staying in the same cell, right? Currently OpenMW compiles new scripts on the fly when encountering them. That obviously is slow. We may offer pre-compiled scripts again, when we have an editor of our own.

Also, you could try to replace the function OMW::Engine::frameStarted with frameRenderingQueued (same signature). This might help with performance on the CPU side.
gus wrote: Just to give a little update: Ogre static geometry works best when every object has the same material. So I'm working on creating a texture atlas on the fly, so that every object can share one material.
Zini wrote: For every object in the scene? Does that really make sense? MW is kinda dynamic. Objects can constantly be added or removed. Also, wouldn't such a solution cause problems, once we allow more complex materials with custom shaders?
gus wrote: It is only possible to batch geometry if objects have the same material. So 500 different materials -> 500 batches. That's why a texture atlas might help a lot.
MW is kinda dynamic. Objects can constantly be added or removed.
In exterior scenes, which are the most troublesome, most objects are static. So using batching for static objects only should already provide a good performance boost.
Also, wouldn't such a solution cause problems, once we allow more complex materials with custom shaders?
This is indeed a problem. But it's possible for nearly all objects to use the same shader (you can pack bump mapping, shadows, lighting and whatever else you want into a single shader, and pass parameters via a texture), and if only a small number of objects in a cell (say 5 or 6) use a custom shader, it should not impact performance too much.
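The batching argument in this post can be illustrated with a simple count. This is a minimal sketch under stated assumptions, not OpenMW or OGRE code: `SceneObject` and `countBatches` are hypothetical names, static objects that share a material are assumed to merge into one batch, and every non-static object is assumed to cost one batch of its own. Real static geometry also splits batches by region and LOD, which is ignored here.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a placed object; not an OpenMW or OGRE type.
struct SceneObject
{
    std::string mesh;
    std::string material;
    bool isStatic;
};

// Estimate the number of batches for a scene, assuming that static objects
// sharing a material are merged and each dynamic object is drawn on its own.
std::size_t countBatches(const std::vector<SceneObject>& objects)
{
    std::set<std::string> staticMaterials;
    std::size_t dynamicObjects = 0;
    for (const SceneObject& object : objects)
    {
        if (object.isStatic)
            staticMaterials.insert(object.material);
        else
            ++dynamicObjects;
    }
    return staticMaterials.size() + dynamicObjects;
}
```

Under these assumptions, collapsing many materials into one (for example through an atlas) reduces the static part of the count to a single batch, which is the effect the post is aiming for.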
Zini wrote:
It is only possible to batch geometry if objects have the same material. So 500 different materials -> 500 batches.
Yes, I am aware of that. But is this number realistic? In an average MW cell, the same IDs are used often and not many cells have that many IDs in the first place. Same is probably true for most plugins and TCs.
Have you already made sure, that IDs that use the same textures only use one material?

Also, even if there are 500 batches. Shouldn't today's hardware be able to handle that at a decent framerate?
In exteriors scene, which are the most troublesome, most objects are static.
That is true for Morrowind.esm, but may not be true for plugins or TCs. You can't rely on this fact.


I am not totally opposed to the idea of a texture atlas, but do we really need it? What kind of performance improvements have you reached yet? Some actual numbers would be nice.
gus wrote:
Have you already made sure, that IDs that use the same textures only use one material?
Yup. In my code: one texture->one material.

I am not totally opposed to the idea of a texture atlas, but do we really need it? What kind of performance improvements have you reached yet? Some actual numbers would be nice.
I did some testing using only one texture for every object (which should be fairly close to using a texture atlas). Here are the results:
Beshara:
1 texture 400 FPS
"normal" OpenMW 100 FPS

Vivec:
1 texture: 150 FPS 30 batches
"normal" OpenMW: 40 FPS 500 batches
That is true for Morrowind.esm, but may not be true for plugins or TCs. You can't rely on this fact.
I thought that the directory in the BSA file could indicate if objects are static or not. Maybe i'm wrong.
We could also use instancing, which supports dynamic objects, but I have not investigated it yet.

It also seems that there is another option, texture arrays, but I don't know whether Ogre fully supports them (it seems that right now they are limited to OpenGL in Ogre), or whether older graphics cards can use them.
Zini wrote:
I thought that the directory in the BSA file could indicate if objects are static or not.
Nope.
Beshara:
1 texture 400 FPS
"normal" OpenMW 100 FPS

Vivec:
1 texture: 150 FPS 30 batches
"normal" OpenMW: 40 FPS 500 batches
Thanks. That is what I was looking for. btw. could you also post your graphics hardware so we have a frame of reference?

It seems that the performance is not sufficient yet. Instancing is certainly worth investigating. OpenGL-only solutions are not an option though. If you have to, go ahead with your texture atlas, but make sure, that it can handle changes in the scene.
gus wrote:
Thanks. That is what I was looking for. btw. could you also post your graphics hardware so we have a frame of reference?
I have an ATI Mobility 5650. Note that it's a laptop graphics card, so it's probably a lot less powerful than a "true" 5650.
Zini wrote: My knowledge about ATI cards isn't that firm, but from what I see this is a semi-recent card at the lower end of the mid-range category. In other words, if we go no lower than 100-120 fps on it for any place in Morrowind.esm on max settings we should have enough performance for now (the extra fps is to cover the upcoming terrain and water).
gus wrote: Also note that texture/geometry complexity seems to have little effect on performance, so FPS should remain the same even with more complex models.

I investigated instancing, and it only works for "cloned" objects, so it doesn't seem suited to our needs.

I have found a texture atlas implementation in OgreAddon which seems to do everything I want. I will just have to find out which license it uses.
gus wrote: Which texture sizes do you want modders to be able to use? If every texture is larger than 1024*1024, then a texture atlas is pointless: if I remember correctly, the max texture size is 2048*2048, meaning a texture atlas could only contain... 1 texture.

What proportion of textures will be under 1024*1024? (Not in the original Morrowind, of course.)
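The capacity question can be made concrete with a little arithmetic. The sketch below assumes square textures packed on a plain grid with no padding; `atlasCapacity` is a hypothetical helper, and the 2048 limit is only the figure recalled in the post, not a verified hardware limit.

```cpp
#include <cstddef>

// How many square textures of edge `textureSize` fit into a square atlas of
// edge `atlasSize` when packed on a simple grid (no padding, no mipmaps).
std::size_t atlasCapacity(std::size_t atlasSize, std::size_t textureSize)
{
    if (textureSize == 0 || textureSize > atlasSize)
        return 0;
    std::size_t perRow = atlasSize / textureSize;
    return perRow * perRow;
}
```

With a 2048 atlas, textures of exactly 1024 still fit four to an atlas, but anything larger than 1024 drops the capacity to one, which is the case the post is worried about.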
Chris wrote:
gus wrote:Which texture size do you wants modders to be able to use? Because if every texture size is above 1024*1024, then texture atlas is pointless as if my memory is good, the max texture size is 2048*2048, meaning a texture atlas could only contain... 1 texture.
Any texture size should be possible, IMO. There are mods for Morrowind that add high-resolution textures, and I wouldn't be surprised to see some of them reach 1024x1024 or higher.

It doesn't seem as though texture atlasing would be very cache-friendly; i.e., the ability to load the texture onto a card once, then let the driver decide when it's okay to leave on the card and when it can be temporarily stored in system RAM. How would it work for wrapping and mirrored texture modes? Filtering (particularly with mipmaps)? Different compression methods may also be a problem... some textures are stored as raw RGB(A), some are stored as DXT1, DXT3, or DXT5. Decompressing them all to RGBA would be very wasteful, while forcing to compress them as DXT* could hamper quality. Converting between DXT formats would also be bad, considering they're lossy formats.

Only my opinion, of course, but I don't think making everything one big mesh is going to work, given the above issues. And once we start adding features like physics, LOD, and material shaders, it's just going to make it worse. Translucency may also be an issue.
Zini wrote: A texture size of 1024 x 1024 isn't exactly uncommon. It should be handled properly.

I was sceptical about the whole texture atlas thing from the beginning. Maybe we should try other methods first, before we get into this. You mentioned that you are building batches on a per-cell basis. This potentially results in 9 times as many batches as you might get otherwise (neighbouring cells often use the same models, which means they also use the same materials/textures).
Star-Demon wrote: Can it wait for a bit? I'm in the middle of some optimizations.

gus wrote:
Can it wait for a bit? I'm in the middle of some optimizations.
?

Zini/Chris: of course high-res textures will be supported. But the question is rather whether there will be high-res textures for every object. If not, a texture atlas is still interesting for the remaining objects.
Only my opinion, of course, but I don't think making everything one big mesh is going to work, given the above issues. And once we start adding features like physics, LOD, and material shaders, it's just going to make it worse. Translucency may also be an issue.
What kills performance is not the complexity of the geometry/shader, it's only the number of batches. So IMO, objects should have more or less the same material. Or we don't render multiple cells.
It doesn't seem as though texture atlasing would be very cache-friendly; ie, the ability to load the texture onto a card once, then let the driver decide when its okay to leave on the card and when it can be temporarilly stored in system RAM. How would it work for wrapping and mirrored texture modes?
The problem is not memory or even the raw power of the graphics card, it's just the number of calls to the graphics card.
Different compression methods may also be a problem... some textures are stored as raw RGB(A), some are stored as DXT1, DXT3, or DXT5. Decompressing them all to RGBA would be very wasteful, while forcing to compress them as DXT* could hamper quality. Converting between DXT formats would also be bad, considering they're lossy formats.
They are decompressed by Ogre anyway.
Filtering (particularly with mipmaps)
There are problems with mipmap (some bleeding sometimes) but it doesn't seem big.
Zini wrote:
Zini/Chris: of course high res texture will be supported. But the question is more where there be high res texture for every object? If not, texture atlas is still interesting for these objects.
For every object, probably not. Very common, yes. I did a quick survey for Redemption and it seems that the most common sizes are 512x512 and 1024x1024. Considering that we started Redemption over half a decade ago and the hardware was a lot weaker back then, it would not be surprising to see 1024x1024 as the dominant size on a new project started these days.
Chris wrote:
gus wrote:Zini/Chris: of course high res texture will be supported. But the question is more where there be high res texture for every object? If not, texture atlas is still interesting for these objects.
Don't know the actual sizes, but this seems pretty high-res all over the place.
What kills performance is not the complexity of the geometry/shader, it's only the number of batches. So IMO, objects should have more or less the same material. Or we don't render multiple cells.
Pixel shaders can have a huge performance impact. Best case is, it needs to run width*height times per frame. With AA, that number increases exponentially. Plus, some overdraw is unavoidable even in well-optimized cases. You're looking at tens to hundreds of millions of times per second. You'll want to keep materials split up so they don't become complex per-pixel operations.
The problem is not memory or even raw power of the graphic card, it's just the number of call to the graphic card.
Memory can be a problem. I can't even run standard Oblivion at max graphical settings without topping 256MB of VRAM, which then causes my performance to tank as it's constantly swapping to system RAM. Considering Oblivion's texture sizes are considered low-res, that doesn't bode well for more modern mods.
They are decompressed by Ogre anyway.
I'd be surprised. One of the benefits of DXT compression is that it can be loaded as-is onto the card, and the card itself can use it in compressed form (it decompresses on the fly as a triangle is rasterized; because of the way the format works, pushing around that many uncompressed texels puts a bigger strain on the card than real-time decompression). I'd hope Ogre doesn't even keep a copy around for itself, since it doesn't need it for anything.

DXT5 compression works out to about 1 byte per pixel. DXT1 is half that. A lot of VRAM is going to be wasted if you store them uncompressed, or quality is going to be compromised if you (re)compress them to DXT.
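The memory figures above follow from the block layout of the DXT formats: each 4x4 pixel block takes 8 bytes in DXT1 and 16 bytes in DXT3/DXT5. A minimal sketch of that arithmetic for the top mip level only; `TextureFormat` and `textureBytes` are hypothetical names used for illustration, not OGRE API.

```cpp
#include <cstddef>

enum class TextureFormat { RGBA8, DXT1, DXT5 };

// Storage needed for the top mip level of a texture, in bytes.
std::size_t textureBytes(std::size_t width, std::size_t height, TextureFormat format)
{
    // DXT formats store 4x4 pixel blocks; partial blocks are rounded up.
    std::size_t blocks = ((width + 3) / 4) * ((height + 3) / 4);
    switch (format)
    {
        case TextureFormat::DXT1:  return blocks * 8;         // 0.5 byte per pixel
        case TextureFormat::DXT5:  return blocks * 16;        // 1 byte per pixel
        case TextureFormat::RGBA8: return width * height * 4; // 4 bytes per pixel
    }
    return 0;
}
```

For a 1024x1024 texture this gives 512 KiB as DXT1, 1 MiB as DXT5 and 4 MiB as uncompressed RGBA, so storing everything uncompressed costs four to eight times the VRAM.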


I think at this point in time, optimization isn't needed, yet. Especially in a way as drastic as this. IMO, a better idea is to make sure Ogre is being used properly. To make sure it's being used efficiently, before making a change as fundamental as how the world is presented to Ogre.
gus wrote: Ok. So it seems texture atlas is out.

I had a look at Oblivion, and it seems to me that the rendering distance of objects is far less than the one we have in OpenMW, except for trees. So i propose we decrease the rendering distance of objects.

Also, to do better batching, it would be good to have the position and the size of cells.
Zini wrote:
I had a look at Oblivion, and it seems to me that the rendering distance of objects is far less than the one we have in OpenMW, except for trees. So i propose we decrease the rendering distance of objects.
Actually, I think this should be adjusted automatically based on the framerate.
Also, to do better batching, it would be good to have the position and the size of cells.
I don't understand what you mean by this.
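The idea of adjusting the rendering distance automatically based on the framerate could look roughly like the following. This is only a sketch of one possible policy, not something described in the thread: the function name, the 5% step and the 10% dead band around the target are all assumptions.

```cpp
#include <algorithm>

// Nudge the view distance towards a target framerate.
// Shrinks the distance when FPS is clearly below target, grows it when FPS
// is clearly above, and leaves it alone inside a 10% dead band.
float adjustViewDistance(float current, float fps, float targetFps,
                         float minDistance, float maxDistance)
{
    const float step = 0.05f; // assumed change per adjustment
    float next = current;
    if (fps < targetFps * 0.9f)
        next = current * (1.0f - step);
    else if (fps > targetFps * 1.1f)
        next = current * (1.0f + step);
    return std::min(maxDistance, std::max(minDistance, next));
}
```

Calling this once per second rather than once per frame would avoid visible popping from rapid changes; the dead band serves the same purpose.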
Zini wrote: Okay, now I understand what you want (though not for what purpose).

An exterior cell is 8192 by 8192 units large. Cell 0, 0 starts at 0, 0.
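Given those numbers, mapping a world coordinate to a cell index is a division by 8192. A minimal sketch, assuming the grid continues uniformly into negative coordinates and that "starts at 0, 0" refers to the cell's minimum corner; `cellIndex` and `cellOrigin` are hypothetical helper names.

```cpp
#include <cmath>

// Exterior cell edge length in world units, as stated in the post above.
const int CELL_SIZE = 8192;

// Index of the cell containing a world coordinate along one axis.
// std::floor keeps negative coordinates in the right cell
// (e.g. -1 belongs to cell -1, not cell 0).
int cellIndex(float worldCoordinate)
{
    return static_cast<int>(std::floor(worldCoordinate / CELL_SIZE));
}

// World coordinate at which a cell starts along one axis.
float cellOrigin(int index)
{
    return static_cast<float>(index) * CELL_SIZE;
}
```

Applying `cellIndex` to both x and y gives the cell an object belongs to, which is what batching across neighbouring cells would need.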
gus wrote: Let's be realistic, my exams are draining all my energy, so I'm not really working on OpenMW anymore. This is probably going to last until the beginning of June.
gus wrote: I noticed something in exterior cells: each time you change cell, performance decreases a little. And if you keep changing cells, it decreases a lot. So it seems that something is not cleaned up properly. I don't know which part (bullet, ogre, other) is at fault yet.

Also, it is very important to know which objects are static in a scene. I had a look at the BSA, and for example, it seems that things in the /f directory can't be moved (trees, plants, etc.). What about the other directories? (And is it even true?)
That's a very important point, because you can only batch static objects (well, you can batch other objects, but you have to rebuild the static geometry every time they move, which takes about 2 s on my computer), and I'm almost sure that the high number of batches is the reason FPS is so bad in exterior cells.
Zini wrote:
I noticed something in exteriors cells: each time you change cell, performances decrease a little. And if you keep changing cell, it decrease a lot. So it seems that it's not cleaned up properly. I don't know which part (bullet, ogre, other) is faulty yet.
I noticed that too. And it's not only exterior cells.
Also, it is very important to know which objects are static in a scene. I had a look at the BSA, and for example, it seems that things that are in the /f directory can't be moved (trees, plants,etc).
That may be true for Morrowind.esm and the matching files, but certainly not for mods and TCs. You can't use the directory structure to determine static-ness. References of the "static" category seem to be static, but optimising based on that alone won't do you much good either. More advanced mods will use plenty of activators instead of statics to provide more world-interactivity (Redemption does for sure and original MW handles that gracefully).
Maybe we should forget about static-ness and use potential static-ness instead, i.e. develop a way to determine which references most likely won't change (or at least not very often), but are not guaranteed to not change.
gus wrote:
That may be true for Morrowind.esm and the matching files, but certainly not for mods and TCs. You can't use the directory structure to determine static-ness. References of the "static" category seem to be static, but optimising based one that alone won't do you much good either. More advanced mods will use plenty of activators instead of statics to provide more world-interactivity (Redemption does for sure and original MW handles that gracefully).
Maybe we should forgot about static-ness and use potential static-ness instead, i.e. develop a way to determine which references most likely won't change (or at least not very often), but are not guaranteed to not change.
I was thinking of something like that. The problem is that i can't do it, as I have no experience with modding.
But i can do a clean interface, which would be filled by others (it wouldn't require high C++ skills).

That's a post 1.0 feature, but it would be better if there were some flag that modders could use to tell openmw if an object is likely to move or not.


PS: --showfps doesn't work anymore. Is it a bug? (it doesn't really matter, as i display it on the console, but anyway, it looks like a bug).
Zini wrote:
PS: --showfps doesn't work anymore. Is it a bug? (it doesn't really matter, as i display it on the console, but anyway, it looks like a bug).
Renamed it to --fps; for consistency and shortness.
I was thinking of something like that. The problem is that i can't do it, as I have no experience with modding.
But i can do a clean interface, which would be filled by others (it wouldn't require high C++ skills).

That's a post 1.0 feature, but it would be better if there were some flag that modders could use to tell openmw if an object is likely to move or not.
I think it is better to determine this automatically. Need to think about it more, but:

- NPCs and creatures probably move around a lot.
- Statics never move.
- everything else probably does not move, unless it does move. For a start we could assume that objects don't move and when they move we set a flag, that marks them as non-static (if an object moves once, it is likely to move again some time in the future).
- Also, at a more advanced stage, we could analyse scripts attached to objects, if they contain any instructions to move the object.
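The first three rules in this list can be sketched as a small tracker. Everything here is hypothetical (`Category`, `StaticnessTracker`, `shouldBatch` are illustration names, not OpenMW types); it only shows the "assume static until it moves once" policy, and leaves out the script analysis mentioned in the last item.

```cpp
#include <string>
#include <unordered_set>

// Coarse classification of a reference, mirroring the list above.
enum class Category { Static, Actor, Other };

// Remembers which references have been seen moving.
class StaticnessTracker
{
public:
    // Call whenever a reference is moved; it counts as dynamic from then on.
    void notifyMoved(const std::string& handle) { mMoved.insert(handle); }

    // Decide whether a reference may go into the static geometry.
    bool shouldBatch(const std::string& handle, Category category) const
    {
        if (category == Category::Static)
            return true;   // statics never move
        if (category == Category::Actor)
            return false;  // NPCs and creatures move around a lot
        // Everything else is assumed static until it has moved once.
        return mMoved.count(handle) == 0;
    }

private:
    std::unordered_set<std::string> mMoved;
};
```

The flag is never cleared in this sketch, which matches the reasoning that an object that moved once is likely to move again.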
gus wrote:
Statics never move.
How do i know if an object is static?
everything else probably does not move, unless it does move. For a start we could assume that objects don't move and when they move we set a flag, that marks them as non-static (if an object moves once, it is likely to move again some time in the future).
We could do that. But destroying/recreating a static geometry takes some time.

Also, when an object is in a staticGeometry, it isn't attached to a sceneNode anymore. So when an object needs to be moved, the caller should ask the exteriorRender for the object's sceneNode (if the object is inside a staticGeometry, the renderer would remove it from the static geometry and recreate a sceneNode).
Star-Demon wrote: I think you'd have to get the kind of object it is, or else you can find a way to see if it's an instance of a type of object. I'm not sure how fast that is compared to having a flag on every reference in the engine. (process a thousand booleans or check instances of types for 1000 references?)

I'm not sure if it's worth separating MWclass objects by behavior with another class or interface. (Forgive the java expression)
Zini wrote:
How do i know if an object is static?
It would be easy to add a flag parameter to the Renderer function (insertBegin).
Also, when an object is in a staticGeometry, it isn't attached to a sceneNode anymore. So when an object needs to be moved, it should ask the exteriorRender the sceneNode of the object (if the object is inside a staticGeometry, the render would remove it from the static geometry and recreate a sceneNode).
It would be best to not manipulate SceneNodes from outside of the renderer. We should eventually add a function with this kind of signature:

void moveTo (const std::string& handle, const Ogre::Vector3& position);

The renderer can then do any needed housekeeping without bothering the calling functions.
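To illustrate what "housekeeping behind moveTo" might mean, here is a sketch that compiles without OGRE. Only the `moveTo` signature comes from the post; `Vector3` is a local stand-in for `Ogre::Vector3`, and `CellRenderSketch` with its other members is hypothetical bookkeeping, not the actual renderer.

```cpp
#include <map>
#include <set>
#include <string>

// Stand-in for Ogre::Vector3 so the sketch compiles without OGRE.
struct Vector3 { float x, y, z; };

// Hypothetical renderer facade that hides batching from its callers.
class CellRenderSketch
{
public:
    // Register an object; batched objects live in the static geometry.
    void insert(const std::string& handle, const Vector3& position, bool batched)
    {
        mPositions[handle] = position;
        if (batched)
        {
            mBatched.insert(handle);
            mNeedsRebuild = true;
        }
    }

    void moveTo(const std::string& handle, const Vector3& position)
    {
        // A batched object has no scene node of its own: take it out of
        // the batch and schedule a rebuild before applying the move.
        if (mBatched.erase(handle) > 0)
            mNeedsRebuild = true;
        mPositions[handle] = position;
    }

    bool isBatched(const std::string& handle) const { return mBatched.count(handle) > 0; }
    bool needsRebuild() const { return mNeedsRebuild; }
    void markRebuilt() { mNeedsRebuild = false; } // call after rebuilding the batch
    Vector3 position(const std::string& handle) const { return mPositions.at(handle); }

private:
    std::map<std::string, Vector3> mPositions;
    std::set<std::string> mBatched;
    bool mNeedsRebuild = false;
};
```

The caller only ever sees handles and positions; whether an object currently sits in a batch or on its own node stays an internal detail of the renderer.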
gus wrote:
It would be easy to add a flag parameter to the Renderer function (insertBegin).
I'm not sure I understand. Is there some kind of flag in the ESM or the NIF file (or other) which tells whether the object is static? Or would it be a flag generated at runtime?
It would be best to not manipulate SceneNodes from outside of the renderer. We should eventually add a function with this kind of signature:
That's right.
I'm not sure if it's worth separating MWclass objects by behavior with another class or interface. (Forgive the java expression)
I don't know. To be honest, I don't understand the general design of OpenMW well enough to fully understand what you said.
Zini wrote:
I'm not sure I understand. Is there some kind of flag in the ESM or the NIF file(or other) which tell if the object is static? Or would it be a flag generated at runtime?
"Static" is a distinct type of ID.

http://www.uesp.net/text.shtml?morrow/tech/mw_esm.txt

(scroll down to entry 13: STAT)
gus wrote: I can't find by myself where the static flag is stored in OpenMW, so if you could do this
It would be easy to add a flag parameter to the Renderer function (insertBegin).
it would save a lot of time.
Zini wrote: Again: There is no static flag. MW has distinct types of IDs: Weapons, Activators, Misc and such. One of these types is "Static".

But okay, I'll handle it.
Zini wrote: Done.

btw. your latest modifications make OpenMW crash here.
gus wrote: Strange. It doesn't crash on my computer. I will have a look at it.
Zini wrote: Okay. I see the problem. You are using a random number generator for making OGRE handles. Don't do that! If you need a unique name, add a static int variable to the class and increment it each time by one. Add a prefix specific to this class and you have your unique handle.
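The counter-based scheme described above takes only a few lines. A minimal sketch: the class name and the prefix are made up for illustration, and the counter is not thread-safe, which is assumed to be acceptable if handles are only created from the render thread.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper; in practice the counter would live in the class
// that creates the OGRE objects.
class HandleGenerator
{
public:
    // Returns a new handle on every call: prefix plus an increasing number.
    static std::string next()
    {
        std::ostringstream stream;
        stream << "ExteriorStatic_" << sCounter++;
        return stream.str();
    }

private:
    static int sCounter;
};

int HandleGenerator::sCounter = 0;
```

Unlike random numbers, the counter cannot produce the same name twice within a run, and the class-specific prefix keeps the names from colliding with handles generated elsewhere.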
gus wrote: Oops, I forgot that :oops: I planned to remove it, but I completely forgot about it. (But you are unlucky, it didn't crash even once on my computer^^)
gus wrote: It provided a decent performance boost: in Vivec I had 15-25 FPS, I've now 35-45 FPS. Is it enough?
Zini wrote: It's a start, and we may stop here and continue with rendering optimisations at a later point, when we have the terrain and stuff like object movement.

But it isn't working properly. I got a seg fault. I guess at switching cells.
gus wrote: It doesn't surprise me. In fact, it's surprising that I don't get any segfault. I was searching for the cause of the slowdown when changing cells, and the cell renderer didn't clean things up well: it only destroyed the sceneNodes, but not the movableObjects. So I wrote a function to destroy movableObjects as well, but I forgot that Lights are also attached to sceneNodes.

Edit: never mind, Ogre::Light is also a MovableObject. So the problem is not there.
Zini wrote: The situation is a bit more complex. That function is part of an older design by Nico, that is not very well integrated into our current architecture. When this function is called, there shouldn't be any scene nodes or movable objects left. Currently the job of this function is to remove the cell-base-node; nothing more.
Zini wrote: Or not. Apparently the MWScene::removeObject function only does the physics part. *sigh* The whole MWRender-namespace needs a cleanup. It still mixes parts of Nico's original concept and of my newer design.
Zini wrote: Here is the stack backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff79926f9 in std::_Rb_tree<std::string, std::pair<std::string const, Ogre::StaticGeometry*>, std::_Select1st<std::pair<std::string const, Ogre::StaticGeometry*> >, std::less<std::string>, Ogre::STLAllocator<std::pair<std::string const, Ogre::StaticGeometry*>, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::_M_lower_bound (this=0x7ffff7ed7008, name=...) at /usr/include/c++/4.4/bits/stl_tree.h:985
985         while (__x != 0)
(gdb) bt
#0  0x00007ffff79926f9 in std::_Rb_tree<std::string, std::pair<std::string const, Ogre::StaticGeometry*>, std::_Select1st<std::pair<std::string const, Ogre::StaticGeometry*> >, std::less<std::string>, Ogre::STLAllocator<std::pair<std::string const, Ogre::StaticGeometry*>, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::_M_lower_bound (this=0x7ffff7ed7008, name=...) at /usr/include/c++/4.4/bits/stl_tree.h:985
#1  std::_Rb_tree<std::string, std::pair<std::string const, Ogre::StaticGeometry*>, std::_Select1st<std::pair<std::string const, Ogre::StaticGeometry*> >, std::less<std::string>, Ogre::STLAllocator<std::pair<std::string const, Ogre::StaticGeometry*>, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::find (this=0x7ffff7ed7008, 
    name=...) at /usr/include/c++/4.4/bits/stl_tree.h:1421
#2  std::map<std::string, Ogre::StaticGeometry*, std::less<std::string>, Ogre::STLAllocator<std::pair<std::string const, Ogre::StaticGeometry*>, Ogre::CategorisedAllocPolicy<(Ogre::MemoryCategory)0> > >::find (this=0x7ffff7ed7008, name=...)
    at /usr/include/c++/4.4/bits/stl_map.h:659
#3  Ogre::SceneManager::destroyStaticGeometry (this=0x7ffff7ed7008, name=...)
    at /home/marc/Tools/ogre-1-7-1/OgreMain/src/OgreSceneManager.cpp:6297
#4  0x000000000083395d in MWRender::ExteriorCellRender::destroy() ()
#5  0x0000000000833d32 in MWRender::ExteriorCellRender::~ExteriorCellRender() ()
#6  0x00000000008ced97 in MWWorld::World::unloadCell(std::_Rb_tree_iterator<std::pair<ESMS::CellStore<MWWorld::RefData>* const, MWRender::CellRender*> >) ()
#7  0x00000000008cf1be in MWWorld::World::changeCell(int, int, ESM::Position const&, bool) ()
#8  0x00000000008d2f3f in MWWorld::World::moveObject(MWWorld::Ptr, float, float, float) ()
#9  0x0000000000825320 in MWRender::MWScene::doPhysics(float, MWWorld::World&, std::vector<std::pair<std::string, Ogre::Vector3>, std::allocator<std::pair<std::string, Ogre::Vector3> > > const&) ()
#10 0x00000000008d3167 in MWWorld::World::doPhysics(std::vector<std::pair<std::string, Ogre::Vector3>, std::allocator<std::pair<std::string, Ogre::Vector3> > > const&, float) ()
#11 0x000000000081a34e in OMW::Engine::frameRenderingQueued(Ogre::FrameEvent const&)
    ()
#12 0x00007ffff797c265 in Ogre::Root::_fireFrameRenderingQueued (
    this=<value optimised out>, evt=<value optimised out>)
    at /home/marc/Tools/ogre-1-7-1/OgreMain/src/OgreRoot.cpp:829
#13 0x00007ffff797d1f9 in Ogre::Root::_fireFrameRenderingQueued (
    this=0x7ffff7ec08d8)
    at /home/marc/Tools/ogre-1-7-1/OgreMain/src/OgreRoot.cpp:884
#14 0x00007ffff797d232 in Ogre::Root::_updateAllRenderTargets (this=0x7ffff7ec08d8)
    at /home/marc/Tools/ogre-1-7-1/OgreMain/src/OgreRoot.cpp:1376
#15 0x00007ffff797d345 in Ogre::Root::renderOneFrame (this=0x7ffff7ec08d8)
    at /home/marc/Tools/ogre-1-7-1/OgreMain/src/OgreRoot.cpp:966
#16 0x00007ffff797d39d in Ogre::Root::startRendering (this=0x7ffff7ec08d8)
    at /home/marc/Tools/ogre-1-7-1/OgreMain/src/OgreRoot.cpp:956
#17 0x00000000007e9dd3 in OEngine::Render::OgreRenderer::start() ()
#18 0x000000000081cafb in OMW::Engine::go() ()
#19 0x0000000000810675 in main ()


gus wrote: Every time I try to post, you post something new ^^
I will have a look at the stack trace.

Well, the function ExteriorCellRender::deleteObject doesn't destroy movable objects either. I will fix this once the debug build has finished (this can take some time).

Also, functions like deleteObject won't work with static objects (they won't crash, but they will do nothing). Is that a problem?
gus wrote: Strange. It seems that Ogre crashes when it tries to delete the staticGeometry.

PS: is there a way to remove the music? Or at least lower the volume?
Zini wrote: Somewhat. Right now it should work. But later we may also add potentially static objects to the static geometry. And these can be deleted. Also, for the sake of consistency we probably should allow deleting fully static objects anyway (with a performance penalty).
Zini wrote:
gus wrote: PS: is there a way to remove the music? Or at least lower the volume?
Use --nosound

And have a look at ./openmw --help
Zini wrote: Found the source of the crash. The problem was twofold:

1. The new cleanup-code was somewhat flaky. When releasing a dynamically allocated resource outside of the destructor, you should always assign 0 to the pointer pointing to this resource.

@gus: Please have a look at my new code. What you did to acquire the scene manager was also problematic. It currently works, but if there is ever the need to use a second scene manager, your code would have made OpenMW blow up.

2. There was a redundant call to the destroy function (more artefacts from the old design) in the unloadCell function, which in turn made the less-than-robust cleanup-code crash.
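
The two points above can be illustrated with a minimal sketch. This is not the actual OpenMW or OGRE code: Geometry, CellRenderSketch and unloadCellSketch are hypothetical stand-ins, and the only thing the sketch shows is why zeroing the pointer after release makes a redundant destroy call harmless.

```cpp
// Hypothetical stand-in for a scene-owned resource such as a static geometry.
struct Geometry {
    static int liveCount;   // number of live instances, for demonstration only
    Geometry()  { ++liveCount; }
    ~Geometry() { --liveCount; }
};
int Geometry::liveCount = 0;

// Hypothetical cell renderer; an assumption for illustration, not the real class.
class CellRenderSketch {
public:
    CellRenderSketch() : mGeometry(new Geometry) {}
    ~CellRenderSketch() { destroy(); }

    // Releases the resource outside of the destructor.
    // Safe to call more than once because the pointer is reset to 0.
    void destroy() {
        if (mGeometry == 0)
            return;          // already released, nothing to do
        delete mGeometry;
        mGeometry = 0;       // without this line a second call would double-delete
    }

private:
    // Non-copyable, so two objects never own the same pointer.
    CellRenderSketch(const CellRenderSketch&);
    CellRenderSketch& operator=(const CellRenderSketch&);

    Geometry* mGeometry;
};

// Mimics the redundant call: an explicit destroy() followed by the destructor.
// Returns the number of Geometry instances still alive afterwards.
int unloadCellSketch() {
    {
        CellRenderSketch cell;
        cell.destroy();      // explicit (redundant) call
    }                        // destructor calls destroy() again
    return Geometry::liveCount;
}
```

With the pointer reset in place, the second destroy() call returns early instead of touching a dangling pointer. Removing the redundant call is still the cleaner fix; the reset only makes the cleanup code robust against it.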

I would like to ask developers (and users capable of building OpenMW) to give the new code a test run (use my optimisation branch). I think it should work, but we made some pretty invasive changes and some more testing would be good. Also, I would like to know what kind of performance improvements people get compared to 0.10.0.

Assuming the results are satisfactory I will call the first round of performance optimisation closed.
We will have to put more work into it, but most of the remaining optimisation options are pretty complex and are better handled at a later time when OpenMW is more complete and we have finished some of the outstanding code maintenance tasks.
Star-Demon wrote: I'll update things and give it a shot - if anything interesting comes up I'll post.
gus wrote:
1. The new cleanup-code was somewhat flaky. When releasing a dynamically allocated resource outside of the destructor, you should always assign 0 to the pointer pointing to this resource.

@gus: Please have a look at my new code. What you did to acquire the scene manager was also problematic. It currently works, but if there is ever the need to use a second scene manager, your code would have made OpenMW blow up.
I will have a look.
2. There was a redundant call to the destroy function (more artefacts from the old design) in the unloadCell function, which in turn made the less-than-robust cleanup-code crash.
Looking forward to the redesign of the renderer ^^
Assuming the results are satisfactory I will call the first round of performance optimisation closed.
We will have to put more work into it, but most of the remaining optimisation options are pretty complex and are better handled at a later time when OpenMW is more complete and we have finished some of the outstanding code maintenance tasks.
What do you want me to do next? (As I will have to travel for my exams, I will probably disappear for 2 or 3 weeks, so don't give me a blocking task.)
Zini wrote: Well, there is still this issue. http://bugs.openmw.org/issues/131

Would be very helpful to get it out of the way. Afterwards, pick what you want. The GUI-tasks are most important, but if you have no experience with MyGUI, something else might be a better choice.

Edit: btw. I am still waiting for feedback regarding the effectiveness of our optimisations (this goes to all developers).
Zini wrote: Less feedback than I hoped (exactly zero). But since no one has yelled at me yet, I assume the branch is good. I will merge it now.
pvdk wrote: Will try it this evening, was busy with the launcher.

Update:
Ok, I don't know if it's because of the optimisation or my configuration but exteriors are reaaaally slow right now. With nvidia drivers, desktop effects disabled I get around 0/5 fps in windowed mode, 640x480. I'm too afraid to try full-screen :P
best regards,
Lukasz
