Raspberry Pi 4 Performance

Mishtal
Posts: 26
Joined: 13 Dec 2016, 06:45

Raspberry Pi 4 Performance

Post by Mishtal » 18 Sep 2020, 06:32

Hi folks,

I'm trying to get OpenMW 0.46 working on a Raspberry Pi 4.

I'm using this OS image: https://github.com/sakaki-/gentoo-on-rpi-64bit

I've modified /boot/config.txt to have
gpu_mem=256
and have kept the default
dtoverlay=vc4-fkms-v3d
I've confirmed that I'm using the v3d Mesa driver. I've also confirmed that glxgears can push >200 FPS at full screen, or a solid 60 at full screen when vsync is enabled.

However, with a fully vanilla Morrowind, I'm getting between 3 and 5 FPS when running at 720p.

For the sake of comparison, I tried
dtoverlay=vc4-kms-v3d-pi4
but that overlay appears unstable: the screen goes fully black for a few seconds any time an OpenMW menu is opened (e.g. the name dialog on the Imperial ship), and there is massive stuttering when running at 1080p.

I've tried the following package combinations:
  • games-engines/openmw-0.46.0 qt5 osg-fork
    dev-games/openscenegraph-openmw-3.4
  • games-engines/openmw-0.46.0 qt5 -osg-fork
    dev-games/openscenegraph-3.6 egl
  • games-engines/openmw-0.46.0 qt5 -osg-fork
    dev-games/openscenegraph-3.6 -egl
without any obvious change in performance.

I'll be recompiling openscenegraph-3.6 with different combinations of xrandr, sdl2, gstreamer, asio as I'm able to (Slow to compile on the pi4, obviously).

I've seen people referencing the environment variables OPENMW_PHYSICS_FPS and OPENMW_DECOMPRESS_TEXTURES from https://wiki.openmw.org/index.php?title ... _Variables , but neither of them seems to have any effect on performance. Notably, with 0.46.0, the OPENMW_DECOMPRESS_TEXTURES variable doesn't appear to be needed for proper rendering. So either I've not yet encountered the textures that need it, or the underlying problem has been fixed in the newer version.
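
For anyone who wants to reproduce the test, here is a minimal Python sketch of launching the game with both variables set. The variable names come from the post above. The value "1" for OPENMW_DECOMPRESS_TEXTURES, the binary name "openmw" being on PATH, and the helper names build_env and launch are assumptions made for illustration.

```python
import os
import subprocess


def build_env(physics_fps=10, decompress_textures=True, base_env=None):
    """Return a copy of the environment with the two OpenMW variables set.

    The value "1" for OPENMW_DECOMPRESS_TEXTURES is an assumption; check the
    wiki page linked in the post for the value your version expects.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OPENMW_PHYSICS_FPS"] = str(physics_fps)
    if decompress_textures:
        env["OPENMW_DECOMPRESS_TEXTURES"] = "1"
    else:
        env.pop("OPENMW_DECOMPRESS_TEXTURES", None)
    return env


def launch(binary="openmw", **kwargs):
    """Start OpenMW with the modified environment and wait for it to exit.

    Assumes an executable called "openmw" is on PATH (hypothetical setup).
    """
    return subprocess.run([binary], env=build_env(**kwargs), check=False)
```

Calling launch(physics_fps=10) and then launch(physics_fps=60) with everything else unchanged keeps the comparison to a single variable per run.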

Am I missing anything obvious that would help with frame-rate?

pipbug
Posts: 1
Joined: 23 Sep 2020, 22:00

Re: Raspberry Pi 4 Performance

Post by pipbug » 23 Sep 2020, 22:03

Have you run the genup command?

Mishtal
Posts: 26
Joined: 13 Dec 2016, 06:45

Re: Raspberry Pi 4 Performance

Post by Mishtal » 24 Sep 2020, 07:50

pipbug wrote:
23 Sep 2020, 22:03
Have you run the genup command?
Yes, certainly. The genup command just installs system updates. It runs on a weekly timer by default. I've been using this image for months, and have many gentoo systems.

Right now I'm just trying to figure out how to get anywhere close to the FPS that others have reported on older models of raspberry pis.

I saw @psi29a claiming that they had about 20fps in outdoor areas on a Raspberry Pi 2, which has a somewhat weaker GPU: viewtopic.php?f=8&t=6837&p=67134&hilit=raspberry#p67262 . In another post, they mentioned 15fps at 1080p: viewtopic.php?f=20&p=35187#p35080

@salvalie reports up to 55fps on an rpi3 viewtopic.php?f=47&t=5742&p=61574&hilit ... rry#p61535

@mechanizeddeath claimed to get as high as 30FPS in some interior cells on an RPi 2 viewtopic.php?f=8&t=2850&hilit=raspberr ... =60#p37535

I assumed that since the rpi4 has a somewhat more powerful GPU than the previous generations, I'd be able to get above 10 FPS.

Mishtal
Posts: 26
Joined: 13 Dec 2016, 06:45

Re: Raspberry Pi 4 Performance

Post by Mishtal » 27 Sep 2020, 01:17

Well, I've got some minor progress.

I've recompiled the openmw, openscenegraph, sdl2, and bullet packages with these CFLAGS:

CFLAGS="-march=armv8-a+crc+simd -mtune=cortex-a72 -ftree-vectorize -O2 -pipe -fomit-frame-pointer"
CXXFLAGS="$CFLAGS"

and then launched OpenMW with the following settings:
  • OPENMW_PHYSICS_FPS=10
  • Turned off Shadows entirely
  • Changed the resolution to 640x480
Now I get about 10 FPS walking around Seyda Neen.
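
The resolution and shadow changes above could be written down as a settings.cfg fragment, roughly as sketched here. The section and key names ([Video] "resolution x" / "resolution y", [Shadows] "enable shadows") are my recollection of OpenMW's settings file and should be treated as assumptions to check against the settings-default.cfg shipped with your build. The function name is hypothetical.

```python
import configparser
import io


def render_low_end_overrides(width=640, height=480, shadows=False):
    """Render a settings.cfg-style fragment with the low-end overrides.

    Key names are assumed, not confirmed by this thread; verify them
    against your installed settings-default.cfg before using them.
    """
    cfg = configparser.ConfigParser()
    cfg["Video"] = {
        "resolution x": str(width),
        "resolution y": str(height),
    }
    cfg["Shadows"] = {"enable shadows": "true" if shadows else "false"}
    buf = io.StringIO()
    cfg.write(buf)  # writes "key = value" lines under each [section]
    return buf.getvalue()


print(render_low_end_overrides())
```

The sketch only prints the fragment rather than rewriting an existing settings.cfg, since configparser would drop any comments already in that file.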

Not really good enough to play the game, since things are still pretty jerky and there are a lot of lag spikes depending on what I'm looking at, but it's progress.

I notice that if I don't set the physics FPS to 10, then even when staring at the sky on the deck of the Imperial ship, the F3 profiler overlay shows physics at over 100 and my FPS drops to 3-4, even with the changes listed above.
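
One possible reading of this, offered as an assumption rather than something the thread confirms: if physics is stepped at a fixed rate independent of rendering, a slow frame has to cover several physics steps, and the cost compounds. The arithmetic below only illustrates that relationship. It does not claim to describe how OpenMW actually schedules physics or what the "phys" figure in the overlay measures.

```python
def physics_steps_per_frame(render_fps, physics_fps):
    """Average number of fixed-rate physics steps each rendered frame covers.

    Assumes a fixed physics rate decoupled from the render rate
    (an illustrative model, not a confirmed description of OpenMW).
    """
    return physics_fps / render_fps


for physics_fps in (60, 10):
    for render_fps in (4, 10, 30):
        steps = physics_steps_per_frame(render_fps, physics_fps)
        print(f"physics {physics_fps:>2} Hz, render {render_fps:>2} FPS: "
              f"{steps:.2f} physics steps per frame")
```

Under this model, 60 Hz physics at 4 rendered FPS means 15 physics steps per frame, while capping physics at 10 Hz cuts that to 2.5, which would be consistent with the cap helping most when the frame rate is already low.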

Can anyone recommend rpi4-specific CFLAGS that would help with performance, or mods (e.g. https://modding-openmw.com/mods/morrowi ... ion-patch/) that are known to make the game friendlier to the OpenMW physics system, or to rendering on low-power GPUs?

With the following mods installed, with the physics FPS limited to 10, I can get up to 20fps outside in Seyda Neen, but the FPS drops *drastically*, like down to 3-5fps every now and then while walking around.
If I unlimit the physics FPS with those mods installed, instead of the "phys" count in the F3 profiler overlay staying around 10, it goes up to 60-70, and the actual FPS goes down to 6-7 walking around Seyda Neen. This is still better than without those mods installed, but it's totally unplayable.

Re-enabling shadows still gives me 3-4 FPS (With the unlocked physics FPS).

And of course, all of these measurements are still at the resolution of 640x480

Mishtal
Posts: 26
Joined: 13 Dec 2016, 06:45

Re: Raspberry Pi 4 Performance

Post by Mishtal » 28 Sep 2020, 00:11

I added
-fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -ffat-lto-objects -fuse-linker-plugin
to the CFLAGS and observed no obvious effect on the frame rate either way.

Mishtal
Posts: 26
Joined: 13 Dec 2016, 06:45

Re: Raspberry Pi 4 Performance

Post by Mishtal » 28 Sep 2020, 21:49

Ok, well, I'm not able to move forward on this, as I've exhausted avenues of investigation that don't require becoming an OpenMW developer.

I would appreciate some help, if anyone's willing to chime in.

psi29a
Posts: 4932
Joined: 29 Sep 2011, 10:13
Location: Belgium
Gitlab profile: https://gitlab.com/psi29a/

Re: Raspberry Pi 4 Performance

Post by psi29a » 29 Sep 2020, 06:10

Keep in mind, when I was playing on my RPi2 we didn't have distant terrain, object paging or shadows. It was plain old defaults (no reflection, refraction, or water shader) with nothing maxed except for full screen 1080p and no mods. It was also compiled as 32-bit.

Note: this was the time when s3tc was an issue, so I had extracted all the images and uncompressed them so that I wouldn't get any pink textures. Now that we automatically (via OSG) decompress s3tc on the fly, this could be an additional hit on perf on a CPU/GPU limited setup.

Also to keep in mind is that the VC4 driver (as it was known then, now v3d?) was purely for the VC4... so in theory, you should get even more perf out of a RPi3.

RPi4 changes things up with VC5 chipset with a driver that isn't as mature as with the VC4 and is 64-bit (I think?). So some variables have changed.

Mishtal
Posts: 26
Joined: 13 Dec 2016, 06:45

Re: Raspberry Pi 4 Performance

Post by Mishtal » 29 Sep 2020, 19:51

@psi29a thanks so much for the response!
psi29a wrote:
29 Sep 2020, 06:10
Keep in mind, when I was playing on my RPi2 we didn't have distant terrain, object paging or shadows. It was plain old defaults (no reflection, refraction, or water shader) with nothing maxed except for full screen 1080p and no mods. It was also compiled as 32-bit.
Good point.

I did notice a substantial difference in framerate when I disabled shadows. I'll see what happens if I minimize water shader settings, and reduce distant terrain.

I'm not sure that the 32 bit compilation would have a positive impact on the runtime. 64bit does have larger pointer sizes, but I would expect the additional registers to outweigh that.
psi29a wrote:
29 Sep 2020, 06:10
Note: this was the time when s3tc was an issue, so I had extracted all the images and uncompressed them so that I wouldn't get any pink textures. Now that we automatically (via OSG) decompress s3tc on the fly, this could be an additional hit on perf on a CPU/GPU limited setup.
Another good point. Is there an option to force OSG to pre-decompress? Or would I be stuck doing it by hand?

Any profiling hints on this?

psi29a wrote:
29 Sep 2020, 06:10
Also to keep in mind is that the VC4 driver (as it was known then, now v3d?) was purely for the VC4... so in theory, you should get even more perf out of a RPi3.

RPi4 changes things up with VC5 chipset with a driver that isn't as mature as with the VC4 and is 64-bit (I think?). So some variables have changed.
The VC4 driver is still called vc4. The v3d driver is for the Rpi4's VideoCore (VC6, I thought?).

I could see how the v3d driver might be less optimized than the vc4 driver. That's worth following up on.

----------------------------------------------------------------------

Is there anyone in the OpenMW project who would be interested in having an Rpi4 donated to help with development? I'm well aware that there can't be any expectation that the person the Rpi4 is donated to do any specific quantity of work, or guarantee any outcome. That being said, I would be willing to send someone a new-in-box Rpi4 if they would be interested in using it as a development / test platform for OpenMW going forward.

AnyOldName3
Posts: 1977
Joined: 26 Nov 2015, 03:25

Re: Raspberry Pi 4 Performance

Post by AnyOldName3 » 30 Sep 2020, 02:03

OSG decompresses the textures as soon as the game says it needs them. The problem is that it doesn't necessarily say it needs them until the moment it genuinely needs them immediately. There's not any good way to predict that, so the only other option is to extract the BSA and use a tool to decompress all the textures ahead of time.
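
Before converting anything, it may help to know how many extracted textures are actually S3TC-compressed. The sketch below reads the DDS header of each file and checks the pixel-format FourCC for DXT1, DXT3 or DXT5. It assumes the BSA has already been extracted to a directory; the function names and any directory you pass in are hypothetical, and it does not perform the decompression itself.

```python
import struct
from pathlib import Path

DDS_MAGIC = b"DDS "
DDPF_FOURCC = 0x4  # pixel-format flag: the FourCC field is valid
S3TC_FOURCCS = {b"DXT1", b"DXT3", b"DXT5"}


def s3tc_fourcc(header: bytes):
    """Return "DXT1"/"DXT3"/"DXT5" if the header is an S3TC DDS, else None.

    Layout: 4-byte magic, then the 124-byte DDS header whose pixel-format
    block puts its flags at file offset 80 and its FourCC at offset 84.
    """
    if len(header) < 88 or header[:4] != DDS_MAGIC:
        return None
    pf_flags, fourcc = struct.unpack_from("<I4s", header, 80)
    if pf_flags & DDPF_FOURCC and fourcc in S3TC_FOURCCS:
        return fourcc.decode("ascii")
    return None


def count_s3tc(root):
    """Count .dds files under root by compression type ("other" = not S3TC)."""
    counts = {}
    for path in Path(root).rglob("*"):
        if path.suffix.lower() != ".dds" or not path.is_file():
            continue
        with path.open("rb") as fh:
            kind = s3tc_fourcc(fh.read(128))
        key = kind or "other"
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Running count_s3tc on the extracted textures directory gives a rough idea of how much work the on-the-fly decompression has to do.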
AnyOldName3, Master of Shadows

Mishtal
Posts: 26
Joined: 13 Dec 2016, 06:45

Re: Raspberry Pi 4 Performance

Post by Mishtal » 30 Sep 2020, 17:16

AnyOldName3 wrote:
30 Sep 2020, 02:03
OSG decompresses the textures as soon as the game says it needs them. The problem is that it doesn't necessarily say it needs them until the moment it genuinely needs them immediately. There's not any good way to predict that, so the only other option is to extract the BSA and use a tool to decompress all the textures ahead of time.
Is there any practical way that I can measure this decompression latency? That would help me understand if it's a major cause of the low FPS, or if it's just a minor issue.
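
One indirect approach, sketched here as an assumption rather than an established OpenMW workflow: record per-frame times over the same walk twice, once with the original textures and once with pre-decompressed ones, then compare how many spikes each run has. How the frame times get captured is left open. The input below is just a hypothetical list of milliseconds per frame, and the function name and spike threshold are made up for illustration.

```python
import statistics


def summarize_frame_times(frame_ms, spike_factor=3.0):
    """Summarize a list of frame times in milliseconds.

    A frame counts as a spike when it takes longer than
    spike_factor times the median frame time (arbitrary threshold).
    """
    if not frame_ms:
        raise ValueError("no frame times given")
    median = statistics.median(frame_ms)
    spikes = [t for t in frame_ms if t > spike_factor * median]
    return {
        "avg_fps": 1000.0 * len(frame_ms) / sum(frame_ms),
        "median_ms": median,
        "worst_ms": max(frame_ms),
        "spike_count": len(spikes),
    }


# Hypothetical sample: four 100 ms frames and one 600 ms stall.
print(summarize_frame_times([100, 100, 100, 100, 600]))
```

If the spike count drops sharply with pre-decompressed textures while the median stays about the same, that would point to decompression as a cause of the stutter rather than of the low baseline FPS.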
