Compiler flags on i686

thegriglat
Posts: 25
Joined: 12 Jan 2015, 16:22

Post by thegriglat »

Hi developers/gamers!

I have been playing OpenMW on Linux (CentOS 6.4 i686) since openmw-0.28 and have tried many combinations of compiler flags to speed it up on my slow notebook.

Which compiler flags do you use to build OpenMW? Which flags matter most, e.g. for loop, stack, or math optimization?

At the moment I use the following flags:

-Ofast -ffast-math -mtune=native -march=native -pipe -m3dnow -mmmx -msse -msse2 -msse3 -mssse3 -msse4a -mfpmath=sse,387 -ftree-vectorize -ftree-loop-distribution
Ogre 1.9.0 is built with the same flags.

I usually get 6-9 fps in Morrowind cities and other big locations, and 14-28 fps in small locations (rooms, caves). :(
P.S.: OpenMW seems to use only one CPU core on my system. Is that normal, or am I missing threading support?

I have two cores (one dual-core CPU); each one is reported like this:

vendor_id	: AuthenticAMD
cpu family	: 20
model		: 1
model name	: AMD E-350 Processor
stepping	: 0
cpu MHz		: 1595.961
cache size	: 512 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 6
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc nonstop_tsc extd_apicid aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt npt lbrv svm_lock nrip_save pausefilter
bogomips	: 3191.92
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
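
Before passing explicit -m flags, it may be worth checking them against the flags line above. The sketch below assumes a POSIX shell; has_cpu_flag and sample are illustrative names, not part of any tool. Note that Linux reports SSE3 as pni, and that the list above contains 3dnowprefetch but not plain 3dnow, so -m3dnow looks questionable on this CPU.

```shell
# Return 0 if FLAG appears as a whole word in a cpuinfo "flags" line.
# Usage: has_cpu_flag FLAG [FLAGS_LINE]
# FLAGS_LINE defaults to the first "flags" line of /proc/cpuinfo.
has_cpu_flag() {
  flag=$1
  line=${2:-$(grep -m1 '^flags' /proc/cpuinfo)}
  # Strip the "flags :" label, then pad with spaces so only whole words match.
  case " ${line#*:} " in
    *" $flag "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Selected entries from the flags line posted above (not the full list).
sample="flags : mmx fxsr sse sse2 pni ssse3 sse4a 3dnowprefetch"

for f in mmx sse sse2 pni ssse3 sse4a 3dnow; do
  if has_cpu_flag "$f" "$sample"; then
    echo "$f: yes"
  else
    echo "$f: no"
  fi
done
```

Calling has_cpu_flag with only the flag name checks the running machine instead of the sample line.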
Video card:

00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wrestler [Radeon HD 6310] (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 397b
	Flags: bus master, fast devsel, latency 0, IRQ 24
	Memory at c0000000 (32-bit, prefetchable) [size=256M]
	I/O ports at 3000 [size=256]
	Memory at d0200000 (32-bit, non-prefetchable) [size=256K]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Kernel driver in use: radeon
	Kernel modules: radeon
Ascent
Posts: 39
Joined: 28 Jun 2014, 04:32

Re: Compiler flags on i686

Post by Ascent »

AMD E-350 Processor
You'd be better off waiting until the OSG port is done, I think. Improved 3D performance is one of the benefits that's supposed to offer. With the OGRE version I don't get reasonable performance out of the two quad core AMD APU systems I've built OpenMW for (A6-3400M w/ Radeon 6520G and A10-7300 w/ Radeon R6, FOSS Radeon driver).

If you want to try different flag combos anyway, here are the g++ 4.9 flags I use for OpenMW only. I've not built OGRE myself:

-march=native -O3 -fgcse-after-reload -fgcse-las -fgcse-sm -fivopts -ftracer -fopenmp -floop-parallelize-all -ftree-parallelize-loops=4 -ffast-math
Suggestions:
Use -fopenmp -floop-parallelize-all -ftree-parallelize-loops=2 for dual core
Drop -mfpmath=sse,387 and compare performance (that flag can actually make it worse)

Edit: You might need -lgomp at link time for OpenMP. Use this before running cmake:
export LDFLAGS="-Wl,-O1 -Wl,--sort-common -lgomp"
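
To avoid hard-coding the core count, the flags above could be assembled roughly like this. It is only a sketch, assuming a POSIX shell and that cmake picks up CXXFLAGS and LDFLAGS from the environment on its first configure run; the variable name cores is illustrative.

```shell
# Detect the number of online cores; fall back to getconf if nproc is missing.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Same flags as listed above, with the loop-parallelization width set per machine.
CXXFLAGS="-march=native -O3 -fgcse-after-reload -fgcse-las -fgcse-sm -fivopts -ftracer -fopenmp -floop-parallelize-all -ftree-parallelize-loops=${cores} -ffast-math"

# Link against libgomp, which the OpenMP-related flags need at link time.
LDFLAGS="-Wl,-O1 -Wl,--sort-common -lgomp"

export CXXFLAGS LDFLAGS
echo "CXXFLAGS=${CXXFLAGS}"
echo "LDFLAGS=${LDFLAGS}"
```

Run it in the same shell session as cmake, in a fresh build directory, so that the exported values are actually used.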
thegriglat
Posts: 25
Joined: 12 Jan 2015, 16:22

Re: Compiler flags on i686

Post by thegriglat »

Hi Ascent,

Thanks!
The speedup is not as large as I had hoped, but I'm glad. :)
At minimum I now get 20-28 fps in rooms and 8-9 fps in open-air locations (OpenMW now uses both CPU cores).