File Format

Chris
Posts: 1626
Joined: 04 Sep 2011, 08:33

Re: File Format

Post by Chris »

Zini wrote: That is true. We will have a plugin system for that in the editor, but obviously we cannot rule out that at some point people will want additional tools or even a fork. But even then, they would probably still be re-using our code base, which already contains all the load/save code.
I don't think a plugin system will be good enough for all issues (you really can't put the Qt parts in a plugin, for someone else to create a GTK or wxWidgets one, for instance).

I also don't like the idea of other projects needing to duplicate our code for their 3rd party tools, because that means if we update our code (or if they spot a bug and fix their code without telling us), the two will be out of sync. I suppose this could be solved by having an external/shared lib that acts as an abstraction layer, so loading records, modifying them, saving them, etc, is all handled by the one lib with a rock-solid API.
Yes, that would be the sensible thing to do. Unfortunately it would also break compatibility with MW and stop many mods from working.
How so? If a mod comes with a BSA, it's generally assumed it will be "registered" and used by the game. Is there a case where auto-loading a BSA from a matching ESM/ESP name will break something? Sure, the name matching wouldn't handle every case (for that we could use the ini like vanilla), but going forward with new mods, they could match the name to automatically bypass the need for ini editing.
This statement I disagree with. The ESX format is good enough for our purpose. It may require some tweaking, but there is no need to replace it.
Yeah, like I said there's still plenty we can do with the current ESM/P format by simply adding new records, or somehow "tagging" record data to specify if it's in a new format or not. I've been talking about you wanting to make a new format, which I think should be clearer and easier to handle instead of replicating the ESM structure.
That is not an option. Supporting the original VM is totally pointless (it sucks and doing so would result in a large amount of work, that will give an inferior engine compared to what we have already).
I don't think we need to support the original VM, but it may be a good idea to make a converter that takes in the original's bytecode and creates something our VM uses (kind of like what wined3d does for the d3d shader bytecode, converting it to glsl). We may need to anyway, if there's cases where the bytecode is out of sync with the text version.
Zini
Posts: 5538
Joined: 06 Aug 2011, 15:16

Re: File Format

Post by Zini »

I don't think a plugin system will be good enough for all issues (you really can't put the Qt parts in a plugin, for someone else to create a GTK or wxWidgets one, for instance).
As I wrote it is possible, that there are situations where the plugin system will not be sufficient. Obviously a plugin that enhanced the UI would have to use Qt, but I do not see why that would be a problem.
I also don't like the idea of other projects needing to duplicate our code for their 3rd party tools, because that means if we update our code (or if they spot a bug and fix their code without telling us), the two will be out of sync. I suppose this could be solved by having an external/shared lib that acts as an abstraction layer, so loading records, modifying them, saving them, etc, is all handled by the one lib with a rock-solid API.
Components subsystems? We are already at that stage, at least for individual records.
How so? If a mod comes with a BSA, it's generally assumed it will be "registered" and used by the game. Is there a case where auto-loading a BSA from a matching ESM/ESP name will break something? Sure, the name matching wouldn't handle every case (for that we could use the ini like vanilla), but going forward with new mods, they could match the name to automatically bypass the need for ini editing.
As far as I know MW loads all bsa available, independently from the used ESX stack. I would be glad, if someone could prove me wrong about that, because it would make our life a lot easier, but I do not think this will happen.
Yeah, like I said there's still plenty we can do with the current ESM/P format by simply adding new records, or somehow "tagging" record data to specify if it's in a new format or not. I've been talking about you wanting to make a new format, which I think should be clearer and easier to handle instead of replicating the ESM structure.
Seems you have completely misunderstood me here. When I said new format, I meant exactly what you describe above. There are only three very minor changes that go beyond it:

- We replace the header, because the old header is extremely messy and full of junk. In particular floating point numbers for version information just suck too hard and I feel an instinctive need to eradicate all traces of this abomination. I guess we could fix up the original header record instead by adding an alternative version sub-record, if you think that is preferable.

- We give it a new extension, so that MW does not accidentally load such a file and choke on it with a CTD or something similarly cryptic.

- We declare the bytecode section of scripts deprecated.
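
To illustrate the complaint about floating-point version numbers in the first point, here is a minimal Python sketch. It assumes the version is stored as a 32-bit little-endian float and contrasts that with a hypothetical integer sub-record made of two 16-bit fields. The integer layout and the function names are assumptions for illustration only; nothing in this thread fixes them.

```python
import struct


def roundtrip_float_version(version: float) -> float:
    """Store a version number as a 32-bit float and read it back."""
    return struct.unpack("<f", struct.pack("<f", version))[0]


def encode_int_version(major: int, minor: int) -> bytes:
    """Hypothetical alternative: two unsigned 16-bit integers."""
    return struct.pack("<HH", major, minor)


def decode_int_version(data: bytes) -> tuple:
    """Read back the (major, minor) pair written by encode_int_version."""
    major, minor = struct.unpack("<HH", data)
    return major, minor


stored = roundtrip_float_version(1.3)
# 1.3 has no exact 32-bit float representation, so an equality check fails.
float_matches = stored == 1.3
# The integer pair survives the round trip unchanged.
int_matches = decode_int_version(encode_int_version(1, 3)) == (1, 3)
```

With a float field, any tool that wants to test for "version 1.3" has to compare with a tolerance or against the exact stored bit pattern; the integer pair can be compared directly.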

The rest will just be adding new but optional records and sub-records. Formally it is a new format. From MW's point of view an omwbase file (if we would somehow force MW to accept it) would not register as an updated version of ESX that is just too new for MW, but as an incompatible/different format.
I don't think we need to support the original VM, but it may be a good idea to make a converter that takes in the original's bytecode and creates something our VM uses (kind of like what wined3d does for the d3d shader bytecode, converting it to glsl). We may need to anyway, if there's cases where the bytecode is out of sync with the text version.
My first thought was, that this is not needed, because we can always regenerate the VM code from the script text. Are there any scenarios at all where the bytecode is out of sync with the source text and the bytecode is more up to date than source? I can not think of any.
The only situation that I can think of is a plugin with a script that fails to compile. Even if we are ignoring the fact, that such a plugin would be broken and we are under no obligation to handle it, I don't think the CS even allows saving in this case.

Re: File Format

Post by Chris »

Zini wrote:Obviously a plugin that enhanced the UI would have to use Qt, but I do not see why that would be a problem.
People would tend to prefer a GUI app to use their WM's toolkit, and follow whatever UI standard there is for their system. The latter could be solvable by making the UI scriptable, but the former can't.
Components subsystems? We are already at that stage, at least for individual records.
The components subsystem is a bit too inclusive (contains all of nif, nifogre, nifbullet, bsa, etc). Plus, it's a static lib with a volatile API.
As far as I know MW loads all bsa available, independently from the used ESX stack.
Vanilla Morrowind only loads BSAs specified in Morrowind.ini, under the [Archives] section. A number of mods actually come with a tool that would "register" a BSA in the ini, since otherwise it gets ignored and the mod would have missing data.
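
A rough Python sketch of the two registration mechanisms discussed here: archives listed in the ini, plus the proposed name matching against content files. It assumes the `[Archives]` entries look like `Archive 0=Tribunal.bsa`; the function names and the merge order (ini entries first) are hypothetical and not anything OpenMW is known to implement.

```python
import configparser
from pathlib import PurePath

SAMPLE_INI = """\
[Archives]
Archive 0=Tribunal.bsa
Archive 1=Bloodmoon.bsa
"""


def archives_from_ini(ini_text: str) -> list:
    """Return the BSA names listed under [Archives], in file order."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("Archives"):
        return []
    return [value for _key, value in parser.items("Archives")]


def archives_by_name(content_files, available_bsas) -> list:
    """Hypothetical auto-registration: BSAs whose base name matches a content file."""
    stems = {PurePath(name).stem.lower() for name in content_files}
    return [bsa for bsa in available_bsas if PurePath(bsa).stem.lower() in stems]


def archives_to_load(ini_text, content_files, available_bsas) -> list:
    """Ini-registered archives first, then name-matched ones not already listed."""
    result = archives_from_ini(ini_text)
    seen = {name.lower() for name in result}
    for bsa in archives_by_name(content_files, available_bsas):
        if bsa.lower() not in seen:
            result.append(bsa)
            seen.add(bsa.lower())
    return result
```

Under this scheme a new mod shipping `MyMod.esp` and `MyMod.bsa` would need no ini editing, while older mods that rely on an `[Archives]` entry keep working.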
Seems you have completely misunderstood me here.
I see. So then it's just a new header record (being, I assume, more strict with the expected values). In that case, I guess I don't really mind.

What I will say, though, is that I don't think there should be different extensions (i.e. omwbase/omwext) if there isn't a difference in the way the file's data is handled by the engine. I don't think the mere presence of a Masters list really qualifies, nor do I think the game should be forced to only ever have one data file without Masters (there are mods that don't list Masters because they don't need to, since they don't directly reference anything in an esm).
My first thought was, that this is not needed, because we can always regenerate the VM code from the script text. Are there any scenarios at all where the bytecode is out of sync with the source text and the bytecode is more up to date than source? I can not think of any.
I'm not sure, honestly. I've never done much Morrowind scripting myself (I did a bunch of Oblivion scripting, but I hated it, and Morrowind looks worse). I just think if we're going for correctness, we should be using the bytecode data that we know the vanilla game used, rather than the text data that was merely there for the editor.

Re: File Format

Post by Zini »

People would tend to prefer a GUI app to use their WM's toolkit, and follow whatever UI standard there is for their system. The latter could be solvable by making the UI scriptable, but the former can't.
Qt is very good at looking native under most environments. It shouldn't be much of a problem for most people. For some it would.
The components subsystem is a bit too inclusive (contains all of nif, nifogre, nifbullet, bsa, etc). Plus, it's a static lib with a volatile API.
Most components are independent from each other. No need to grab all of them and we can easily add a build option that allows for creating a dynamic library (or several). And the API should stabilise soon for most of them (or has done so already).
Vanilla Morrowind only loads BSAs specified in Morrowind.ini, under the [Archives] section. A number of mods actually come with a tool that would "register" a BSA in the ini, since otherwise it gets ignored and the mod would have missing data.
Interesting. I did not know that. Have to think about it.
What I will say, though, is that I don't think there should be different extensions (i.e. omwbase/omwext) if there isn't a difference in the way the file's data is handled by the engine. I don't think the mere presence of a Masters list really qualifies, nor do I think the game should be forced to only ever have one data file without Masters
Not sure if you can follow. While structurally identical omwbase and omwext are semantically two completely different file types. Consider going into a store to buy a game (yes, I know that is how our grandparents used to buy games, no jokes please). There you have boxes that are games and boxes that are addons. While they usually sit in the same shelf, for the end user they are not the same at all and having them mixed up can be quite unfortunate.
(there are mods that don't list Masters because they don't need to, since they don't directly reference anything in an esm).
Can you give me a hint on what mods that might be? I don't think I have ever encountered something like this.
I'm not sure, honestly. I've never done much Morrowind scripting myself (I did a bunch of Oblivion scripting, but I hated it, and Morrowind looks worse). I just think if we're going for correctness, we should be using the bytecode data that we know the vanilla game used, rather than the text data that was merely there for the editor.
I don't think that would improve correctness. If source and bytecode do not match, the plugin is clearly broken (at least in the context of classic MW). Choosing randomly one option would not improve correctness.

Re: File Format

Post by Chris »

Zini wrote:Most components are independent from each other. No need to grab all of them and we can easily add a build option that allows for creating a dynamic library (or several). And the API should stabilise soon for most of them (or has done so already).
Nif and NifOgre are still changing, and don't look to be stabilizing soon. Honestly, the NifOgre interface is turning into quite the misnomer... the "public" functions and types aren't Nif-specific at all, and will be needed even when using native .skeleton and .mesh files (actually what it will handle is a mini "scene", since a scene graph is defined for an object; scene nodes, bones, entities, billboards, particle emitters, lights, controllers, etc). The esm API will change every time a new type is added.
Not sure if you can follow. While structurally identical omwbase and omwext are semantically two completely different file types.
Yeah, semantics. It'd be like having different text file extensions for original stories and fanfics. Identical in every technical and meaningful way, except one is based on previously-made work (which you may or may not have to know about to understand).

IMO, I think OpenMW should just worry about the file contents, and it's those contents that should specify how it's treated... the extension could be .g0bb1yGook, but as long as the contents are recognized, that's all the engine should care about. As far as the user is concerned, 'omwbase' vs 'omwext' says nothing important -- just that one has a list of masters, which is 99.9% of mods, and the other doesn't, while at the same time also not implying it can be used stand-alone.
Can you give me a hint on what mods that might be? I don't think I have ever encountered something like this.
The MWSE demo esp is one. Don't forget too, with the way Morrowind handles record IDs a mod doesn't have to specify something as a master to change its records, since there's no differentiation between creating a new record and replacing an existing one. I wouldn't be surprised if a number of built-in "compatibility patches" work that way, overriding records that another mod has to make it compatible, without listing it as a master so it doesn't require that other mod to work (since if that other mod isn't enabled, the record is created but nothing references it; if it is enabled, the original record is replaced with one that makes it compatible).

As such, it's completely valid to have an esx with no masters, as long as it doesn't reference a record that it doesn't also define.
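
The "blind edit" behaviour described above can be sketched in a few lines of Python. This is a simplified model, assuming records are keyed only by their ID and that a later file in the load order wins; `merge_records` and the file and record names are made up for illustration.

```python
def merge_records(load_order):
    """Merge records from content files; later files replace earlier ones by ID.

    `load_order` is a list of (file_name, {record_id: data}) pairs.
    Returns the merged records and, for each ID, the file that supplied it.
    """
    merged = {}
    origin = {}
    for file_name, records in load_order:
        for record_id, data in records.items():
            # No distinction between "create" and "replace": same operation.
            merged[record_id] = data
            origin[record_id] = file_name
    return merged, origin


base = ("Morrowind.esm", {"iron_sword": "base"})
other = ("OtherMod.esp", {"fancy_hat": "original"})
patch = ("Patch.esp", {"fancy_hat": "patched"})

# With OtherMod enabled, the patch replaces its record.
with_other, _ = merge_records([base, other, patch])
# Without OtherMod, the same patch simply creates an unreferenced record.
without_other, _ = merge_records([base, patch])
```

In both cases `Patch.esp` loads without listing `OtherMod.esp` as a master, which is exactly why the launcher cannot tell from the file alone what it is meant to be combined with.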
I don't think that would improve correctness. If source and bytecode do not match, the plugin is clearly broken (at least in the context of classic MW). Choosing randomly one option would not improve correctness.
It's not really "choosing randomly". It's choosing the one we know the vanilla game actually uses. It's only random if we can guarantee the two are 100% functionally identical in all cases.

Re: File Format

Post by Zini »

Nif and NifOgre are still changing, and don't look to be stabilizing soon.
I wasn't thinking about these in particular. But we still have plenty of time before API stabilisation even becomes an issue. I doubt anyone would be crazy enough to start on additional tools before OpenMW 1.0.
The esm API will change every time a new type is added.
Extended, yes. Changed, rarely. We certainly can't provide a lot of ABI compatibility. But API compatibility should not be much of an issue.
Yeah, semantics. It'd be like having different text file extensions for original stories and fanfics. Identical in every technical and meaningful way, except one is based on previously-made work (which you may or may not have to know about to understand).
Sure. And while we are at it, let's also drop the .cpp and .hpp extensions. Both file types contain C++ code, so what is the point in differentiating between them?
IMO, I think OpenMW should just worry about the file contents, and it's those contents that should specify how it's treated
OpenMW maybe. Other tools no.
As far as the user is concerned, 'omwbase' vs 'omwext' says nothing important
Sorry, but this sentence does not make any sense at all. Of course this is important to the user. An omwbase file represents a game. An omwext file represents an add-on. These are not the same thing. Ask anyone who ever accidentally bought an add-on without noticing that a base game is required. I am sure that the reply would be loud and angry (that new MW content is unlikely to ever become commercial does not matter here).
just that one has a list of masters, which is 99.9% of mods, and the other doesn't
One master, not a list. Each valid content stack will always contain exactly one omwbase file. You can't have more than one. That would lead right back to the current master/plugin situation, where the distinction becomes pointless.
As such, it's completely valid to have an esx with no masters, as long as it doesn't reference a record that it doesn't also define.
Hm, that is potentially dangerous. Let's assume someone makes an add-on to MW that adds content, but does not use any existing records. Then someone tries to load this add-on with Redemption. That would work, but it would probably trash the game world (make the game unwinnable or something) and the launcher and the editor would have no means of detecting this kind of problem. Sure, that would be a user error. But one that should be avoidable on the software side.
It's not really "choosing randomly". It's choosing the one we know the vanilla game actually uses. It's only random if we can guarantee the two are 100% functionally identical in all cases.
I don't think we are getting anywhere with this point and it is off-topic anyway.

Re: File Format

Post by Chris »

Zini wrote:Extended, yes. Changed, rarely. We certainly can't provide a lot of ABI compatibility. But API compatibility should not be much of an issue.
If it's a shared lib, ABI compatibility will be very important. Unless you want to require all 3rd party utilities to be recompiled any time there's an update to the components libs.
Sure. And while we are at it, let's also drop the .cpp and .hpp extensions. Both file types contain C++ code, so what is the point in differentiating between them?
Except those are meaningful. cpp files are meant to be compiled, while hpp files are meant to be included in a cpp file, but not compiled directly. It tells the user something, and they're used differently. By contrast, there's no difference with omwbase and omwext... users would enable them the same way, and the engine would use them the same way. The only difference is that one has a Masters list (which itself isn't even needed by the engine, since it's the launcher that handles dependency ordering).
Of course this is important to the user. An omwbase file represents a game. An omwext file represents an add-on. These are not the same thing. Ask anyone who ever accidentally bought an add-on without noticing that a base game is required. I am sure that the reply would be loud and angry (that new MW content is unlikely to ever become commercial does not matter here).
Sure, but that's in the documentation. Whoever provides the data files needs to be clear with what other content is needed, whether it's vanilla MW mod(s), OpenMW-specific mod(s), the meshes/textures/sounds from another mod, or some combination. A user seeing 'omwext' isn't told whether it needs redemption.omwbase, morrowind.esm, or whatever else, so they have to look at the documentation to see what it relies on anyway.

If someone neglects to say what their plugin relies on, 'omwext' isn't going to tell them. All they'll know is that it relies on something, which is the case for 99.9% of esx files.
One master, not a list. Each valid content stack will always contain exactly one omwbase file. You can't have more than one. That would lead right back to the current master/plugin situation, where the distinction becomes pointless.
Then how would omwext files specify that they rely on other omwext files? The editor would need to know so it can tell what other records can be referenced, and the launcher would need to know so it can make sure they're all sorted somewhat correctly.
Hm, that is potentially dangerous. Let's assume someone makes an add-on to MW that adds content, but does not use any existing records. Then someone tries to load this add-on with Redemption. That would work, but it would probably trash the game world (make the game unwinnable or something) and the launcher and the editor would have no means of detecting this kind of problem. Sure, that would be a user error. But one that should be avoidable on the software side.
Yeah, it's a problem. It's also why I (constantly :P) say I don't like the way MW handles records. In Oblivion and Skyrim, records are referenced according to a 32-bit index, and that index value is influenced by the masters list (i.e. the first master's records are declared as 0x00xxxxxx, second master's as 0x01xxxxxx, etc, and the newly-defined records take the next high value after the last master; the engine then fixes up these values as it loads a plugin, by cross-referencing the plugin's masters list and the actual plugins loaded for the game).
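
The fix-up step described above can be sketched roughly as follows, in Python. This only models what the paragraph says: the top byte of a 32-bit ID indexes the plugin's own masters list, with the next value meaning the plugin itself, and the loader rewrites that byte to the position in the actual load order. `remap_form_id` and the file names are hypothetical.

```python
def remap_form_id(form_id: int, plugin_masters: list, plugin_name: str, load_order: list) -> int:
    """Translate a plugin-local 32-bit ID into a load-order-wide ID.

    The top byte indexes the plugin's masters list; an index equal to
    len(plugin_masters) refers to records the plugin defines itself.
    """
    local_index = (form_id >> 24) & 0xFF
    object_index = form_id & 0x00FFFFFF
    owners = plugin_masters + [plugin_name]
    if local_index >= len(owners):
        raise ValueError(f"ID {form_id:#010x} points past the masters list")
    owner = owners[local_index]
    # list.index raises ValueError if a required master is not loaded.
    global_index = load_order.index(owner)
    return (global_index << 24) | object_index


load_order = ["Base.esm", "Extra.esm", "Other.esp", "Mine.esp"]
masters = ["Base.esm", "Other.esp"]

# Record owned by the plugin's second master (local index 1).
from_master = remap_form_id(0x01000ABC, masters, "Mine.esp", load_order)
# Record newly defined by the plugin itself (local index 2).
own_record = remap_form_id(0x02000001, masters, "Mine.esp", load_order)
```

Because every reference names its owner through the masters list, a plugin cannot touch a record from a file it does not declare, which is the property that rules out blind edits.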

Unfortunately there's no clear answer here. Morrowind's system could be converted to Oblivion's and Skyrim's by auto-generating indices for records, giving them appropriate index values based on the masters list and where a record first appears for a given esx file (we could also use a 64-bit integer, raising the plugin "limit" to 4+ billion instead of 255). This would make it impossible to do "blind edits", handling your concerns, but it would also break mods that expect to be able to blindly edit another mod's records as a means of compatibility with it. Unfortunately the two ideas seem to be incompatible.

Re: File Format

Post by Zini »

If it's a shared lib, ABI compatibility will be very important. Unless you want to require all 3rd party utilities to be recompiled any time there's an update to the components libs.
I would assume you would have to do that anyway, if you move to a newer version of OpenMW. Obviously you can have several versions of those libraries around, so if you have a non-updated tool and only want to work with older files, everything is fine.
If you want to handle an upgraded file format (with new records/sub-records), at least rebuilding against an updated library seems a reasonable requirement to me.
Except those are meaningful. cpp files are meant to be compiled, while hpp files are meant to be included in a cpp file, but not compiled directly. It tells the user something, and they're used differently. By contrast, there's no difference with omwbase and omwext
And this is the point where we totally disagree, it seems. omwbase and omwext are used differently, as I have been trying to lay out for some time now, and I am running out of ideas for how to explain it differently.
users would enable them the same way, and the engine would use them the same way
I agree on the second part, but I completely disagree with the first part. omwbase files represent different games. The user will select the game he wants to play in the launcher. And then the user will be presented with a list of valid add-ons for this game (we are already doing that to some degree, I think). Once we actually have more than one omwbase file, this will be an essential tool to manage content.
Sure, but that's in the documentation. Whoever provides the data files needs to be clear with what other content is needed, whether it's vanilla MW mod(s), OpenMW-specific mod(s), the meshes/textures/sounds from another mod, or some combination.
Documentation alone is not enough. Our tools need a way to partition the set of content files into "game universes" (for lack of a better term). Otherwise the launcher wouldn't be able to do its job. Same for a package management tool, if we ever decide to write one. And the easiest way to do that is to have a single base file per universe, that functions as the root of the dependency tree.
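
A minimal Python sketch of that partitioning, assuming each content file declares a list of files it depends on and that base files are recognisable by an `.omwbase` extension. The `parents_of` mapping, the function names and the file names are all hypothetical.

```python
def root_bases(name, parents_of, _seen=None):
    """Return the set of omwbase files reachable from `name` via its dependencies."""
    seen = set() if _seen is None else _seen
    if name in seen:
        return set()
    seen.add(name)
    if name.endswith(".omwbase"):
        return {name}
    roots = set()
    for parent in parents_of.get(name, []):
        roots |= root_bases(parent, parents_of, seen)
    return roots


def partition_by_universe(parents_of):
    """Group content files by the single base file they ultimately depend on.

    Files with zero or several reachable base files are collected under None,
    since a launcher could not assign them to exactly one game.
    """
    universes = {}
    for name in parents_of:
        roots = root_bases(name, parents_of)
        key = next(iter(roots)) if len(roots) == 1 else None
        universes.setdefault(key, []).append(name)
    return universes


parents_of = {
    "morrowind.omwbase": [],
    "redemption.omwbase": [],
    "tribunal.omwext": ["morrowind.omwbase"],
    "patch.omwext": ["tribunal.omwext"],
    "quest.omwext": ["redemption.omwbase"],
    "orphan.omwext": [],
}
universes = partition_by_universe(parents_of)
```

The `None` bucket is where a dependency-free add-on ends up: nothing in the file says which game it belongs to.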

btw. if we would both allow a no-dependency type plugin and go for the one file type approach, everything falls apart (not that I propose to go down that route). The launcher would have no way to determine if a file without dependencies is something like Morrowind.esm or some plugin that just does not reference any existing content.
Then how would omwext files specify that they rely on other omwext files? The editor would need to know so it can tell what other records can be referenced, and the launcher would need to know so it can make sure they're all sorted somewhat correctly.
omwext files can specify that they depend on omwext files. But not on multiple omwbase files. I guess that is what you meant by master list? The term is a bit mushy.
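
As a sketch of what a launcher check along these lines might look like (Python; the function name, the `parents_of` mapping and the file names are assumptions, not an existing OpenMW interface): a selected stack must contain exactly one omwbase file, every declared dependency must be selected too, and dependencies are sorted before the files that need them.

```python
def order_stack(selected, parents_of):
    """Validate a content stack and return it with dependencies first."""
    bases = [name for name in selected if name.endswith(".omwbase")]
    if len(bases) != 1:
        raise ValueError(f"expected exactly one omwbase file, found {len(bases)}")
    chosen = set(selected)
    ordered = []
    state = {}  # 1 = being visited, 2 = finished

    def visit(name):
        if state.get(name) == 2:
            return
        if state.get(name) == 1:
            raise ValueError(f"dependency cycle through {name}")
        state[name] = 1
        for parent in parents_of.get(name, []):
            if parent not in chosen:
                raise ValueError(f"{name} needs {parent}, which is not selected")
            visit(parent)
        state[name] = 2
        ordered.append(name)

    for name in selected:
        visit(name)
    return ordered


parents_of = {
    "game.omwbase": [],
    "a.omwext": ["game.omwbase"],
    "b.omwext": ["a.omwext", "game.omwbase"],
}
stack = order_stack(["b.omwext", "a.omwext", "game.omwbase"], parents_of)
```

This is a plain depth-first topological sort; the only format-specific rule is the "exactly one base file" check at the top.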
Yeah, it's a problem. It's also why I (constantly :P) say I don't like the way MW handles records. In Oblivion and Skyrim, records are referenced according to a 32-bit index, and that index value is influenced by the masters list (i.e. the first master's records are declared as 0x00xxxxxx, second master's as 0x01xxxxxx, etc, and the newly-defined records take the next high value after the last master; the engine then fixes up these values as it loads a plugin, by cross-referencing the plugin's masters list and the actual plugins loaded for the game).
Actually, the situation is not that dire. I think we can safely ignore esp files without dependencies, because formally they are invalid. There is no way to produce one with the CS. Any plugin of this type would have been hacked together in some way and the only example for this type of file listed here so far is for a 3rd party tool that we don't support anyway.

Re: File Format

Post by Chris »

Zini wrote: If you want to handle an upgraded file format (with new records/sub-records), at least rebuilding against an updated library seems a reasonable requirement to me.
If you want to handle those new (sub-)records, sure, but handling existing records shouldn't require a recompilation, IMO, since they're unchanged. Particularly also if it's an old file that couldn't have been made with those new (sub-)records even if it wanted.
omwbase files represent different games.
How many omwbase files do you expect? I can think of two off the top of my head.. the Redemption TC, and our Example Suite. In almost 11 years, there's only one project that's even close to being a stand-alone game on the level that Morrowind.esm is. I can't even think of any in the making for Oblivion (that one by SureAI doesn't count since it does still require Oblivion's data files). With so few and far between, is an omwbase file extension really useful to people?

I think at this point, we're just not going to agree. Not the least because it's still a far ways off before this even becomes an issue (we should be getting to 1.0 before worrying about adding or replacing stuff that vanilla MW might have trouble with).

My final comment on the issue is that I don't like the 'omwbase' and 'omwext' extensions aesthetically. IMO, if it's contracted it should try to stick to three characters, and if you're going to go beyond that limit you may as well go all the way so it's clear. Could even go a little crazy and use a "double extension":
.openmw.game
.openmw.addon
.openmw.save
omwext files can specify that they depend on omwext files. But not on multiple omwbase files. I guess that is what you meant by master list? The term is a bit mushy.
Yeah. A list of content files that defines records that it references. It's a list of required files that must go first (perhaps "parents" would be a better term than "masters"?).
There is no way to produce one with the CS. Any plugin of this type would have been hacked together in some way and the only example for this type of file listed here so far is for a 3rd party tool that we don't support anyway.
Bit of an aside, but I assume we're going to have MWSE compatibility? Not support MWSE itself, but implement the scripting extensions it introduces, since there's plenty of mods that use them. The MWSE demo esp is basically the "toddtest" of MWSE features.

To be clear, I don't have a real issue if you don't want to support esx files that don't have a list of masters; they're rare enough and relatively easy to fix (could probably even add an option to esmtool to fill an empty list of masters with something).

Re: File Format

Post by Zini »

If you want to handle those new (sub-)records, sure, but handling existing records shouldn't require a recompilation, IMO, since they're unchanged. Particularly also if it's an old file that couldn't have been made with those new (sub-)records even if it wanted.
It depends on what you mean by handling. If there is any processing and modification involved (i.e. loading a content file, modifying something and then saving again) using a newer version with additional records should definitely require recompilation, even if the new records are not touched.
In general I do not think it is unreasonable that, if you want to handle a newer version of a file format, you also should update to a new version of the library used to handle this file format; which may or may not require rebuilding a tool using this library.
How many omwbase files do you expect?
We have Morrowind, Redemption and the OpenMW Example Suite. I am fairly certain that there was another TC floating around, probably in the German modding community. So that would make 4 then.

But with the extended modding capacities of OpenMW and the improved ease of use and productivity of the new editor, I expect a substantial re-vitalisation of the MW modding community. It is very well possible that we will see one or more additional TCs in the coming years. Let's work with an optimistic view of the future.
Not the least because it's still a far ways off before this even becomes an issue (we should be getting to 1.0 before worrying about adding or replacing stuff that vanilla MW might have trouble with).
Not to start nitpicking here, but the whole reason for this thread is that our new editor (which hopefully will release not too much later than OpenMW 1.0) is not able to produce files that can be loaded by MW (because of the VM issues). Otherwise I wouldn't have started the discussion, because content file format improvements are indeed a clear post-1.0 topic.
My final comment on the issue is that I don't like the 'omwbase' and 'omwext' extensions aesthetically. IMO, if it's contracted it should try to stick to three characters, and if you're going to go beyond that limit you may as well go all the way so it's clear. Could even go a little crazy and use a "double extension":
.openmw.game
.openmw.addon
.openmw.save
This is indeed mostly an issue of aesthetics. I am outright against the three character convention. It was a poor idea to begin with and hasn't gotten any better since MS-DOS. For anything else, I honestly don't know. Maybe we should put up a list of possible candidates for public voting, since from a technical perspective it really does not matter.
perhaps "parents" would be a better term than "masters"?).
Makes sense. Let's try to stick to that in the future.
Bit of an aside, but I assume we're going to have MWSE compatibility? Not support MWSE itself, but implement the scripting extensions it introduces, since there's plenty of mods that use them. The MWSE demo esp is basically the "toddtest" of MWSE features.
That is not decided yet. Personally I am not in favour of it; for various reasons. I suggest we have this discussion another time, since it is definitely not relevant to 1.0 development.
To be clear, I don't have a real issue if you don't want to support esx files that don't have a list of masters; they're rare enough and relatively easy to fix (could probably even add an option to esmtool to fill an empty list of masters with something).
I am not outright against them, at least not in combination with the 2 file scheme. But I don't think the amount of work necessary to make them function reasonably well is worth the effort. Let's forbid them for now and if we end up with a mob of angry modders who demand support for them, we can add it later.