Improved File Format for omwaddon files

lambda · Post by **lambda** » 31 May 2020, 15:55

ponyrider0 wrote: ↑30 May 2020, 19:23 Representing a mod file as a series of SQL statements is basically what is already done by WeiDU mods. Certain operations like generating or modifying records based on a procedural algorithm can be very efficiently represented as a few lines of code. One of the downsides of this approach is the processing time required to generate those records at mod-installation time or at game-launch time. Another downside is the inefficiency in storing blocks of data that cannot be procedurally generated. In that case, representing the records as raw table data would be more efficient than trying to encode the data as a series of SQL insert statements. Probably the most significant downside to mods based on SQL statements is a very steep learning curve to create mod packages. In the Infinity Engine/WeiDU modding community, this steep learning curve has resulted in a large number of mods being developed by non-coders using a game database editor. These non-coding mod-makers then have to rely on a very small number of mod-makers with WeiDU coding skills to help them package their mod into an installable WeiDU package.

I do not have much to say about your points as they regard OpenMW, but the statements regarding the Infinity Engine and WeiDU are puzzling and do not reflect my experience at all. Yes, most mods are cobbled together using NI or DLTCEP, but "These non-coding mod-makers then have to rely on a very small number of mod-makers with WeiDU coding skills to help them package their mod into an installable WeiDU package."? That is definitely not my experience from hanging out at G3 and SHS (I do not know about the Beamdog forums). And the WeiDU language is just plain horrible, which is somewhat surprising given that the tool is written in OCaml, and one would expect any user of the ML family of languages to come up with something better. In other words, part of the steep learning curve, which is there all right, is an artifact of the API, if I can (ab)use this term.

But this is tangential to the OP, so leaving it at that.

ponyrider0 · Post by **ponyrider0** » 31 May 2020, 19:08

lambda wrote: ↑31 May 2020, 15:55 I do not have much to say about your points as they regard OpenMW, but the statements regarding the Infinity Engine and WeiDU are puzzling and do not reflect my experience at all. Yes, most mods are cobbled together using NI or DLTCEP, but "These non-coding mod-makers then have to rely on a very small number of mod-makers with WeiDU coding skills to help them package their mod into an installable WeiDU package."? That is definitely not my experience from hanging out at G3 and SHS (I do not know about the Beamdog forums).

Yes, your experience is different from mine and that's fine. I've removed the subjective, emotionally-loaded words from my previous post. However, I still stand by my assertion that the steep learning curve in WeiDU scripting compared to the relatively lower learning curve required to use one of the GUI game editors is indeed real. In my opinion, it poses a barrier to entry into the mod-making community; discouraging some people from sharing their private mods -- especially smaller mods that they might feel are not worth the effort. I think it is a testament to the passion and dedication of the Infinity Engine/WeiDU community that it is as large and long-lasting as it is in spite of this and other barriers.

So the lesson for me is this: defining/creating a mod format based on a programming language is incredibly powerful, but we should take care NOT to require any scripting knowledge for making and distributing mods. Otherwise, we risk stifling creativity and limiting growth potential of the community.

AnyOldName3 · Post by **AnyOldName3** » 31 May 2020, 21:39

lambda wrote: ↑31 May 2020, 15:55 And the WeiDU language is just plain horrible, which is somewhat surprising given that the tool is written in OCaml, and one would expect any user of the ML family of languages to come up with something better.

I've heard some good things about OCaml, but my experience of Standard ML at university was guessing which random placement of brackets the compiler had decided it wanted that day, despite the spec seeming to tell me that the first thing I tried was the only version that could possibly work. The family isn't all good.

psi29a · Post by **psi29a** » 01 Jun 2020, 00:38

My professional experience using OCaml has me avoiding it at all costs. We had all kinds of problems with their Lwt implementation, for example.

ponyrider0 · Post by **ponyrider0** » 01 Jun 2020, 07:16

bmw wrote: ↑30 May 2020, 21:11 I think this is mostly a problem with the fact that, if I understand correctly, you are using this system more for tracking changes and you aren't modifying the JSON directly. Modifying the editor to produce more consistent output would help, but in the end, the best way would be to have the editor also handle the text format, which should allow it to produce changes which are sane even for multi-record files.

I don't think we are on the same page here. Instead of trying to describe the use-cases/scenarios and the desired results vs actual results, I will demonstrate with an actual production example. It will take me a few days to reproduce it and repost it to GitHub.

Writing 450 000 files can't be particularly fast, but this sort of thing should be able to be done in much less than 4 hours.

Agreed. I'm sure this can be improved dramatically... to a certain extent. Even just a Windows 10 folder copy procedure takes 50 minutes on an SSD connected with SATA600. So the filesystem performance bottleneck would still be noticeable even if other export operations were reduced to zero time.

ponyrider0 wrote: ↑30 May 2020, 07:11 Another issue with the current design is that completely deleting a record/file in an ESP cannot be propagated to a master with a simple file-tree copy procedure. My current plan is to leverage the ESM format's "Delete" flag bit to mark a record as deleted, then these files can be purged in a post-processing step at any point after merging into the master repository.
It should be possible using a git commit or a patch file, though. Why are you copying file trees to make changes to the master? Is it because you're modifying the ESM, exporting a new tree, and copying it onto the old one? I would think you could just replace the old tree with the new one (or use something like "rsync -a --delete" to apply the changes).

Sorry for the confusion: I'm using the phrase "simple file-tree copy procedure" as a description of what happens when two completely independent repositories are merged, whether it is by a "git merge" procedure or by a manual filesystem copy... I will elaborate on this as well in a few days using the example I will be reposting to GitHub as mentioned above.

Thanks for everyone's comments and suggestions thus far. Hopefully, you all are able to see these posts for the insights related to practical issues with using and designing a text-based/source-code mod format rather than just a series of posts going off-topic.

bmw · Post by **bmw** » 09 Jun 2020, 03:58

ponyrider0 wrote: ↑01 Jun 2020, 07:16 Sorry for the confusion: I'm using the phrase "simple file-tree copy procedure" as a description of what happens when two completely independent repositories are merged, whether it is by a "git merge" procedure or by a manual filesystem copy... I will elaborate on this as well in a few days using the example I will be reposting to GitHub as mentioned above.

I still don't really see why you would be merging two repositories if they are completely independent, but I agree that we're not really on the same page, so I'll wait and see what your production example consists of, which hopefully will make things clearer.

ponyrider0 wrote: ↑31 May 2020, 19:08 So the lesson for me is this: defining/creating a mod format based on a programming language is incredibly powerful, but we should take care NOT to require any scripting knowledge for making and distributing mods. Otherwise, we risk stifling creativity and limiting growth potential of the community.

I certainly agree that having a scripting interface be the primary method of writing plugins would be a bad idea, but at the same time I think having an optional script interface of some sort would be useful.

There are mods such as True Unleveled Skyrim, True Medieval Economy, True Realistic Item weights, and Total Equipment Overhaul (admittedly, these are Skyrim examples), and for Morrowind there's raremagic4openmw.
In the former cases, my experience was that generating the plugins is painfully slow (between all the plugin generators it takes hours on a large modlist), all the more painful for being stuck behind a clunky JavaScript UI which is glitchy in Wine. Having a built-in scripting system for generating such mods would bring the benefit of a more efficient implementation than the various things people have cobbled together (not that they're all bad, but I think we can do better). Admittedly, it still wouldn't do anything for poorly optimized scripts, short of creating a domain-specific language that can only be used to operate on the data files in linear time.
I'm not that familiar with the plans for OpenMW's Lua scripting engine, but my understanding was that it is more for runtime scripts than for scripts that pre-process data files. Perhaps it would be possible to have part of that engine do pre-processing in this manner and automatically cache the results, re-generating them when the mod configuration has changed?
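To make the caching idea concrete, here is a minimal sketch in Python (not Lua, and not anything OpenMW actually provides). It assumes the "mod configuration" can be summarised as the ordered list of plugin files plus their contents. The names `config_fingerprint` and `cached_generate`, and the JSON cache layout, are hypothetical and chosen only for illustration.

```python
import hashlib
import json
from pathlib import Path


def config_fingerprint(plugin_paths):
    """Hash the load order and the contents of every plugin file."""
    digest = hashlib.sha256()
    for path in plugin_paths:
        # Order matters: the same files in a different load order
        # produce a different fingerprint.
        digest.update(path.name.encode("utf-8"))
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def cached_generate(plugin_paths, cache_dir, generate):
    """Run `generate` only when no cached result matches the fingerprint.

    `generate` is a hypothetical pre-processing callback that takes the
    plugin paths and returns JSON-serialisable data.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / (config_fingerprint(plugin_paths) + ".json")
    if cache_file.exists():
        # Configuration unchanged since the last run: reuse the result.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = generate(plugin_paths)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result
```

With this shape, changing any plugin or reordering the list changes the fingerprint, so the generator runs again. Otherwise the user never waits for regeneration.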

At the same time, the current situation isn't that bad when combined with a package manager that can regenerate the files automatically, but that only works if the generator can run headless. Being able to use a built-in scripting engine would remove the temptation to make generators more "user-friendly" by wrapping them in a GUI, as the user would just need to install the scripts and let the engine handle the rest.

A couple more ideas:

Operations:
I think that having some sort of operations available for the "delta"-style records will be useful, given that they're already needed to handle sets and maps. A unified way of handling such modifications would be an improvement over the more ad-hoc system I have implemented at the moment.

E.g. for a record Foo with a set of flags, in addition to simply overriding with a completely new set, you should be able to add and remove flags from the set in the master's version of the record (I have the add and remove operations already working in deltaplugin, albeit with a slightly different syntax).

Code:

DeltaFoo:
    flags:
        add: [foo, bar]
        remove: [baz]

That is, while the value of flags in Foo would just be a set of instances of a certain enumerated type, the value of flags for DeltaFoo could be either the same kind of set, for doing overrides, or a mapping from operation names to values whose type is specific to the operation (operations would be applied in order of declaration).
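A rough sketch of how such a delta might be applied, written in Python purely for illustration. This is not deltaplugin's actual implementation, and the function name `apply_flag_delta` is made up. It assumes flags are plain strings and that the parsed mapping preserves declaration order, as Python dicts do.

```python
from collections.abc import Mapping


def apply_flag_delta(master_flags, delta):
    """Return the flag set that results from applying `delta` to the master.

    `delta` is either a collection of flags (a full override) or a mapping
    of operation name -> flags, applied in order of declaration.
    """
    if not isinstance(delta, Mapping):
        # Plain set/list: override the master's flags completely.
        return set(delta)

    result = set(master_flags)
    for operation, flags in delta.items():
        if operation == "add":
            result |= set(flags)
        elif operation == "remove":
            result -= set(flags)
        else:
            raise ValueError(f"unknown set operation: {operation!r}")
    return result
```

For the DeltaFoo example, `apply_flag_delta({"baz", "qux"}, {"add": ["foo", "bar"], "remove": ["baz"]})` keeps `qux`, adds `foo` and `bar`, and drops `baz`. Because operations run in declaration order, `add` followed by `remove` of the same flag leaves it unset, while the reverse order leaves it set.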

This would extend easily to other field types such as strings, where you could have things like append and substitute operations.
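As an illustration of what string operations could look like, here is a small Python sketch. The `append` and `substitute` operation names come from the paragraph above, but the argument shape for `substitute` (a mapping with `old` and `new` keys) and the function name `apply_string_delta` are assumptions made up for this example.

```python
def apply_string_delta(master_value, delta):
    """Apply a string delta to the master's value.

    `delta` is either a plain string (a full override) or a mapping of
    operation name -> argument, applied in order of declaration.
    """
    if isinstance(delta, str):
        # Plain string: override the master's value completely.
        return delta

    result = master_value
    for operation, argument in delta.items():
        if operation == "append":
            result += argument
        elif operation == "substitute":
            # Hypothetical argument shape: {"old": ..., "new": ...}
            result = result.replace(argument["old"], argument["new"])
        else:
            raise ValueError(f"unknown string operation: {operation!r}")
    return result
```

For instance, applying `{"substitute": {"old": "Iron", "new": "Steel"}, "append": " of Fire"}` to `"Iron Sword"` would yield `"Steel Sword of Fire"`, without the delta needing to repeat the rest of the master's text.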

It would be less flexible than a string-based expression where you could make use of variables and references to other records, but more structured and easier to define.

Localization:
The other thing that had occurred to me was that we should really, at the very least for the text-based format, have a way of internationalizing plugins by including all the localizations within the same directory tree (i.e. one plugin, multiple localization files, rather than a plugin for each localization). I'm not super familiar with localization systems, but some quick research turned up GNU gettext, Project Fluent, and International Components for Unicode (ICU). Does anyone have any thoughts or experience in terms of the most suitable localization system?
Fluent is very modern and looks quite nice to work with in a text editor, unlike gettext's .po files. There isn't a C++ library at the moment; however, it has good Rust support, and I could probably create a wrapper around the Rust library without too much difficulty (I'm leaning towards this option at the moment).
ICU's documentation is messy and unclear, and I feel like it would be more of a hassle to work with than the other two.
Gettext's translation files seem like a pain to deal with, but there are also GUI editors like Lokalize and Gtranslator, which some people might prefer to text editors (though they're only really necessary because gettext's format isn't nice to work with directly...).

Then there's the question of whether localization support should be included in the binary format as well, or whether plugins should be localized at release time, when the text format is converted into the binary format. I don't feel like localizing text is a particularly performance-sensitive situation, and the engine would need to support localization anyway if it is to support the text format. Plus, all of these localization systems are more expressive than the current format strings (they support pluralization, for example), so localizing at release time will probably take significantly more effort than just doing localization at runtime.
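To illustrate the pluralization point, here is a toy Python comparison. It is not the API of gettext, Fluent, or ICU. The function names and the two-category "one"/"other" rule are simplifying assumptions that only hold for English-like languages.

```python
def format_plain(template, count):
    """A plain format string: one template, whatever the count is."""
    return template.format(count=count)


def format_plural(forms, count):
    """Pick a message variant based on the runtime value, then format it.

    `forms` maps a plural category to a template. The category rule here
    is a hypothetical English-only simplification.
    """
    category = "one" if count == 1 else "other"
    return forms[category].format(count=count)
```

`format_plain("You have {count} arrows.", 1)` yields the ungrammatical "You have 1 arrows.", whereas `format_plural` can select "You have {count} arrow." for a count of 1. Since the right variant depends on a value only known while the game is running, a release-time conversion would have to carry every variant into the binary format anyway.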
