EvilEye wrote: ↑20 May 2020, 15:53
Almost every time I do a bit of modding I think it'd be nice to be able to use git to create the ESP. And in the context of TR and PT, I'd like claims to be mergeable using PRs/MRs. Now I only did a bit of thinking and never implemented anything, but these were the steps/features I had in mind.
- Create a utility to turn an ESP into a directory structure containing yaml files
- Group record types into directories
- Group dialogue topics into a directory per topic
- Create a lock file (in the spirit of npm and yarn) to allow reproduction of the original ESP (byte for byte)
- Create a utility to turn a directory structure into an ESP (with or without a lock file)
That's also more or less what I had in mind, unless I'm misunderstanding anything.
The tool's output should be deterministic by default, except for content from master files when converting to ESPs (though, as mentioned before, it would be ideal to move away from requiring copying master information into plugins). ESPs also already have a sort of lock in the form of the masters list, but it could certainly be improved (file size is a poor hash function).
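To illustrate why size alone makes a weak lock, here is a minimal sketch that records a content digest next to the size. The function name, the entry layout and the sample bytes are all made up for the example; nothing here is an existing tool's format.

```python
import hashlib

def master_lock_entry(name: str, data: bytes) -> dict:
    """Describe a master file by its size and by a SHA-256 digest of its content."""
    return {
        "name": name,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

# Two hypothetical revisions of a master: same length, different content.
original = b"GMST fJumpHeight=1.0"
modified = b"GMST fJumpHeight=9.0"

a = master_lock_entry("Example.esm", original)
b = master_lock_entry("Example.esm", modified)

same_size = a["size"] == b["size"]        # the size check cannot tell them apart
same_digest = a["sha256"] == b["sha256"]  # the digest can
print(same_size, same_digest)
```

Any edit that keeps the file length unchanged slips past a size check, whereas the digest changes with any change in content.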
I'd also rather not focus too much on binary-equivalent reproductions of ESPs, as there are things in the format, such as filler data in fixed-length strings, which would be really annoying to have to reproduce. I don't know how the original engine handles these, or even if there is much variation, but if the engine ignores everything after the first null byte then there could be arbitrary data stored in the rest of the subrecord which we would need to track. That doesn't seem worth it, given that my thought would be to use the yaml files as the authoritative version of a mod, so the only reason for such exact reproductions would be validating the translation tool.
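For a sense of what tracking that filler would involve, here is a rough sketch. It assumes a 32-byte field and latin-1 decoding (picked only because it round-trips every byte value); both are assumptions for the example, not claims about the actual format, and the function names are hypothetical.

```python
def split_fixed_string(raw: bytes) -> tuple[str, bytes]:
    """Split a fixed-length field into its text and whatever follows the first null."""
    text, _, rest = raw.partition(b"\x00")
    # Padding made only of null bytes carries no information, so store nothing.
    filler = rest if rest.strip(b"\x00") else b""
    return text.decode("latin-1"), filler

def join_fixed_string(text: str, filler: bytes, width: int) -> bytes:
    """Rebuild the fixed-length field, restoring any recorded filler bytes."""
    encoded = text.encode("latin-1")
    body = encoded + b"\x00" + filler if filler else encoded
    return body.ljust(width, b"\x00")

WIDTH = 32  # assumed field width for this example
clean = b"Fargoth".ljust(WIDTH, b"\x00")
dirty = b"Fargoth\x00junk".ljust(WIDTH, b"\x00")

clean_text, clean_filler = split_fixed_string(clean)
dirty_text, dirty_filler = split_fixed_string(dirty)

# Both fields read as the same string, but only the second needs extra data
# in a lock file to be reproduced byte for byte.
clean_ok = join_fixed_string(clean_text, clean_filler, WIDTH) == clean
dirty_ok = join_fixed_string(dirty_text, dirty_filler, WIDTH) == dirty
```

The filler has to live somewhere outside the human-edited files, which is exactly the sort of bookkeeping I'd rather avoid unless it's needed for validation.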
I've already implemented splitting out scripts into their own files (the implementation could use improvement, but it's early days yet), and my thoughts were to have support for include directives so that you could have a main file which includes a bunch of other files, or directories full of files. I don't think grouping record types into directories should be a required thing, but it could certainly be an optional way to structure a project.
Chris wrote: ↑21 May 2020, 15:40
A text-based format sounds like a really bad idea. The main benefit of being text-based, human-readability, quickly goes out the window with non-small files. Similarly if you try to enforce rules for data structuring, it's easy to create a monster that is not pleasant to create, edit/diff, or use. On top of that, the resulting file size will be significantly larger, taking more time to load, and be all-around more problematic to load (many more opportunities for invalid input that needs to be checked). Additionally, you have to be more careful of formatting; DOS or Unix line-endings? Errant formatting or escape characters? Code page encoding (ascii-7, extended latin, russian, utf-8, utf-16 BE/LE, ...)? Especially on Windows, it's easy for things to unknowingly get changed on you.
Any potential benefit is lost by the time a project gets to medium size, while also creating more baggage and additional problems.
It would certainly be possible to also have a binary format that is equivalent to the text format which it could be transcoded into at release time.
Cap'n Proto was something I'd briefly looked at, being a binary format with no parsing cost, but I thought I'd stick with focusing on text formats in my prototype tool (though any format supported by serde could actually be used with an extremely small code change).
As for problems such as line endings, formatting, encoding and large file sizes, those are the same problems that software development has always had to deal with, and the solutions are no different than any other situation. Large file sizes can be mitigated by breaking them up into smaller, meaningfully structured files, and using include directives of some sort to link the files together. Inline comments also help significantly in improving the readability of text files. Encoding should be standardized (the yaml spec calls for utf-8, utf-16 or utf-32, for example). Most markup parsers will ignore errant formatting, and DOS line endings are just Unix line endings with extra trailing whitespace on each line.
Admittedly there may be issues when whitespace denotes scope, which makes yaml a potentially problematic language to use as it would be easy, particularly for the inexperienced, to accidentally break files with incorrect formatting. That being said, it does also support braces to denote scope, the use of which could be encouraged to avoid such problems. I don't think yaml is the perfect language for the job, but I've yet to find a suitable replacement.
AnyOldName3 wrote: ↑19 May 2020, 23:37
An ESP is already kind of a diff, though.
The trouble is that the control you have with such diffs in the current format is very coarse, as it only allows changing entire records as a whole, which leads to a bunch of issues. The sort of format I'm proposing would allow diffs at the smallest meaningful level, the individual field (for most fields, it's usually not meaningful to change just part of a field at a time; scripts and books, on the other hand...). It would also allow text diffs of those diffs, though I think the only advantage there would be the ability to use existing diff tools to handle them.
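A small sketch of the difference, with records modelled as plain dictionaries. The field names and values are invented for the example, and removing a field is not handled.

```python
def field_diff(base: dict, changed: dict) -> dict:
    """Return only the fields whose values differ from the base record."""
    return {k: v for k, v in changed.items() if base.get(k) != v}

def apply_field_diff(record: dict, delta: dict) -> dict:
    """Overlay the changed fields on a record, leaving the rest untouched."""
    return {**record, **delta}

master = {"id": "iron dagger", "name": "Iron Dagger", "weight": 3.0, "value": 10}
mod_a = {**master, "weight": 2.5}  # mod A only meant to change the weight
mod_b = {**master, "value": 12}    # mod B only meant to change the value

# Record-level override: the last plugin loaded replaces the whole record,
# so mod A's weight change is silently lost.
whole_record = mod_b

# Field-level deltas: each mod contributes only what it actually changed.
merged = master
for mod in (mod_a, mod_b):
    merged = apply_field_diff(merged, field_diff(master, mod))

print(whole_record["weight"], merged["weight"], merged["value"])
```

With whole-record overrides the two mods conflict even though they touch different fields; with field-level deltas they only conflict if they change the same field.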
AnyOldName3 wrote: ↑19 May 2020, 23:37
it might be good to get people like ElminsterAU involved in that discussion as I'm sure it would be useful for the later games, too.
Yes, feedback from people developing tools for the later games would be useful, though I don't really know what the best way to go about doing that would be.
Greendogo wrote: ↑19 May 2020, 23:32
I definitely support this. We tossed around doing a text-based format long ago, but it was never picked up. The biggest bonus here is being able to do version control adequately. I like the main focus of your proposal of course, but for me the text based aspect is the major win.
The text-based aspect is hugely significant to me too, particularly since getting mods into version control and hosted on places like GitLab and GitHub, or wherever, could also help produce a much more open modding ecosystem than we currently have. I was pushing the improved way of modifying masters in my original post mostly because I thought that was the more significant part of the suggestion (and you saying that you'd tossed around the idea of text-based plugins before confirms my suspicion that this isn't an entirely novel idea).
I don't suppose anyone knows of any other fundamental flaws with the current system that this might not address? If we're going to create a new system, it would be best to spend the time to do it right, rather than figuring out later that it's also flawed.
One further thought that has occurred to me is that including ways of linking field contents together might be useful. That is, you could have a value for an object introduced in a plugin file be dependent on a value of a different object which was introduced in its master file. That way the field could be meaningfully updated when the files are processed, even if the master value is different than it was when the plugin was written.
Instead of constants, fields could be expressions, and the "delta" records could make use of the expression in the field they are replacing as a variable in their expression. This might increase complexity significantly (and loading time), but even an extremely minimal expression language could be powerful (e.g. basic arithmetic operations, string concatenation and substitution). It might be more trouble than it's worth, but it's at least worth consideration.
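As a very rough sketch of what such a minimal expression language could look like, each delta field below is an (operation, argument) pair evaluated against the value it replaces. The operation names and the record layout are purely illustrative assumptions.

```python
import operator

# A deliberately tiny operation set; each takes the old value and one argument.
OPS = {
    "set": lambda old, new: new,  # plain constant, as plugins behave today
    "add": operator.add,
    "mul": operator.mul,
    "concat": operator.add,       # string concatenation
}

def apply_delta(record: dict, delta: dict) -> dict:
    """Evaluate each field expression using the field's current value as its input."""
    result = dict(record)
    for field, (op, arg) in delta.items():
        result[field] = OPS[op](result.get(field), arg)
    return result

# The same plugin delta applied to two hypothetical versions of its master.
delta = {"value": ("mul", 2), "name": ("concat", " (Sharpened)")}
master_v1 = {"name": "Iron Dagger", "value": 10}
master_v2 = {"name": "Iron Dagger", "value": 15}

on_v1 = apply_delta(master_v1, delta)
on_v2 = apply_delta(master_v2, delta)
print(on_v1["value"], on_v2["value"])
```

The plugin says "double whatever the value is" rather than "the value is 20", so it stays meaningful when the master changes underneath it.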
One example I could think of where this might be useful is relative coordinates. Instead of all objects having an absolute position (not that I'm entirely sure how this works at the moment; I haven't gotten there yet in my record implementations), you could have some objects be placed relative to another object. It certainly wouldn't be perfect, as there are still many ways that moving an object could result in invalid positions for the related objects, but objects on top of furniture come to mind as an instance where it would almost always work properly (then again, this would also require some sort of reference frame to handle rotation).
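A minimal sketch of the idea, assuming the parent only rotates about the vertical axis and using the usual counter-clockwise maths convention. I haven't checked how the game actually stores rotations, so the frame, the units and the names here are assumptions for illustration.

```python
import math

def resolve_position(parent_pos, parent_yaw, offset):
    """World position of a child placed at `offset` in the parent's frame (yaw about Z only)."""
    px, py, pz = parent_pos
    ox, oy, oz = offset
    c, s = math.cos(parent_yaw), math.sin(parent_yaw)
    return (px + c * ox - s * oy, py + s * ox + c * oy, pz + oz)

# A plate stored relative to a table, rather than at an absolute position.
plate_offset = (10.0, 0.0, 50.0)

before = resolve_position((100.0, 200.0, 0.0), 0.0, plate_offset)
# Another plugin moves the table and turns it a quarter turn; the plate follows.
after = resolve_position((300.0, 200.0, 0.0), math.pi / 2, plate_offset)
print(before, after)
```

The plate never needs editing when the table moves, though a full solution would need all three rotation axes and some way to deal with the parent being deleted.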
Another is typos. A string substitution operation could easily fix typos in the base Morrowind files no matter what other changes are made to the text in the record prior to the substitution being applied (and fix any replicated typos in other modifications made to the text).
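A quick sketch of why a substitution holds up better than replacing the whole text. The dialogue line and the typo are invented for the example, not taken from the game files.

```python
def fix_typo(text: str, wrong: str, right: str) -> str:
    """A substitution delta: correct every occurrence, whatever else the text contains."""
    return text.replace(wrong, right)

master_text = "Welcome, outlander. Teh census office is ahead."
# Another mod appends a sentence and copies the typo along with it.
other_mod_text = master_text + " Teh guards are watching."

# A whole-record typo patch written against the master discards the other mod's sentence.
whole_record_patch = "Welcome, outlander. The census office is ahead."

# The substitution is applied to whatever text results from the earlier changes.
substituted = fix_typo(other_mod_text, "Teh", "The")
print(substituted)
```

The substitution keeps the other mod's addition and fixes the typo it replicated, while the whole-record patch does neither.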