OpenApoc: Replace tinyformat with fmt (https://github.com/fmtlib/fmt)

Created on 16 Sep 2019  ·  9 Comments  ·  Source: OpenApoc/OpenApoc

fmt:

  • Allows positional arguments to ease translation (e.g. the string format("Do {0} to {1}", thing1, thing2) could be "translated" as "Do {1} to {0}", to allow a different word order in the sentence)
  • Currently in the draft C++20 standard, so pretty likely to be in standard libs soon (fewer dependencies, wider use and more programmer exposure are all good)
  • Faster compile times
  • Fast run times

compared to tinyformat.
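
To make the positional-argument point concrete, here is a minimal sketch. It deliberately does not call fmt: format_positional is a hypothetical, dependency-free stand-in that only understands bare "{N}" placeholders, so it shows the idea rather than the library's actual behaviour.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a positional formatter; NOT the fmt API.
// Replaces each well-formed, in-range "{N}" in `pattern` with args[N] and
// copies everything else through unchanged.
std::string format_positional(const std::string &pattern,
                              const std::vector<std::string> &args)
{
    std::string out;
    std::size_t i = 0;
    while (i < pattern.size())
    {
        if (pattern[i] == '{')
        {
            std::size_t j = i + 1;
            std::size_t index = 0;
            while (j < pattern.size() &&
                   std::isdigit(static_cast<unsigned char>(pattern[j])))
            {
                index = index * 10 + static_cast<std::size_t>(pattern[j] - '0');
                ++j;
            }
            const bool has_digits = j > i + 1;
            const bool closed = j < pattern.size() && pattern[j] == '}';
            if (has_digits && closed && index < args.size())
            {
                out += args[index];
                i = j + 1;
                continue;
            }
        }
        out += pattern[i];
        ++i;
    }
    return out;
}

// The call site passes the arguments in one fixed order; only the pattern
// (which is what a translator edits) decides where each one appears.
void demo_positional()
{
    const std::vector<std::string> args = {"thing1", "thing2"};
    assert(format_positional("Do {0} to {1}", args) == "Do thing1 to thing2");
    assert(format_positional("Do {1} to {0}", args) == "Do thing2 to thing1");
}
```

With printf-style "%s" placeholders the second pattern is not expressible without changing the call site, which is the translation problem described above.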

Enhancement Feature Request

All 9 comments

Is this being worked on?

I've got the initial port working locally, but haven't had time to polish it and properly test (e.g. all the translation strings need to be updated, so I was using this as a good point to make the strings extracted automatically from code with xgettext as a build target etc)

It also revealed a number of "bad" translation policies - e.g. stitching together translated text fragments instead of translating the whole string at a time, with something like: format(tr("%s %s"), format(tr("%s thing is %d "), object, count), format(tr("out of %d total"), number));

Which would result in these completely useless strings for translators:
"%s %s"
"%s thing is %d "
"out of %d total"

when it really should be translated as a single call, so the extracted string should be:
"%s thing is %d out of %d total" (or "{0} thing is {1} out of {2} total" in the fmtlib way of doing things)

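The difference between the two approaches can be sketched as follows. Everything here is an assumption made for illustration: tr() is a hypothetical stub that translates nothing and only records the message IDs it is asked for (roughly what an extraction tool would hand to translators), and std::snprintf stands in for the project's printf-style format().

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stub: records every message ID instead of translating it.
std::vector<std::string> seen_msgids;

const char *tr(const char *msgid)
{
    seen_msgids.push_back(msgid);
    return msgid;
}

// Stitched: three fragments are translated separately and glued together.
std::string describe_stitched(const char *object, int count, int total)
{
    char head[128], tail[128], whole[256];
    std::snprintf(head, sizeof head, tr("%s thing is %d "), object, count);
    std::snprintf(tail, sizeof tail, tr("out of %d total"), total);
    std::snprintf(whole, sizeof whole, tr("%s %s"), head, tail);
    return whole;
}

// Whole sentence: one message ID that a translator can actually read.
std::string describe_whole(const char *object, int count, int total)
{
    char whole[256];
    std::snprintf(whole, sizeof whole, tr("%s thing is %d out of %d total"),
                  object, count, total);
    return whole;
}

void demo_extraction()
{
    seen_msgids.clear();
    describe_stitched("Red", 3, 5);
    assert(seen_msgids.size() == 3); // three context-free fragments
    assert(seen_msgids[2] == "%s %s");

    seen_msgids.clear();
    assert(describe_whole("Red", 3, 5) == "Red thing is 3 out of 5 total");
    assert(seen_msgids.size() == 1); // one complete sentence
    assert(seen_msgids[0] == "%s thing is %d out of %d total");
}
```
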
From what I can gather there have to be two distinct formatting subsystems:

  1. {fmt} based logging (since IMO there's little benefit in translating potentially ever-changing log messages, and routing them via boost::locale would be a big hit to performance).
  2. boost::locale::format for localizable UI strings (since it has support for plural forms).

I would propose implementing them in small steps: first switch to {fmt} then deal with localizable strings.

I am unsure if boost::locale is actually going to be that useful for us going forward - it's great for static applications with static translation databases, but I have struggled to find where I can hook into its message database to add in new messages from mods.

I think it's only possible to add a completely new database in a new translation domain, and then we get into issues trying to figure out which domain we should translate a string with, without doing something like trying /all/ mod domains in order until we find a match.
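
A rough sketch of the "try every domain in order" fallback, to show why it is unattractive. None of this is boost::locale API: Catalogue, Domain and translate_any_domain are hypothetical names, and each domain is modelled as a plain map from message ID to translation.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: a domain is a name plus a msgid -> translation map.
using Catalogue = std::map<std::string, std::string>;
using Domain = std::pair<std::string, Catalogue>;

// Try each domain in load order and return the first match; fall back to
// the untranslated message ID when no domain knows the string.
std::string translate_any_domain(const std::vector<Domain> &domains,
                                 const std::string &msgid)
{
    for (const auto &domain : domains)
    {
        const auto it = domain.second.find(msgid);
        if (it != domain.second.end())
            return it->second;
    }
    return msgid;
}

void demo_domains()
{
    Catalogue base;
    base["Fire"] = "[base] Fire";
    Catalogue mod_a;
    mod_a["Fire"] = "[mod_a] Fire";
    mod_a["Plasma Lance"] = "[mod_a] Plasma Lance";

    const std::vector<Domain> domains = {Domain("base", base),
                                         Domain("mod_a", mod_a)};

    // A string only the mod knows is found after one failed lookup.
    assert(translate_any_domain(domains, "Plasma Lance") == "[mod_a] Plasma Lance");
    // A string both define resolves to whichever domain was loaded first,
    // even when the caller was the mod.
    assert(translate_any_domain(domains, "Fire") == "[base] Fire");
    assert(translate_any_domain(domains, "Unknown") == "Unknown");
}
```

The second assertion is the problem case: without knowing which domain a string belongs to, load order silently decides which translation wins, and every miss costs one lookup per loaded mod.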

So my current plan is to try to separate each of the tasks - similar to what you mentioned:

  • Use {fmt} for all logs and internal strings (it's used in some places for constructing IDs etc - something we explicitly don't want translating)
    -- This will involve significant string changes from %printf style to {fmt}, but ideally not affect the translations
  • Wire up the .pot generation automatically using xgettext
    -- This will likely highlight much of the "bad" translation work - some people seem to have run off with the OG .exe extracted strings as a "golden" translation message source, despite openapoc's strings being constructed very differently in places
  • Check the source for "bad" translation string construction (like the example I mentioned above), or strings that shouldn't be translated (logs, internal ID strings etc.) or strings that should be translated but aren't (IE just directly using ::format())
    -- This should be made easier by the previous step, as it should be obvious in the generated .pot
  • Convert all /translated/ strings from %printf style to {fmt}/boost::locale style (I think they're functionally the same as we currently don't have any plural annotations or similar)
    -- This will invalidate pretty much all current translation work - so probably not worth doing any translation updates between stages

I have pretty much all the above stages locally in various states - it probably makes sense for me to try to get them into a usable state, and now I have a bit of time this long weekend I guess that'll be my next task :)

Once that is all done, we can then start to worry about how mods would work that add new strings

Convert all /translated/ strings from %printf style to {fmt}/boost::locale style (I think they're functionally the same as we currently don't have any plural annotations or similar)

Actually they only look similar -- boost::locale features 1-based indexing and a vastly different formatting grammar. If not for plural forms I would go with {fmt} since I find its support for named arguments a good help for translators, adding valuable context, e.g.:

Delay = {seconds}
Range = {meters:2.1}m.

Though I find support for plural forms much more valuable.
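
To illustrate just the indexing difference, here is a small sketch that renumbers bare 0-based "{N}" placeholders to the 1-based form. to_one_based is a hypothetical helper, not part of either library, and it intentionally leaves named arguments and anything carrying a format specification untouched, since that is where the two grammars diverge and a mechanical rewrite would not be safe.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical migration helper: rewrites "{0}" -> "{1}", "{1}" -> "{2}", ...
// Only bare numeric placeholders are touched; "{name}" and "{0:spec}" are
// copied through unchanged because their grammar is library-specific.
std::string to_one_based(const std::string &pattern)
{
    std::string out;
    std::size_t i = 0;
    while (i < pattern.size())
    {
        if (pattern[i] == '{')
        {
            std::size_t j = i + 1;
            while (j < pattern.size() &&
                   std::isdigit(static_cast<unsigned char>(pattern[j])))
                ++j;
            const bool has_digits = j > i + 1;
            const bool closed = j < pattern.size() && pattern[j] == '}';
            if (has_digits && closed)
            {
                const unsigned long index =
                    std::stoul(pattern.substr(i + 1, j - i - 1));
                out += "{";
                out += std::to_string(index + 1);
                out += "}";
                i = j + 1;
                continue;
            }
        }
        out += pattern[i];
        ++i;
    }
    return out;
}

void demo_one_based()
{
    assert(to_one_based("{0} thing is {1} out of {2} total") ==
           "{1} thing is {2} out of {3} total");
    // Named or formatted placeholders need a human decision, so they stay.
    assert(to_one_based("Range = {meters:2.1}m.") == "Range = {meters:2.1}m.");
}
```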

Yeah, I remember the indexing throwing me to begin with :)

TBH going forward I think the boost::locale format is good to go, I'm sure we can fix the mod context issues going forward.

Or at least then when we find the "next" problem all the translated strings will be in a much more sane format and not intermingled with non-translated formatted text.

Also, honestly the entire UString class should be a set of helpers around a std::basic_string<char8_t> - but that looks like it won't be reliably around until C++20.

I'm tempted to just replace it all with a std::string (IE std::basic_string<char>), but then we'll lose type safety for "known utf8" vs "array of bytes" - but I'm not sure we're really using that anyway. We'd just have to be careful at interfaces, but we aren't careful today, just calling .cStr()/.str() without any real conversion as a 'get out'

It currently 'happens' to work as I tend to test on platforms that use utf8 in their system APIs already - but I suspect it'll currently break if we try to do something today like open a filename with a non-ASCII path on Windows
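
As a sketch of what the "known utf8" vs "array of bytes" distinction buys, here is a thin wrapper over std::string. Utf8String, from_utf8 and count_code_points are hypothetical names and not the existing UString interface; the wrapper does not validate its input, it only makes the caller state the encoding explicitly at the boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical wrapper: a std::string that the type system knows is UTF-8.
// No validation is performed; the factory name is the caller's promise.
class Utf8String
{
  public:
    static Utf8String from_utf8(std::string bytes)
    {
        return Utf8String(std::move(bytes));
    }
    const std::string &str() const { return bytes_; }

  private:
    explicit Utf8String(std::string bytes) : bytes_(std::move(bytes)) {}
    std::string bytes_;
};

// Only meaningful for UTF-8, so it accepts Utf8String and not std::string:
// counts every byte that is not a continuation byte (10xxxxxx).
std::size_t count_code_points(const Utf8String &text)
{
    std::size_t count = 0;
    for (const char c : text.str())
    {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

void demo_utf8()
{
    // "caf" followed by U+00E9, which UTF-8 encodes as the bytes C3 A9.
    const Utf8String word = Utf8String::from_utf8("caf\xC3\xA9");
    assert(word.str().size() == 5);      // bytes
    assert(count_code_points(word) == 4); // code points
}
```

Passing a raw std::string to count_code_points does not compile, which is the kind of check that is lost if everything becomes a plain std::string.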
