This is a proof of concept for multi-language text (where the language is chosen at runtime) with compile-time checks:
The complete text in all languages is embedded in the binary.
The folder "i18n" contains several csv files with four columns: a language identifier, a text identifier, the text, and (optionally) a comment.
build.rs reads and analyzes these files and generates code based on them:
The language identifiers across all csv files are collected and an enum Language with one variant for each language is created:
If the languages En, Ja, De, and Fr are present, build.rs will create the enum:
enum Language {
    De,
    En,
    Fr,
    Ja,
}
Similarly, the text identifiers are collected and turned into an enum Text with one variant per (unique) identifier.
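The enum generation described above can be sketched roughly as follows. This is a minimal illustration, not the actual build.rs: the function name generate_enum is hypothetical, and it assumes the identifiers have already been read from the csv files. In a real build script, the resulting string would typically be written to a file under OUT_DIR and pulled into the crate with include!.

```rust
use std::collections::BTreeSet;

/// Builds the source text of an enum with one variant per unique identifier.
/// (Hypothetical helper; the real build.rs may be structured differently.)
/// A BTreeSet removes duplicates and keeps the variants in sorted order.
fn generate_enum(name: &str, identifiers: &[&str]) -> String {
    let unique: BTreeSet<&str> = identifiers.iter().copied().collect();
    let mut source = format!("enum {} {{\n", name);
    for id in &unique {
        source.push_str(&format!("    {},\n", id));
    }
    source.push_str("}\n");
    source
}
```

Called with the name Language and the identifiers En, Ja, De, Fr (in any order, with or without repeats), this produces the sorted four-variant enum shown above.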
The text for each language and identifier is collected and compressed, and included in the binary. The offset of each individual text within the whole is also compressed and included in the binary.
Comments are not used and not included in the binary. They are intended to provide space for clarifications, TODO markers, etc.
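The idea of storing all texts as one block plus a table of offsets could look roughly like this. The sketch leaves out the compression step entirely, and the function name pack is an assumption made for illustration.

```rust
/// Concatenates all texts into one blob and records where each text starts.
/// (Hypothetical helper; compression is deliberately omitted here.)
/// The offsets have one more entry than `texts`, so text `i` occupies
/// the byte range `offsets[i]..offsets[i + 1]`.
fn pack(texts: &[&str]) -> (String, Vec<usize>) {
    let mut blob = String::new();
    let mut offsets = Vec::with_capacity(texts.len() + 1);
    for text in texts {
        offsets.push(blob.len());
        blob.push_str(text);
    }
    // The final entry marks the end of the last text.
    offsets.push(blob.len());
    (blob, offsets)
}
```

Offsets are byte positions, not character counts, which matters for non-ASCII text such as Japanese.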
At startup, the text and all offsets are decompressed.
After that, the Language and TextId enums can be used to look up the location of the corresponding text in the collection.
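One way such a lookup might work is to treat the enum discriminants as indices into the offset table. The following is a sketch under assumptions: the Dictionary struct, the TEXT_COUNT constant, and the language-by-language layout are all hypothetical, and the enums are cut down to two variants each.

```rust
#[derive(Clone, Copy)]
enum Language { De, En }

#[derive(Clone, Copy)]
enum TextId { Greeting, Question }

/// Number of TextId variants (assumed to be emitted by the build script).
const TEXT_COUNT: usize = 2;

/// Hypothetical dictionary: one blob plus offsets, laid out language by language.
struct Dictionary {
    blob: String,
    offsets: Vec<usize>,
}

impl Dictionary {
    /// Expects the texts in the order (De, Greeting), (De, Question), (En, Greeting), ...
    fn new(texts: &[&str]) -> Self {
        let mut blob = String::new();
        let mut offsets = vec![0];
        for text in texts {
            blob.push_str(text);
            offsets.push(blob.len());
        }
        Dictionary { blob, offsets }
    }

    /// Maps the (language, text) pair to a slot and slices the blob.
    fn get(&self, language: Language, text: TextId) -> &str {
        let index = language as usize * TEXT_COUNT + text as usize;
        &self.blob[self.offsets[index]..self.offsets[index + 1]]
    }
}
```

Because both arguments are enums generated from the csv files, a misspelled language or text identifier fails to compile instead of failing at lookup time.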
Because static strings are not enough for internationalization, some kind of runtime string interpolation is needed. Here, an extremely basic version is implemented.
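A very basic interpolation routine might look like the sketch below. The {key} placeholder syntax and the function name interpolate are assumptions made for illustration; the syntax actually used in the csv files may differ.

```rust
use std::collections::HashMap;

/// Replaces every `{key}` in `template` with the matching value.
/// (Assumed placeholder syntax; hypothetical helper.)
/// Returns an error naming the first key that has no value.
fn interpolate(template: &str, values: &HashMap<&str, &str>) -> Result<String, String> {
    let mut output = String::new();
    let mut rest = template;
    while let Some(start) = rest.find('{') {
        let end = rest[start..]
            .find('}')
            .map(|i| start + i)
            .ok_or_else(|| "unclosed placeholder".to_string())?;
        let key = &rest[start + 1..end];
        let value = values
            .get(key)
            .ok_or_else(|| format!("missing value for key `{}`", key))?;
        // Copy the literal text before the placeholder, then the value.
        output.push_str(&rest[..start]);
        output.push_str(value);
        rest = &rest[end + 1..];
    }
    output.push_str(rest);
    Ok(output)
}
```

With the value "Marvin" stored under the key "name", the template "Hallo {name}." becomes "Hallo Marvin.". A template that uses a key without a value yields an error, and that error only surfaces at runtime.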
The program itself is very simple: it prints the current date, asks for your name, and then says hello. Choose the language by passing a command-line argument.
$ cargo run ger
Heute ist der 16. 5. 2023.
Wie heißt du?
> Marvin
Hallo Marvin.
$ cargo run 日本語
今日は2023年5月16日です。
名前はなんですか。
> Zaphod
Zaphod、こんにちは。
$ cargo run en
It's 5/16/2023.
What's your name?
> Arthur
Hello Arthur.
For production use, language-specific formatting for dates and numbers (decimal separators) would be required. This could be done at a higher level by first formatting the date and then interpolating the formatted date into the string. Here it's done by specifying the date order in the .csv, but I'm not sure this approach would fly for bigger projects.
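The higher-level alternative (format the date first, then interpolate the result) might be sketched like this. The function format_date is hypothetical and the Language enum is reduced to three variants; the output formats simply mirror the sample runs above and are no substitute for real locale data.

```rust
#[derive(Clone, Copy)]
enum Language { De, En, Ja }

/// Formats a date per language before it is interpolated into a text.
/// (Hypothetical helper; formats are taken from the sample output above.)
fn format_date(language: Language, year: u32, month: u32, day: u32) -> String {
    match language {
        Language::De => format!("{}. {}. {}", day, month, year),
        Language::En => format!("{}/{}/{}", month, day, year),
        Language::Ja => format!("{}年{}月{}日", year, month, day),
    }
}
```

The formatted string would then be passed to the interpolation step as a single value, so the csv texts no longer need to know about date order.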
Runtime text interpolation (the user's name) is only checked at runtime.
It would be great to verify at compile time that the value to be interpolated is guaranteed to be present.
Unfortunately, I'm not sure how to do that.
My best idea is a macro fmt!(dictionary, language, text_id, "key1"=value1, "key2"=value2, "key3"=value3, ...) that reads the csvs and checks that every key which appears in the text text_id for any language also has a corresponding "key"=value argument.
However, this would only work if the text_id is used verbatim at compile time (which may be a reasonable assumption).
(Checking that every "key"=value argument also appears in the text for every language is probably a bad idea, but checking that every "key"=value argument appears in some language may be reasonable.)
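The check such a macro would perform can be sketched as ordinary functions. This is not a macro and does not run at compile time; it only illustrates the logic, again assuming a hypothetical {key} placeholder syntax and made-up function names.

```rust
use std::collections::BTreeSet;

/// Collects the names of all `{key}` placeholders in one text.
/// (Assumed placeholder syntax; hypothetical helper.)
fn placeholders(text: &str) -> BTreeSet<&str> {
    let mut keys = BTreeSet::new();
    let mut rest = text;
    while let Some(start) = rest.find('{') {
        match rest[start..].find('}') {
            Some(len) => {
                keys.insert(&rest[start + 1..start + len]);
                rest = &rest[start + len + 1..];
            }
            // An unclosed brace ends the scan.
            None => break,
        }
    }
    keys
}

/// Returns the keys used by any translation that are missing from `supplied`.
/// `translations` holds the text of one text_id in every language.
fn missing_keys<'a>(translations: &[&'a str], supplied: &[&str]) -> BTreeSet<&'a str> {
    translations
        .iter()
        .flat_map(|&text| placeholders(text))
        .filter(|key| !supplied.contains(key))
        .collect()
}
```

A procedural macro could run the same check while expanding fmt! and emit a compile error whenever the returned set is not empty.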