Re: i18n and shell scripts

From: Tim Kientzle <tim_at_kientzle.com>
Date: Sun, 22 Jan 2012 14:30:39 -0800
On Jan 22, 2012, at 2:05 PM, Ron McDowell wrote:

> I'm working on the new bsdconfig, and looking for some good examples of how to incorporate internationalization into the scripts.  I'm not finding much love in this area. :-(
> 
> Any pointers appreciated.


GNU gettext may not be an option here, but the documentation at

   http://www.gnu.org/software/gettext/manual/gettext.html#sh

outlines how this can be done.

The general model used by gettext and lots of other translation systems is:
   * There's a distinctive function name or source code pattern wrapping the English text strings.
   * A simple program can then identify and extract these strings.  (Simple sed or awk scripts often suffice.)
   * Translation files map English text ==> translated text.  (There are many more-or-less standard formats used for translation files; the .po format used by gettext is widely supported by translation tools.  In particular, there are a number of commercial translation management interfaces that allow free use by open-source projects and can directly upload/download po files.)
   * Translation files can be compiled into some dictionary structure that can be used efficiently at run time.
   * The "distinctive source code pattern" above is typically a function name.  At run time, that function takes the English text string from the source file, looks it up in the compiled dictionary, and emits the translated text.
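
As a rough illustration of the model above, here is a minimal from-scratch sketch in POSIX sh.  It is not gettext and not anything bsdconfig actually does: the wrapper name msg, the helper extract_msgs, and the tab-separated catalogue format are all made up for this example, and a real setup would more likely generate the catalogue from .po files.

```shell
#!/bin/sh
# Hypothetical catalogue: one "English<TAB>translation" pair per line.
# (Assumed format for this sketch; a real project would derive it from .po files.)
CATALOG=$(mktemp)
printf 'Hello\tBonjour\n' > "$CATALOG"
printf 'Goodbye\tAu revoir\n' >> "$CATALOG"

# msg TEXT: the distinctive wrapper function.  Looks TEXT up in the
# catalogue and prints the translation, or TEXT itself if none is found.
msg() {
    MSG_KEY=$1 awk -F '\t' '
        BEGIN { key = ENVIRON["MSG_KEY"] }
        # Appending "" forces a string comparison rather than a numeric one.
        ($1 "") == key { print $2; found = 1; exit }
        END { if (!found) print key }
    ' "$CATALOG"
}

# extract_msgs FILE: list the string literals passed to msg "..." in FILE.
# Deliberately simple: it only sees one double-quoted literal per line.
extract_msgs() {
    sed -n 's/.*msg "\([^"]*\)".*/\1/p' "$1" | sort -u
}

msg "Hello"    # prints "Bonjour"
```

The key is passed to awk through the environment rather than with -v so that backslashes in the English text are not interpreted as escape sequences.  Scanning a flat file on every call is slow for large catalogues, which is why real systems compile the translation files into a faster lookup structure.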

Nothing here is difficult to just build from scratch.  The complex part is the process for actually keeping track of what has and hasn't been translated across a large number of languages.  Fortunately, there are some pretty good translation management systems on the web where you can upload .po files, invite people to contribute translations and then download constructed .po files for each language.  At work, we've started using getLocalization.com, which looks pretty promising, but there are many others.

Warning:  Translating quantities (e.g., "$x bytes") is complicated (why does 'zero' take a plural in English?).  You can sometimes just avoid it ("size in bytes: $x") and sometimes get by with stilted language (e.g., "1 bytes" or "$x byte(s)").  If you require high-quality handling of phrases that include variable numbers, then you'll need something more complex than just a lookup table.
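
To make the quantity problem concrete, here is a small sketch of the "something more complex" for English only.  The function name nmsg is hypothetical, and the rule it hard-codes (singular only when the count is exactly 1) is wrong for many other languages, some of which have more than two plural forms.  gettext addresses this with ngettext and a per-language plural rule stored in the translation file.

```shell
#!/bin/sh
# nmsg SINGULAR PLURAL COUNT: choose a word form by count.
# English rule only: the singular is used when COUNT is exactly 1,
# so 0 takes the plural.  COUNT must be an integer.
nmsg() {
    if [ "$3" -eq 1 ]; then
        form=$1
    else
        form=$2
    fi
    printf '%s %s\n' "$3" "$form"
}

nmsg byte bytes 0    # prints "0 bytes"
nmsg byte bytes 1    # prints "1 byte"
nmsg byte bytes 2    # prints "2 bytes"
```

A translated version would need the chosen form to come from the catalogue and the selection rule to vary by language, which is the part a plain English ==> translation lookup table cannot express.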

Cheers,

Tim
Received on Sun Jan 22 2012 - 21:30:46 UTC