Translations / i18n / localizations for programs

From Free Pascal wiki
Revision as of 23:23, 12 August 2015 by 007 (talk | contribs) (Automatic translation: final steps)

Overview

This page is about how a program can use different strings for various languages such as English, Chinese, German, Finnish, Italian and any other language.

Quick i18n

This is intended to be a quick guide that introduces you to the world of translations and helps you get things done quickly. See also Language Codes and BidiMode.

For your information the most used languages in the world (source) are:

   * 1 - Chinese: With more than 1.2 billion native speakers in the world
   * 2 - Spanish: Spanish occupies the No. 2 spot and is spoken in approximately 30 countries.
   * 3 - English: 335 million worldwide, about 5% of the world’s population. (This does not take second-language speakers into account.)
   * 4 - Hindi: Spoken by over 260 million people.
   * 5 - Arabic: Spoken in almost 60 countries around the world.
   * 6 - Portuguese: Population of Brazil is over 200 million. The population of Portugal is just over 10 million.
   * 7 - Bengali: The main language of Bangladesh (population, 155 million) and one of India’s many official languages.
   * 8 - Russian and Japanese.

poedit

The best known tool is a program called poedit. poedit is a tool for translators. It produces both PO and MO as output.

Translating Resourcestrings

This is the way to store a resourcestring in a unit:

resourcestring
  Caption1 = 'Some text';
  HelloWorld1 = 'Hello World';

Resourcestrings are like normal string constants, which means you can assign them to any string:

Label1.Caption := HelloWorld1;
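
Because resourcestrings behave like string constants, they can also be passed to Format, which keeps the variable part of a message out of the translatable text. The following is only a minimal console sketch; the names rsGreeting and Greeting are invented for illustration and are not part of any library:

```pascal
program rsdemo;

{$mode objfpc}{$H+}

uses
  SysUtils;

resourcestring
  // %s is filled in at run time; translators keep the placeholder as-is
  rsGreeting = 'Hello, %s!';

// Builds the message from the (possibly translated) resourcestring.
function Greeting(const AName: string): string;
begin
  Result := Format(rsGreeting, [AName]);
end;

begin
  // With no translation loaded this prints the default text.
  WriteLn(Greeting('World'));
end.
```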

Compiling into .po files

Resourcestrings are compiled into .po files if you enable i18n in the Lazarus IDE. Go to Project > Project Options > i18n > Enable i18n. When you recompile your application the .po files will be updated. You may also select a directory where the .po files will be stored; po_files is recommended.

The default translation is recommended to be in English or the default language of your application, since it will be loaded if no other translation file is found.

Once you have project1.po, copy it and rename the copy to project1.es.po, or use another of the Language Codes of your choice. You then have two languages: English and Spanish. Send the new file to the translator for translation.

Your folder structure will look something like this:

project1\po_files\
project1\po_files\project1.po
project1\po_files\project1.es.po

Converting .po files to .mo files

When the translation in the .po file is finished, compile it to .mo. Because .mo is a binary format, it loads faster. To convert to .mo you can use poedit and go to File > Compile as .mo.

Automatic translation

When the .mo files are ready, put them in the locale or languages folder next to your application executable, then include the unit DefaultTranslator. That is all: the translation is done automatically.

uses
DefaultTranslator;

Distribute only the .mo files in the locale or languages directory, since .po files are useful only for making the translation and compiling it into .mo.

Your folder structure will look something like this:

project1\project1.exe
project1\locale\
project1\locale\project1.mo
project1\locale\project1.es.mo

Testing translations

When everything is ready, test whether the translations look fine in your application for each language you have. Automatic translation has a feature that lets you test each language quickly.

You must run your executable with the command line parameter --lang followed by the language code of your choice.

You will run your executable like this in order to test Spanish translation:

project1.exe --lang es

And you will see the translated application.

You can also do this from the IDE. Go to Run > Run Parameters ... and, in the input Command line parameters (without application name), write this:

--lang it

Then Run (F9) and you will see the translated application.
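
To illustrate what such a switch amounts to, here is a small console sketch that reads --lang from the command line and otherwise falls back to the LANG environment variable. It is not the code DefaultTranslator uses; the function name LangFromCommandLine is hypothetical:

```pascal
program langparam;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Returns the value that follows '--lang', or '' when the switch is absent.
function LangFromCommandLine: string;
var
  i: Integer;
begin
  Result := '';
  // Stop one short of the last parameter so ParamStr(i + 1) always exists.
  for i := 1 to ParamCount - 1 do
    if ParamStr(i) = '--lang' then
    begin
      Result := ParamStr(i + 1);
      Exit;
    end;
end;

var
  Lang: string;
begin
  Lang := LangFromCommandLine;
  if Lang = '' then
    Lang := GetEnvironmentVariable('LANG');  // fall back to the environment
  WriteLn('Requested language: ', Lang);
end.
```

Running it as langparam --lang es reports es; without the switch it reports whatever LANG contains, which may be empty on some systems.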

Final steps

To get everything translated you must include the LCL translations into your application locale folder.

Copy everything inside the folder C:\lazarus\lcl\languages to your locale folder. Then you will have the LCL translated for your application.

Additional info

Date, time and number format

Under Linux, BSD and Mac OS X there are several locales defining things like the time and date format or the thousand separator. In order to initialize the RTL with these settings, you need to include the clocale unit in the uses section of your program (.lpr file).
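
A minimal sketch of this, assuming a console program; the output depends entirely on the locale of the machine it runs on, so none is shown here:

```pascal
program localedemo;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}
  clocale,   // copies the user's locale settings into the RTL at startup
  {$ENDIF}
  SysUtils;

begin
  // Each line below is formatted according to the active locale.
  WriteLn('Date:               ', DateToStr(Date));
  WriteLn('Number:             ', Format('%n', [1234567.891]));
  WriteLn('Thousand separator: "', DefaultFormatSettings.ThousandSeparator, '"');
end.
```

Compiling the program once with and once without clocale in the uses clause is a quick way to see the difference on a Unix-like system.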

gettext

The main technology involved in the process of translations is GNU gettext. FPC comes with the gettext unit.

uses
gettext;

PO

PO – Portable Object. This is the file that you receive back from the translators. It’s a text file that includes the original texts and the translations.

MO

MO – Machine Object. The MO file includes the exact same contents as the PO file. The two files differ in their format. While a PO file is a text file and is easy for humans to read, MO files are compiled and are easy for computers to read. The unit gettext implements TMOFile and has several procedures to do the translation from .mo files, if you want to use it.

unit gettext;

...

TMOFile = class

...

  procedure GetLanguageIDs(var Lang, FallbackLang: string);
  procedure TranslateResourceStrings(AFile: TMOFile);
  procedure TranslateUnitResourceStrings(const AUnitName:string; AFile: TMOFile);
  procedure TranslateResourceStrings(const AFilename: String);
  procedure TranslateUnitResourceStrings(const AUnitName:string; const AFilename: String);
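
As a rough sketch of how these procedures can be used without the LCL, the console program below asks for the language IDs and then loads a matching .mo file. The path locale/project1.%s.mo is an assumption that follows the folder layout used earlier on this page; the %s is replaced with the language code by TranslateResourceStrings, and missing files appear to be silently ignored, so the default text is printed when no translation is found:

```pascal
program gettextdemo;

{$mode objfpc}{$H+}

uses
  gettext;

resourcestring
  rsHello = 'Hello World';

var
  Lang, FallbackLang: string;
begin
  // Ask the RTL which language the environment requests,
  // typically something like 'es_ES' with the fallback 'es'.
  GetLanguageIDs(Lang, FallbackLang);
  WriteLn('Language: ', Lang, ', fallback: ', FallbackLang);

  // Assumed path: %s is replaced with the language code.
  TranslateResourceStrings('locale/project1.%s.mo');

  // Prints the translation if one was loaded, otherwise the default text.
  WriteLn(rsHello);
end.
```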

Everything else about translations

This part covers the remaining details you will want to read if you need more advanced control over translations.

.po Files

There are many free graphical tools to edit .po files, which are simple text like the .rst files, but with some more options, like a header providing fields for author, encoding, language and date. Every FPC installation provides the tool rstconv (Windows: rstconv.exe). This tool can be used to convert a .rst file into a .po file. The IDE can do this automatically. Some free tools: kbabel, po-auto-translator, poedit, virtaal.

Virtaal has a translation memory containing source-target language pairs for items that you already translated once, and a translation suggestion function that shows already translated terms in various open source software packages. These functions may save you a lot of work and improve consistency.

Example of using rstconv directly:

rstconv -i unit1.rst -o unit1.po

Translating

For every language the .po file must be copied and translated. The LCL translation unit uses the common language codes (en=English, de=German, it=Italian, ...) to search. For example, the German translation of unit1.po would be unit1.de.po. To achieve this, copy the unit1.po file to unit1.de.po, unit1.it.po, and so on for every language you want to support; the translators can then edit their specific .po file.

Note: For Brazilian/Portuguese users: the Lazarus IDE and LCL only have a Brazilian Portuguese translation, and these files have the extension 'pt_BR.po'.

IDE options for automatic updates of .po files

  • The unit containing the resource strings must be added to the package or project.
  • You must provide a .po path, this means a separate directory. For example: create a sub directory language in the package / project directory. For projects go to the Project > Project Options. For packages go to Options > IDE integration.

When these options are enabled, the IDE generates or updates the base .po file using the information contained in the .rst and .lrt files (the rstconv tool is then not necessary). The update process begins by collecting all existing entries found in the base .po file and in the .rst and .lrt files. It then applies the following features and brings every translated .xx.po file it finds up to date.

Removal of Obsolete entries

Entries in the base .po file that are not found in .rst and .lrt files are removed. Subsequently, all entries found in translated .xx.po files not found in the base .po file are also removed. This way, .po files are not cluttered with obsolete entries and translators don't have to translate entries that are not used.

Duplicate entries

Duplicate entries occur when, for some reason, the same text is used for different resource strings. An example is the 'Gutter' string in the file lazarus/ide/lazarusidestrconsts.pas:

  dlfMouseSimpleGutterSect = 'Gutter';
  dlgMouseOptNodeGutter = 'Gutter';
  dlgGutter = 'Gutter';
  dlgAddHiAttrGroupGutter = 'Gutter';

A converted .rst file for these resource strings would look similar to this in a .po file:

#: lazarusidestrconsts.dlfmousesimpleguttersect
msgid "Gutter"
msgstr ""
#: lazarusidestrconsts.dlgaddhiattrgroupgutter
msgid "Gutter"
msgstr ""
etc.

The lines starting with "#: " are considered comments, so the tools used to translate these entries see the repeated msgid "Gutter" lines as duplicate entries and produce errors or warnings on loading or saving. Duplicate entries are a normal occurrence in .po files, and they need some context attached to them. The msgctxt keyword adds context to duplicated entries, and the automatic update tool uses the entry ID (the text after the "#: " prefix) as the context. For the previous example it would produce something like this:

#: lazarusidestrconsts.dlfmousesimpleguttersect
msgctxt "lazarusidestrconsts.dlfmousesimpleguttersect"
msgid "Gutter"
msgstr ""
#: lazarusidestrconsts.dlgaddhiattrgroupgutter
msgctxt "lazarusidestrconsts.dlgaddhiattrgroupgutter"
msgid "Gutter"
msgstr ""
etc.

On translated .xx.po files the automatic tool does one additional check: if the duplicated entry was already translated, the new entry gets the old translation, so it appears like being translated automatically.

The automatic detection of duplicates is not yet perfect. Duplicates are detected as items are added to the list, and it may happen that some untranslated entries are read first. So it may take several passes to get all duplicates automatically translated by the tool.
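
To see which texts are affected in a given file, a small scanner can list repeated msgid lines. The program below is only a sketch written for this page, not part of the IDE's update tool. It looks at single-line msgid entries only and does not check whether a msgctxt line already disambiguates them; the helper name IsSingleLineMsgId is hypothetical:

```pascal
program dupmsgid;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// True for lines such as: msgid "Gutter"
// False for 'msgid ""', which starts the header or a multi-line entry,
// and for '#| msgid ...' comments holding a previous string.
function IsSingleLineMsgId(const Line: string): Boolean;
begin
  Result := (Copy(Line, 1, 6) = 'msgid ') and (Line <> 'msgid ""');
end;

var
  Po, Seen: TStringList;
  i: Integer;
  Line: string;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: dupmsgid <file.po>');
    Halt(1);
  end;
  Po := TStringList.Create;
  Seen := TStringList.Create;
  try
    Seen.CaseSensitive := True;  // 'Gutter' and 'gutter' are different texts
    Seen.Sorted := True;         // makes IndexOf use a binary search
    Po.LoadFromFile(ParamStr(1));
    for i := 0 to Po.Count - 1 do
    begin
      Line := Trim(Po[i]);
      if not IsSingleLineMsgId(Line) then
        Continue;
      if Seen.IndexOf(Line) >= 0 then
        WriteLn('line ', i + 1, ': repeated ', Line)
      else
        Seen.Add(Line);
    end;
  finally
    Seen.Free;
    Po.Free;
  end;
end.
```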

Fuzzy entries

Changes in resource strings affect translations, for example if initially a resource string was defined like:

dlgEdColor = 'Syntax highlight';

this would produce a .po entry similar to this

#: lazarusidestrconsts.dlgedcolor
msgid "Syntax highlight"
msgstr ""

which, if translated to Spanish (this sample was taken from the Lazarus history), may result in

#: lazarusidestrconsts.dlgedcolor
msgid "Syntax highlight"
msgstr "Color"

Suppose then that at a later time, the resource string has been changed to

  dlgEdColor = 'Colors';

the resulting .po entry may become

#: lazarusidestrconsts.dlgedcolor
msgid "Colors"
msgstr ""

Note that while the ID (lazarusidestrconsts.dlgedcolor) remained the same, the string has changed from 'Syntax highlight' to 'Colors'. As the string was already translated, the old translation may not match the new meaning. Indeed, 'Colores' is probably a better translation for the new string. The automatic update tool notices this situation and produces an entry like this:

#: lazarusidestrconsts.dlgedcolor
#, fuzzy
#| msgid "Syntax highlight"
msgctxt "lazarusidestrconsts.dlgedcolor"
msgid "Colors"
msgstr "Color"

In terms of the .po file format, the "#," prefix means the entry has a flag (fuzzy), and translator programs may present a special GUI to the translator for this item. In this case, the flag means that the translation in its current state is doubtful and needs to be reviewed more carefully by the translator. The "#|" prefix shows the previous untranslated string of this entry and gives the translator a hint as to why the entry was marked as fuzzy.
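
Since the fuzzy flag is plain text, it is easy to check how much review work a translated file still needs. The following sketch counts flag lines that contain fuzzy; it is a standalone helper written for this page, not an existing tool, and the names IsFuzzyFlagLine and CountFuzzy are made up:

```pascal
program fuzzycheck;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// True for flag lines such as '#, fuzzy' or '#, fuzzy, badformat'.
function IsFuzzyFlagLine(const Line: string): Boolean;
begin
  Result := (Copy(Line, 1, 2) = '#,') and (Pos('fuzzy', Line) > 0);
end;

// Counts the entries flagged as fuzzy in an already loaded .po file.
function CountFuzzy(Lines: TStrings): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to Lines.Count - 1 do
    if IsFuzzyFlagLine(Lines[i]) then
      Inc(Result);
end;

var
  Po: TStringList;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: fuzzycheck <file.po>');
    Halt(1);
  end;
  Po := TStringList.Create;
  try
    Po.LoadFromFile(ParamStr(1));
    WriteLn(CountFuzzy(Po), ' fuzzy entries in ', ParamStr(1));
  finally
    Po.Free;
  end;
end.
```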

Translating Forms, Datamodules and Frames

When the i18n option is enabled for the project / package, the IDE automatically creates .lrt files for every form. It creates the .lrt file when a unit is saved. So, if you enable the option for the first time, you must open every form once, move it a little bit so that it is modified, and save the form. For example, if you save a form unit1.pas, the IDE creates a unit1.lrt. On compile, the IDE gathers all strings of all .lrt files and all .rst files into a single .po file (projectname.po or packagename.po) in the i18n directory.

For the forms to be actually translated at runtime, you have to assign a translator to LRSTranslator (defined in LResources) in the initialization section of one of your units:

...
uses
  ...
  LResources;
...
...
initialization
  LRSTranslator := TPoTranslator.Create('/path/to/the/po/file');

However, there is no TPoTranslator class (i.e. a class that translates using .po files) available in the LCL. If you use Lazarus 0.9.29 snapshots or later, you do not need one: simply include DefaultTranslator in the uses clause. For older versions, this is a possible implementation (partly lifted from DefaultTranslator.pas in the LCL):

unit PoTranslator;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, LResources, typinfo, Translations;

type

 { TPoTranslator }

 TPoTranslator=class(TAbstractTranslator)
 private
  FPOFile:TPOFile;
 public
  constructor Create(POFileName:string);
  destructor Destroy;override;
  procedure TranslateStringProperty(Sender:TObject; 
    const Instance: TPersistent; PropInfo: PPropInfo; var Content:string);override;
 end;

implementation

{ TPoTranslator }

constructor TPoTranslator.Create(POFileName: string);
begin
  inherited Create;
  FPOFile:=TPOFile.Create(POFileName);
end;

destructor TPoTranslator.Destroy;
begin
  FPOFile.Free;
  inherited Destroy;
end;

procedure TPoTranslator.TranslateStringProperty(Sender: TObject;
  const Instance: TPersistent; PropInfo: PPropInfo; var Content: string);
var
  s: String;
begin
  if not Assigned(FPOFile) then exit;
  if not Assigned(PropInfo) then exit;
  // Leave the text untouched while the component is open in the form designer
  // (this check may not be strictly necessary).
  if Instance is TComponent then
   if csDesigning in (Instance as TComponent).ComponentState then exit;
  if (AnsiUpperCase(PropInfo^.PropType^.Name)<>'TTRANSLATESTRING') then exit;
  s:=FPOFile.Translate(Content, Content);
  if s<>'' then Content:=s;
end;

end.

Alternatively, you can transform the .po file into .mo using msgfmt (this is not needed anymore if you use a recent 0.9.29 snapshot) and simply use the DefaultTranslator unit

...
uses
   ...
   DefaultTranslator;

which will automatically look in several standard places for a .po file (higher precedence) or a .mo file. The disadvantage is that you have to keep both the .mo files for the DefaultTranslator unit and the .po files for TranslateUnitResourceStrings. DefaultTranslator tries to detect the language automatically from the LANG environment variable (which can be overridden with the --lang command line switch), then looks in these places for the translation (LANG stands for the desired language, ext can be either po or mo):

  • <Application Directory>/<LANG>/<Application Filename>.<ext>
  • <Application Directory>/languages/<LANG>/<Application Filename>.<ext>
  • <Application Directory>/locale/<LANG>/<Application Filename>.<ext>
  • <Application Directory>/locale/LC_MESSAGES/<LANG>/<Application Filename>.<ext>

Under Unix-like systems it will also look in

  • /usr/share/locale/<LANG>/LC_MESSAGES/<Application Filename>.<ext>

If no file is found for the full language ID (e.g. "es_ES" or "es_ES.UTF-8"), it also tries the short part of the language ("es").
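
The lookup described above can be pictured with the following sketch, which builds the application-relative candidate paths and reports which of them exist. It only mirrors the list on this page and is not taken from the DefaultTranslator source. The order in which full and short language IDs and the po and mo extensions are combined is an assumption, the language value is a hypothetical example, and the names ShortLang and AddCandidates are invented:

```pascal
program searchpaths;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Reduces 'es_ES' or 'es_ES.UTF-8' to 'es'; a plain 'es' is returned unchanged.
function ShortLang(const Lang: string): string;
var
  i: Integer;
begin
  Result := Lang;
  for i := 1 to Length(Lang) do
    if Lang[i] in ['_', '.', '@'] then
    begin
      Result := Copy(Lang, 1, i - 1);
      Exit;
    end;
end;

// Appends the four application-relative locations for one language/extension.
procedure AddCandidates(List: TStrings; const AppDir, AppName, Lang, Ext: string);
var
  FileName: string;
begin
  FileName := AppName + '.' + Ext;
  List.Add(AppDir + Lang + PathDelim + FileName);
  List.Add(AppDir + 'languages' + PathDelim + Lang + PathDelim + FileName);
  List.Add(AppDir + 'locale' + PathDelim + Lang + PathDelim + FileName);
  List.Add(AppDir + 'locale' + PathDelim + 'LC_MESSAGES' + PathDelim +
    Lang + PathDelim + FileName);
end;

var
  Candidates: TStringList;
  AppDir, AppName, Lang: string;
  i: Integer;
begin
  AppDir := ExtractFilePath(ParamStr(0));   // ends with a path delimiter
  AppName := ChangeFileExt(ExtractFileName(ParamStr(0)), '');
  Lang := 'es_ES.UTF-8';                    // hypothetical example value
  Candidates := TStringList.Create;
  try
    // Assumed order: full ID before short ID, .po before .mo.
    AddCandidates(Candidates, AppDir, AppName, Lang, 'po');
    AddCandidates(Candidates, AppDir, AppName, Lang, 'mo');
    AddCandidates(Candidates, AppDir, AppName, ShortLang(Lang), 'po');
    AddCandidates(Candidates, AppDir, AppName, ShortLang(Lang), 'mo');
    for i := 0 to Candidates.Count - 1 do
      if FileExists(Candidates[i]) then
        WriteLn('found:   ', Candidates[i])
      else
        WriteLn('missing: ', Candidates[i]);
  finally
    Candidates.Free;
  end;
end.
```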

Translating at start of program

For every .po file, you must call TranslateUnitResourceStrings. The LCL po file is lclstrconsts. For example you do this in FormCreate of your MainForm:

uses
 ..., gettext, translations;

procedure TForm1.FormCreate(Sender: TObject);
var
  PODirectory, Lang, FallbackLang: String;
begin
  PODirectory := '/path/to/lazarus/lcl/languages/';
  GetLanguageIDs(Lang, FallbackLang);
  Translations.TranslateUnitResourceStrings('LCLStrConsts', PODirectory + 'lclstrconsts.%s.po', Lang, FallbackLang);

  // the following dialog now shows translated buttons:
  MessageDlg('Title', 'Text', mtInformation, [mbOk, mbCancel, mbYes], 0);
end;

Compiling po files into the executable

If you don't want to install the .po files, but want to put all files of the application into the executable, use one of the following methods.

FPC Resources (Recommended)

Normal resources are now recommended for current FPC (including all recent Lazarus versions); see Lazarus Resources.

  • Add the resources (.po files) to executable with the Lazarus IDE (Project Options > Resources) as RCDATA.
uses
  Classes, LCLType, Translations;

function Translate(Language: string): boolean;
var
  Res: TResourceStream;
  PoStringStream: TStringStream;
  PoFile: TPOFile;
begin
  Res := TResourceStream.Create(HInstance, 'project1.' + Language, RT_RCDATA);
  PoStringStream := TStringStream.Create('');
  Res.SaveToStream(PoStringStream);
  Res.Free;

  PoFile := TPOFile.Create(False);
  PoFile.ReadPOText(PoStringStream.DataString);
  PoStringStream.Free;

  Result := TranslateResourceStrings(PoFile);
  PoFile.Free;
end;

Lazarus Resources

  • Create a new unit (not a form!).
  • Convert the .po file(s) to .lrs using tools/lazres:
./lazres unit1.lrs unit1.de.po

This will create an include file unit1.lrs beginning with

LazarusResources.Add('unit1.de','PO',[
  ...
  • Add the code:
uses LResources, Translations;

resourcestring
  MyCaption = 'Caption';

function TranslateUnitResourceStrings: boolean;
var
  r: TLResource;
  POFile: TPOFile;
begin
  Result:=False;
  r:=LazarusResources.Find('unit1.de','PO');
  if r=nil then exit;  // the resource was not linked into the executable
  POFile:=TPOFile.Create(False);  //if Full=True then you can get a crash (Issue #0026021)
  try
    POFile.ReadPOText(r.Value);
    Result:=Translations.TranslateUnitResourceStrings('unit1',POFile);
  finally
    POFile.Free;
  end;
end;

initialization
  {$I unit1.lrs}
  • Call TranslateUnitResourceStrings at the beginning of the program. You can do that in the initialization section if you like.

Unfortunately this code will not compile with Lazarus 1.2.2 and earlier.

For these Lazarus versions you can use something like this:

type
  TTranslateFromResourceResult = (trSuccess, trResourceNotFound, trTranslationError);

function TranslateFromResource(AResourceName, ALanguage : String): TTranslateFromResourceResult;
var
  LRes : TLResource;
  POFile : TPOFile = nil;
  SStream : TStringStream = nil;
begin
  Result := trResourceNotFound;
  LRes := LazarusResources.Find(AResourceName + '.' + ALanguage, 'PO');
  if LRes <> nil then
  try
    SStream := TStringStream.Create(LRes.Value);
    POFile := TPoFile.Create(SStream, False);
    try
      if TranslateUnitResourceStrings(AResourceName, POFile) then Result := trSuccess
      else Result := trTranslationError;
    except
      Result := trTranslationError;
    end;
  finally
    if Assigned(SStream) then SStream.Free;
    if Assigned(POFile) then POFile.Free;
  end;
end;

Usage example:

initialization
  {$I lclstrconsts.de.lrs}
  TranslateFromResource('lclstrconsts', 'de');
end.

Cross-platform method to determine system language

The following function returns a string that represents the language of the user interface. It supports Linux, Mac OS X and Windows.

uses
  Classes, SysUtils {add additional units that may be needed by your code here}
  , gettext {provides GetLanguageIDs}
  {$IFDEF WINDOWS}
  , Windows
  {$ELSE}
  , Unix
    {$IFDEF LCLCarbon}
  , MacOSAll
    {$ENDIF}
  {$ENDIF}
  ;
function GetOSLanguage: string;
{platform-independent method to read the language of the user interface}
var
  l, fbl: string;
  {$IFDEF LCLCarbon}
  theLocaleRef: CFLocaleRef;
  locale: CFStringRef;
  buffer: StringPtr;
  bufferSize: CFIndex;
  encoding: CFStringEncoding;
  success: boolean;
  {$ENDIF}
begin
  {$IFDEF LCLCarbon}
  theLocaleRef := CFLocaleCopyCurrent;
  locale := CFLocaleGetIdentifier(theLocaleRef);
  encoding := 0;
  bufferSize := 256;
  buffer := new(StringPtr);
  success := CFStringGetPascalString(locale, buffer, bufferSize, encoding);
  if success then
    l := string(buffer^)
  else
    l := '';
  fbl := Copy(l, 1, 2);
  dispose(buffer);
  {$ELSE}
  {$IFDEF LINUX}
  fbl := Copy(GetEnvironmentVariable('LC_CTYPE'), 1, 2);
    {$ELSE}
  GetLanguageIDs(l, fbl);
    {$ENDIF}
  {$ENDIF}
  Result := fbl;
end;

Translating the IDE

Files

The .po files of the IDE are in the lazarus source directory:

  • lazarus/languages strings for the IDE
  • lazarus/lcl/languages/ strings for the LCL
  • lazarus/components/ideintf/languages/ strings for the IDE interface

Translators

  • The German translation is maintained by Swen Heinig.
  • The Finnish translation is maintained by Seppo Suurtarla
  • The Russian translation is maintained by Maxim Ganetsky
  • The French translation is maintained by Gilles Vasseur

When you want to start a new translation, ask on the mailing list whether someone is already working on it.

Please read carefully: Translating/Internationalization/Localization

See also