Whether you are building a website or a full-fledged web application, making it accessible to a wider audience often requires it to be available in different languages and locales.
Fundamental differences between most human languages make this anything but easy. The differences in grammar rules, language nuances, date formats, and more combine to make localization a unique and formidable challenge.
Consider this simple example.
Rules of pluralization in English are pretty straightforward: you can have a singular form of a word or a plural form of a word.
In other languages, though – such as Slavic languages – there are two plural forms in addition to the singular one. You may even find languages with a total of four, five, or six plural forms, such as in Slovenian, Irish, or Arabic.
The way your code is organized, and how your components and interface are designed, plays an important role in determining how easily you can localize your application.
Internationalization (i18n) of your codebase helps ensure that it can be adapted to different languages or regions with relative ease. Internationalization is usually done once, preferably at the beginning of the project, to avoid needing huge changes in the source code down the road.
Once your codebase has been internationalized, localization (l10n) becomes a matter of translating the contents of your application to a specific language/locale.
Localization needs to be performed every time a new language or region needs to be supported. Also, whenever a part of the interface (containing text) is updated, new content becomes available – which then needs to be localized (i.e., translated) to all supported locales.
In this article, we will learn how to internationalize and localize software written in PHP. We will go through the various implementation options and the different tools that are available at our disposal to ease the process.
One of the most classic tools (often taken as reference for i18n and l10n) is a Unix tool called Gettext.
Though it dates back to 1995, it remains a comprehensive tool for translating software: it is easy to get started with, yet it comes with powerful supporting tools.
Gettext is what we’ll be using in this post. We will also present Poedit, a GUI application that can be used to easily update your l10n source files without having to deal with the command line.
In big projects, you might sometimes need to separate translations because the same words convey different meanings in different contexts.
In those cases, you’ll need to split them into different “domains,” which are basically named groups of POT/PO/MO files in which the filename is the name of the translation domain.
Small and medium-sized projects usually, for simplicity, use only one domain; its name is arbitrary, but we will be using “main” for our code samples.
In Symfony projects, for example, domains are used to keep the translations for validation messages separate from the rest.
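The domain setup described above could be sketched roughly as follows. This is a minimal illustration rather than a definitive setup: the `locales` directory, the second domain name `validation`, the sample strings, and the `catalogPath()` helper are all assumptions made for the example. Only `setlocale()`, `bindtextdomain()`, `bind_textdomain_codeset()`, `textdomain()`, `_()`, and `dgettext()` are real PHP functions, and they are available only when the gettext extension is loaded.

```php
<?php
// Sketch: register two translation domains and look up a string in each.
// Assumed (hypothetical) layout: locales/<locale>/LC_MESSAGES/<domain>.mo

/** Path where Gettext looks for the compiled catalog of a locale and domain. */
function catalogPath(string $baseDir, string $locale, string $domain): string
{
    return sprintf('%s/%s/LC_MESSAGES/%s.mo', rtrim($baseDir, '/'), $locale, $domain);
}

$baseDir = __DIR__ . '/locales'; // hypothetical directory holding the catalogs
$locale  = 'pt_BR';

if (function_exists('bindtextdomain')) {
    // The locale has to be installed on the system for this call to succeed.
    setlocale(LC_ALL, $locale . '.UTF-8');

    // Tell Gettext where each domain's catalogs live and how they are encoded.
    bindtextdomain('main', $baseDir);
    bindtextdomain('validation', $baseDir); // hypothetical second domain
    bind_textdomain_codeset('main', 'UTF-8');
    bind_textdomain_codeset('validation', 'UTF-8');

    // "main" becomes the default domain used by gettext() and its alias _().
    textdomain('main');

    echo _('Welcome back!'), PHP_EOL;                                // looked up in "main"
    echo dgettext('validation', 'This field is required.'), PHP_EOL; // looked up in "validation"
}
```

If no catalog is found for a domain, the lookup functions simply return the original string, so the script still produces readable output while translations are missing.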
A locale is simply a code that identifies one version of a language. It’s defined following the ISO 639-1 and ISO 3166-1 alpha-2 specs: two lower-case letters for the language, optionally followed by an underscore and two upper-case letters identifying the country or regional code.
For rare languages that have no two-letter code, three letters are used.
For some speakers, the country part may seem redundant. However, some languages have dialects in different countries, such as Austrian German (de_AT) or Brazilian Portuguese (pt_BR). The second part is used to distinguish between those dialects; when it’s not present, the code is taken as a “generic” or “hybrid” version of the language.
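To make the locale format described above concrete, here is a small sketch that splits a locale code into its two parts. `parseLocale()` is a hypothetical helper written for this illustration. It is not part of PHP or Gettext, and it only checks the shape of the code, not whether the language or country actually exists.

```php
<?php
/**
 * Split a locale code such as "pt_BR" into its language and region parts.
 * Returns null when the code does not have the expected shape.
 * Hypothetical helper for illustration only.
 */
function parseLocale(string $code): ?array
{
    // Two or three lower-case letters, optionally followed by an
    // underscore and two upper-case letters.
    if (!preg_match('/^([a-z]{2,3})(?:_([A-Z]{2}))?$/D', $code, $m)) {
        return null;
    }

    return [
        'language' => $m[1],
        'region'   => $m[2] ?? null, // null means the "generic" version of the language
    ];
}

var_dump(parseLocale('de_AT')); // language "de", region "AT"
var_dump(parseLocale('pt'));    // language "pt", no region
```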
As we said in the introduction, different languages might sport different pluralization rules. Fortunately, Gettext takes most of this trouble off our hands.
When creating a new .po file, you’ll have to declare the pluralization rules for that language, and translated pieces that are plural-sensitive will have a different form for each of those rules.
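As an illustration of what such a declaration expresses, the sketch below re-implements two plural rules as plain PHP functions. The formulas in the comments are, to the best of our knowledge, the `Plural-Forms` values commonly used for English and for Russian-style Slavic languages; the function names are hypothetical. Gettext evaluates the rule itself, so you would never write this code in a real project.

```php
<?php
// English .po header:
//   "Plural-Forms: nplurals=2; plural=(n != 1);\n"
function englishPluralIndex(int $n): int
{
    return $n !== 1 ? 1 : 0; // 0 = singular, 1 = plural
}

// Russian-style .po header (singular plus two plural forms):
//   "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 :
//    n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
function slavicPluralIndex(int $n): int
{
    if ($n % 10 === 1 && $n % 100 !== 11) {
        return 0; // 1, 21, 31, ... but not 11
    }
    if ($n % 10 >= 2 && $n % 10 <= 4 && ($n % 100 < 10 || $n % 100 >= 20)) {
        return 1; // 2-4, 22-24, ... but not 12-14
    }
    return 2;     // everything else: 0, 5-20, 25-30, ...
}

// Print the index of the form that would be picked for a few sample counts.
foreach ([1, 2, 5, 11, 21, 22] as $n) {
    printf("%d => en:%d sl:%d\n", $n, englishPluralIndex($n), slavicPluralIndex($n));
}
```

The index returned by the rule is the number of the `msgstr[...]` entry that Gettext picks for a plural-sensitive string.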
When calling Gettext in code, you’ll have to specify a number related to the sentence (e.g. for the phrase “You have n messages.”, you will need to specify the value of n), and it will work out the correct form to use – even using string substitution if needed.
Gettext determines which rule applies based on the number provided and uses the matching localized version of the string. For every string where pluralization needs to be handled, the .po file must therefore contain a separate sentence for each plural rule defined.
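For the “You have n messages.” phrase mentioned above, a call could look like the following sketch. `ngettext()` and `sprintf()` are real PHP functions; the `messageCount()` wrapper and the Portuguese entry shown in the comment are assumptions for illustration. The fallback branch exists only so the example also runs on a PHP build without the gettext extension.

```php
<?php
// A matching (assumed) entry in a pt_BR .po file might read:
//   msgid "You have %d message."
//   msgid_plural "You have %d messages."
//   msgstr[0] "Você tem %d mensagem."
//   msgstr[1] "Você tem %d mensagens."

/**
 * Build the "You have n messages." sentence for a given count.
 * Hypothetical wrapper for illustration.
 */
function messageCount(int $n): string
{
    $singular = 'You have %d message.';
    $plural   = 'You have %d messages.';

    // ngettext() picks the right plural form for $n from the loaded catalog.
    // Without a catalog it returns the singular source string for 1 and the
    // plural source string otherwise, which is also what the fallback does.
    $template = function_exists('ngettext')
        ? ngettext($singular, $plural, $n)
        : ($n === 1 ? $singular : $plural);

    // String substitution: insert the actual number into the chosen form.
    return sprintf($template, $n);
}

echo messageCount(1), PHP_EOL;
echo messageCount(5), PHP_EOL;
```

Keeping the number as a `%d` placeholder, rather than concatenating it in code, lets translators move it to wherever their language needs it in the sentence.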
Make Your PHP App Multilingual With Gettext
Gettext is a very powerful tool for internationalizing your PHP project. Besides being flexible enough to support a large number of human languages, it supports more than 20 programming languages, so you can easily transfer what you learn using it with PHP to other languages like Python, Java, or C#.
Furthermore, Poedit can help smooth the path between code and translated strings, making the process more straightforward and easier to follow. It can also streamline shared translation efforts with its Crowdin integration.
Whenever possible, consider other languages your users might speak. This is especially important for non-English projects: you can broaden your potential audience if you release your project in English as well as in your native language.
Of course, not all projects have a need for internationalization, but it’s much easier to start i18n during a project’s infancy, even if not initially needed, than it is to do it later down the road should it subsequently become a requirement. And with tools like Gettext and Poedit, it is easier than ever.