When we think of internationalization, we usually think of translating content into various languages. But it is so much more than that. Let's look at the following topics and how to solve all of them with one library.
Number formatting
How hard can it be? Well, decimal notation alone is a point of contention, sometimes even within a country (looking at you, Canada).
[Map of decimal separators by country. Legend: light blue: dot, green: comma, teal: both, red: Arabic decimal separator (U+066B)]
The Arabic notation is especially tricky as it looks like a comma or an apostrophe but is neither. This means that screen readers might misinterpret numbers. But even the most common differences can be confusing: in a scientific paper, translating something like
... ergab eine Fehlerquote von 13,935%.
[German]
and
... showed an error rate of 13,935%.
[English]
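would be a horrible mistranslation based on the decimal separator alone: the German sentence reports roughly 13.9%, while the English one reads as almost fourteen thousand percent.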
If we add currency to the mix, things become even weirder:
$9.99 or 9,99 $?
You see where I am going with this; in order to be truly localized, we have to rely on trustworthy resources to even have a chance to get it right. Thankfully, PHP has an extension for that called Intl. But before you jump onto php.net to find out that you opened Pandora's box, worry not: there is an easier approach.
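For reference, here is a minimal sketch of what the raw Intl route looks like for the two examples above. It assumes the intl extension is enabled; the helper names formatNumber and formatPrice are made up for this illustration, and the exact output can vary slightly with the ICU version PHP was built against.

```php
<?php
// Hypothetical helpers for illustration; only NumberFormatter itself is part of Intl.

function formatNumber(float $value, string $locale): string
{
    // DECIMAL style applies the locale's decimal and grouping separators
    $formatter = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    return (string) $formatter->format($value);
}

function formatPrice(float $amount, string $currency, string $locale): string
{
    // CURRENCY style decides symbol placement and spacing based on the locale
    $formatter = new NumberFormatter($locale, NumberFormatter::CURRENCY);
    return (string) $formatter->formatCurrency($amount, $currency);
}

echo formatNumber(13.935, 'de_DE'), PHP_EOL;     // comma as decimal separator
echo formatNumber(13.935, 'en_US'), PHP_EOL;     // dot as decimal separator
echo formatPrice(9.99, 'USD', 'de_DE'), PHP_EOL; // amount first, symbol last
echo formatPrice(9.99, 'USD', 'en_US'), PHP_EOL; // symbol first
```

Note that the locale decides the notation while the currency code decides the symbol, which is why both are passed separately.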
But first, let's look at another common issue:
Date, Time & localization
I am probably not the only one with the luxury of having meetings with people from various time zones. And applications have gotten good at it: Zoom & Co invite me for the correct localized time, so I don't have to think about the pain of Europe and North America not switching to and from daylight saving time on the same day, or India having a time zone that my brain refuses to calculate because the difference can't even be expressed in full hours. As a developer, I need both, though: I need to format the date or time correctly for the user AND decide whether or not the time should be converted to the user's time zone.
An example:
My body still wasn't used to the sun at 1 am, so I had difficulties sleeping at the small hotel in Reykjavík I naively booked with a view of the northern lights in mind.
So sure, you want the international reader to "experience" 1 am in a suitable format, but not "translate" the time to the time zone in which the reader resides. On the other hand, if you take this example:
The event will take place virtually on Wednesday, August 20th at 11 pm PDT.
You most certainly will want to make sure that this information is accurately transcribed into whatever that actually means for the reader. As you can see, full control over the behavior is necessary.
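The difference between the two cases can be sketched with PHP's built-in date classes alone. This is only an illustration: the year 2025 is an assumption (picked because August 20th falls on a Wednesday that year), and the two target time zones are arbitrary examples.

```php
<?php
// The event from the example above: 11 pm on the US west coast (PDT in August).
$event = new DateTimeImmutable('2025-08-20 23:00', new DateTimeZone('America/Los_Angeles'));

// Case 1 (the hotel story): keep the wall-clock time as written, only the format changes.
$asWritten = $event->format('Y-m-d H:i T');

// Case 2 (the virtual event): same instant, expressed in the reader's time zone.
$forBerlin  = $event->setTimezone(new DateTimeZone('Europe/Berlin'))->format('Y-m-d H:i T');
$forKolkata = $event->setTimezone(new DateTimeZone('Asia/Kolkata'))->format('Y-m-d H:i T');

echo $asWritten, PHP_EOL;  // 2025-08-20 23:00 PDT
echo $forBerlin, PHP_EOL;  // 2025-08-21 08:00 CEST
echo $forKolkata, PHP_EOL; // 2025-08-21 11:30 IST
```

For the reader in Berlin or Kolkata the event is not only at a different hour but on a different day, which is exactly the kind of detail a pure format change would get wrong.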
Good ol' text translation
So far, we have talked about everything but actual translations. As someone fluent in more than one language, I don't believe in the word "translation" to begin with (it implies that meaning could be 100% transferable between languages). On a linguistic meta level, this can be shown with the following examples:
German | English |
---|---|
0 Autos, 1 Auto, 2 Autos | 0 cars, 1 car, 2 cars |
0 Bakterien, 1 Bakterium, 2 Bakterien | 0 bacteria, 1 bacterium, 2 bacteria |
0 Informationen, 1 Information, 2 Informationen | information, information, information |
Let's look at row 1: in English, the singular seems to be used only when we are talking about exactly ONE (and zero takes the plural?!). Compared with German, this seems to hold for other languages as well, so our logic has to account for that.
In row 2, we are reminded that irregular plurals are a thing in many languages, but you really start to get an insight into the nightmare in row 3, where we have to face the reality that things might be countable in some languages but not in others. And don't forget, these languages are rather similar in comparison. Let's be a little mean:
Japanese | English |
---|---|
一枚 | one (e.g. when counting paper) |
一本 | one (e.g. when counting pencils) |
一頭 | one (e.g. when counting elephants) |
一杯 | one (e.g. when counting cups) |
... | ... |
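The plural logic from the German/English table can be sketched with Intl's MessageFormatter, which applies each locale's plural rules to an ICU pattern. This assumes the intl extension is enabled; the pluralize helper and the patterns are made up for this illustration. The Japanese pattern only shows that there is a single plural category; choosing the right counter for the noun still has to happen in the translation itself.

```php
<?php
// ICU plural patterns: "#" is replaced by the formatted number.
$patterns = [
    'en_US' => '{0, plural, one{# car} other{# cars}}',
    'de_DE' => '{0, plural, one{# Auto} other{# Autos}}',
    // Japanese only knows the "other" category; 枚 is the counter for flat objects
    'ja_JP' => '{0, plural, other{#枚}}',
];

// Hypothetical helper: picks the plural form according to the locale's rules.
function pluralize(string $locale, string $pattern, int $count): string
{
    return (string) MessageFormatter::formatMessage($locale, $pattern, [$count]);
}

foreach ($patterns as $locale => $pattern) {
    $forms = [];
    foreach ([0, 1, 2] as $count) {
        $forms[] = pluralize($locale, $pattern, $count);
    }
    echo $locale, ': ', implode(', ', $forms), PHP_EOL;
}
// en_US: 0 cars, 1 car, 2 cars
```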
Practicality
Lastly, we have to ask ourselves how much time we can invest in solving these problems. Many companies resort to AI-based translations that are either rather unreliable (like Google Translate) or extremely expensive. To mitigate this issue, and since PHP's Intl offers the toolset to address these problems, I wrote an i18n library that lets you write server-side rendered HTML with extended markup (a template engine).
sroehrl / php-i18n-translate
Simple yet powerful i18n support for PHP
PHP i18n translate
Straight forward. Convenient. Fast.
Installation
composer require sroehrl/php-i18n-translate
require_once 'vendor/autoload.php';
$i18n = new I18nTranslate\Translate();
$i18n->setTranslations('de', [
'hello' => 'hallo',
'goose' => ['Gans', 'Gänse']
]);
Quick start:
1. In Code
echo "a: " . $i18n->t('hello') . "<br>";
echo "b: " . $i18n->t('goose') . "<br>";
echo "c: " . $i18n->t('goose.plural') . "<br>";
// detect plural by numeric value
foreach([0,1,2] as $number){
echo $number . " " . $i18n->t('goose', $number) . ", ";
}
Outputs:
a: hallo <br>
b: Gans <br>
c: Gänse <br>
… It addresses our issues like this:
// little pseudo code, but you get it:
$productModelData = SomeORM::getProduct($_GET['productId']);
$t = new I18nTranslate\Translate($userLocale, $userTimezone);
$t = setTranslationsAndSomeSettings($t);
// I already apologize for the formatting
echo $t->translate(
Neoan3\Apps\Template\Template::embraceFromFile(
'/theHTMLbelowThisCode.html',
['product' => $productModelData]
)
);
<!-- A simple static translation using a t-tag -->
<h1><t>Welcome to our page!</t></h1>
<!-- B simple static translation using a template function -->
<h2>{{t('Welcome to our page')}}</h2>
<article>
<!-- C There's a lot going on here, we'll break it down later -->
<p><t>Check out [%product-name%](%{{product.title}}%)</t></p>
<!-- D show USD price, but show in user's format -->
<div class="price" i18n-currency="USD">
{{product.price}}
</div>
<div class="special offer">
<p>
<t>Convincing text to make you buy in the next:</t>
<!-- E Yes, this works -->
<span i18n-time="m">+2 min</span> <br>
<!-- F Prints something like: Wednesday, 12.10.2022 10:30 Eastern Daylight Time -->
<span i18n-date-local="EEEE, dd.MM.y HH:mm zzzz">
{{product.release}}
</span>
</p>
</div>
<div class="very-special-offer">
<p>
<!-- G for "gettin' very dynamic here" -->
<t>Buy [%number%](% {{product.offerCount}} %) for only {{i18n-currency(product.discountedPrice * product.offerCount, 'USD')}} today!</t>
</p>
</div>
</article>
So I guess a little explanation wouldn't hurt?
As you noticed, I started each comment with a letter for our reference:
A - T-tag
In most cases, this will be enough. This tag runs after all other substitutions unless you additionally give the tag itself i18n-attributes. The outcome is a substitution of its content with the corresponding translation:
//...
$translations['de'] = [
'Welcome to our page!' => 'Willkommen auf unserer Seite!',
//...
];
//...
B - The T-function
The template engine uses curly brackets by default to evaluate content (You see this whenever we use data from "product").
Sometimes you want translations to be run earlier in the process to control interactivity between the data-context and the translation context. This is especially useful if you have custom functions and/or attributes for the templating. As this would reach beyond the scope of this article, I will have to leave examples to your imagination.
C - Placeholders
Sometimes (or quite often, depending on your project), you need to insert dynamic content into your translations. We best explore this by looking into the cycle of what happens under the hood:
//...
$translations['de'] = [
//...
'Check out [%product-name%]' => 'Erfahre mehr über [%product-name%]',
//...
];
//...
In our translations, we indicate dynamic values WITHIN the t-tag in order to account for placement differences or grammatical adjustments within languages. In our HTML-template, we then bind these placeholders to a value. As values could come from various sources, but we want to use the context of our product, we use curly brackets to indicate that we want to resolve the variable prior to substitution:
<t>Check out [%product-name%](% {{ product.title }} %)</t>
1. Find and interpret context-data:
<t>Check out [%product-name%](%Scrapbook #2%)</t>
2. Find and interpret functions
(not happening in this example)
3. Sanitize string & search for translation
// pseudo code to illustrate the principle
$memorize = "Scrapbook #2";
$lookFor = "Check out [%product-name%]";
$foundTranslation = 'Erfahre mehr über [%product-name%]';
4. Replace placeholder with value
// pseudo code to illustrate the principle
$final = str_replace('[%product-name%]', $memorize, $foundTranslation);
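Steps 3 and 4 can be condensed into one small, runnable function. This is a sketch of the principle only, not the library's actual implementation: the function name, the regular expression and the fallback to the untranslated key are all assumptions made for illustration.

```php
<?php
// Hypothetical illustration of the placeholder cycle; not taken from php-i18n-translate.
function translateWithPlaceholders(string $text, array $translations): string
{
    $values = [];

    // Step 3: strip every "(%value%)" binding and remember the value per placeholder
    $key = preg_replace_callback(
        '/\[%([\w-]+)%\]\(%\s*(.*?)\s*%\)/u',
        function (array $match) use (&$values): string {
            $placeholder = '[%' . $match[1] . '%]';
            $values[$placeholder] = $match[2];
            return $placeholder;
        },
        $text
    );

    // Look up the sanitized string; fall back to the original wording if nothing is found
    $translated = $translations[$key] ?? $key;

    // Step 4: put the remembered values back in, wherever the translation placed them
    return strtr($translated, $values);
}

$de = ['Check out [%product-name%]' => 'Erfahre mehr über [%product-name%]'];
$result = translateWithPlaceholders('Check out [%product-name%](%Scrapbook #2%)', $de);

echo $result, PHP_EOL; // Erfahre mehr über Scrapbook #2
```

Because the value is only re-inserted after the lookup, the translation is free to move the placeholder to wherever the target language's grammar needs it.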
D - Attributes
We have several attributes at our disposal. Attributes are an easy way to hook static or dynamic values into i18n. In this case, we simply format the price to the user's expectations.
Pro-tip: i18n-translate uses the template engine's addCustomAttribute-method to achieve this. This means you can create your own attributes according to the needs of your shop/community etc.
E - Another attribute
Some attributes need a value, others don't. In the case of i18n-time, both the format (here minutes without leading zero) and the value are optional. If the value isn't a timestamp, the attribute uses PHP's strtotime, making this example (two minutes from now) possible.
Shows the current server time (but only minutes):
<!-- without content -->
<span i18n-time="m"></span>
Shows the current server time in the time format suited for the user:
<!-- without content or format -->
<span i18n-time></span>
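To make the value handling more tangible, here is a rough sketch of how such a value could be resolved and formatted. The resolveTimestamp helper is hypothetical and only mirrors the behavior described above (empty means now, digits mean timestamp, everything else goes through strtotime); it is not the library's code. The fixed $now and the UTC time zone are assumptions to keep the example reproducible, and the intl extension is required.

```php
<?php
// Hypothetical helper mirroring the described behavior of the attribute's value.
function resolveTimestamp(string $value, int $now): int
{
    $value = trim($value);
    if ($value === '') {
        return $now;            // no content: use the current time
    }
    if (ctype_digit($value)) {
        return (int) $value;    // already a Unix timestamp
    }
    $parsed = strtotime($value, $now); // relative or textual input such as "+2 min"
    if ($parsed === false) {
        throw new InvalidArgumentException("Cannot parse time value: $value");
    }
    return $parsed;
}

$now = 1665585000; // 2022-10-12 14:30:00 UTC, fixed for reproducibility
$inTwoMinutes = resolveTimestamp('+2 min', $now);

// ICU pattern "m": minutes without leading zero
$minutes = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::NONE,
    IntlDateFormatter::NONE,
    'UTC',
    IntlDateFormatter::GREGORIAN,
    'm'
);

echo $minutes->format($inTwoMinutes), PHP_EOL; // 32
```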
F - Make it local
As mentioned, we sometimes want to translate a time to the time zone of the user. That's why i18n-time-local and i18n-date-local output the given value in the user's time zone & format.
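With plain Intl, the same effect comes from handing the user's time zone and an ICU pattern to IntlDateFormatter. The sketch below is an assumption about what happens conceptually, not the library's code; formatLocal, the timestamp and the locales are made up for illustration. It uses a lowercase y for the calendar year, because an uppercase Y in ICU patterns denotes the week-based year, which can differ around New Year.

```php
<?php
// Hypothetical helper: format one instant for a given locale and time zone.
function formatLocal(int $timestamp, string $locale, string $timezone, string $pattern): string
{
    $formatter = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::NONE,
        IntlDateFormatter::NONE,
        $timezone,                     // the formatter's time zone decides the output
        IntlDateFormatter::GREGORIAN,
        $pattern
    );
    return (string) $formatter->format($timestamp);
}

$release = 1665585000; // 2022-10-12 14:30:00 UTC
$pattern = 'EEEE, dd.MM.y HH:mm zzzz';

// Same instant, two readers:
echo formatLocal($release, 'en_US', 'America/New_York', $pattern), PHP_EOL;
echo formatLocal($release, 'de_DE', 'Europe/Berlin', $pattern), PHP_EOL;
```

The first line should read along the lines of "Wednesday, 12.10.2022 10:30 Eastern Daylight Time"; the second shows the same moment at 16:30 with German weekday and time zone names.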
G - Functions to prerender placeholders
All provided attributes are also available as functions. With the exception of the t-function, they translate themselves into placeholders:
- i18n-currency -> [%currency-value%]
- i18n-time, i18n-time-local -> [%time-value%]
- i18n-date, i18n-date-local -> [%date-value%]
- i18n-number -> [%number-value%]
This means that our example would have the following translation:
//...
$translations['de'] = [
//...
'Buy [%number%] for only [%currency-value%] today!' => 'Für nur [%currency-value%] erhältst du [%number%] Stück!',
//...
];
//...
Accessibility
Lastly, let's talk about screen-readers & co.
Using a server-side rendered solution is already a good start, but what else is there to consider?
The semantic side
In one of our examples we discussed using the t-function rather than the t-tag. This can also be useful if you don't want a semantic element to have a child (the t-tag). In other cases, you might want to leverage the existence of a tag to supply ARIA declarations to your content. Since it is a tag, you can supply all regular attributes as you wish:
<t class="when-visually-impaired:text-xxl" role="note" data-id="x">Translate me!</t>
Extendability
We can extend the comfort using the functionality bundled with this package.
Let's say we have a JavaScript text-to-speech program we want to work with called "js-read-me" which reads from attributes:
Neoan3\Apps\Template\Constants::addCustomAttribute('js-read-me', function($domAttr, $contextData) use ($t){
$domAttr->nodeValue = $t->t($domAttr->nodeValue);
});
<button js-read-me="Listen to my rant by clicking this button">😤</button>
We have now ensured that the content of "js-read-me" uses our translation!
Final thoughts
There is so much more we can achieve, but I mustn't overextend the time you spent on this already. I hope this serves as an inspiration and helps explain this complex yet important topic.
Thank you for making it this far & until next time!