ICU syntax crash course

This library analyzes and compiles translations authored in the ICU message syntax. The ICU message syntax is an independent project, so this page is only an accelerated course on why it is useful and how to use its main features.

Why use the ICU message syntax?

ICU stands for International Components for Unicode. While its popularity began in C/C++ and Java, it is in the JavaScript ecosystem where it has become the de facto standard for internationalization, although it is also popular in Python and PHP.

Internationalizing apps is a whole lot more than just mapping some keys to the appropriate translated string in a dictionary. Properly internationalized apps must handle all aspects of translation, including the way dates and times are formatted, which delimiters separate the thousands and the decimals in numbers, how currencies are displayed, and how gendered languages are supported.

Even something as simple as plurals can get very complex depending on the language. English, German and Spanish have a singular and a plural form, but some Slavic languages have 3 forms, and other languages like Arabic have 6, depending on the number of items being pluralized. Sometimes the threshold where we have to change from one plural form to the next varies depending on the regional variant.
English doesn't have many gendered words, but French and Italian do, and the adjectives must match the noun's gender. Formatting 123456789 in the US English variant results in 123,456,789, but in the Indian variant it results in 12,34,56,789.
When formatting currencies, the $ symbol goes before the amount in US English, but the € symbol goes after the amount in locales such as German.
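
The differences described above can be observed with the standard Intl API that ships with JavaScript runtimes. The snippet below is only an illustrative sketch: it does not use this library, and the variable names are made up for the example.

```typescript
// Grouping separators differ between regional variants of the same language.
const usNumber: string = new Intl.NumberFormat("en-US").format(123456789);
const indianNumber: string = new Intl.NumberFormat("en-IN").format(123456789);
console.log(usNumber); // 123,456,789
console.log(indianNumber); // 12,34,56,789

// The position of the currency symbol depends on the locale.
const dollars: string = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
}).format(1234.5);
const euros: string = new Intl.NumberFormat("de-DE", {
  style: "currency",
  currency: "EUR",
}).format(1234.5);
console.log(dollars); // $1,234.50
console.log(euros); // the € symbol is printed after the amount

// The number of plural forms depends on the language.
const englishForms: string[] = new Intl.PluralRules("en").resolvedOptions().pluralCategories;
const arabicForms: string[] = new Intl.PluralRules("ar").resolvedOptions().pluralCategories;
console.log(englishForms.length); // 2
console.log(arabicForms.length); // 6
```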

The ICU syntax abstracts all this complexity away from developers and gives professional translators a meta language expressive enough to handle all the subtleties on their side.

Interpolations

ICU messages support interpolating values. The values are sanitized, so passing undefined does not render the literal text "undefined".

Entry: Your favorite color is {chosen}
Output: Your favorite color is orange
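
To make the idea concrete, here is a minimal sketch of what interpolation does conceptually. The interpolate function and the MessageValues type are hypothetical and are not this library's API. Rendering a missing value as an empty string is also an assumption, since the text above only states that it is not rendered as "undefined".

```typescript
type MessageValues = { [key: string]: string | number | null | undefined };

// Hypothetical helper: replaces simple {argument} placeholders with values.
function interpolate(message: string, values: MessageValues): string {
  return message.replace(/\{(\w+)\}/g, (_match: string, key: string): string => {
    const value = values[key];
    // Assumption: missing values are rendered as an empty string.
    return value === undefined || value === null ? "" : String(value);
  });
}

const withColor: string = interpolate("Your favorite color is {chosen}", { chosen: "orange" });
const withoutColor: string = interpolate("Your favorite color is {chosen}", { chosen: undefined });
console.log(withColor); // Your favorite color is orange
console.log(withoutColor.indexOf("undefined")); // -1
```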

Plurals

The second most used feature in any app is pluralization. The ICU syntax has a dedicated plural helper to define plural translations that range from very simple to quite complicated, all within the translation itself.

Each path for the plural is prefixed with the numeric qualifier. The possible qualifiers are:
  • zero
  • one (singular)
  • two (dual)
  • few (paucal)
  • many (also used for fractions)
  • other (the general plural form, and the only one used in languages with a single plural form)
Let's see some examples first:

Entry: You have {numCats, plural, one {one cat} other {# cats}}
Output: You have 0 cats

Entry: You have {numCats, plural,
  =0 {no cats at all}
  =1 {one single cat}
  =2 {a couple cats}
  =3 {a trio of cats}
  =12 {a dozen cats}
  other {exactly # cats}}
Output: You have no cats at all

Entry: Mary {guestCount, plural, offset:1
  =0 {does not give a party.}
  =1 {invites {guest} to her party.}
  =2 {invites {guest} and one other person to her party.}
  other {invites {guest} and # other people to her party.}}
Output: Mary does not give a party.

Some languages, like English, only use one and other, while other languages make use of more of the plural forms listed above. The threshold that separates few from many varies by language and region.

You can also specify translations for exact values using =N. When a number is specified that way, that translation supersedes the language's default behavior.
For instance, in English you could use =2 or =12 to have different translations specifically for a couple and a dozen instead of using the generic plural.

Lastly, plurals can also make use of the hash sign (#) to print the value being pluralized as a number. Optionally, the helper can receive an offset that is subtracted from the value before it is printed in place of the #.
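
The selection rules above can be sketched with the standard Intl.PluralRules API. The pluralize helper and the PluralBranches type are hypothetical and only illustrate the order of the checks; they are not how this library is implemented.

```typescript
type PluralBranches = { [selector: string]: string };

// Hypothetical helper: picks a branch and fills in the "#" placeholder.
function pluralize(
  locale: string,
  value: number,
  branches: PluralBranches,
  offset: number = 0
): string {
  const shown: number = value - offset;
  // 1. Exact matches (=N) are compared with the original value and win.
  const exact: string | undefined = branches["=" + value];
  // 2. Otherwise the plural category of the offset-adjusted value is used,
  //    falling back to "other" when the category has no branch.
  const category: string = new Intl.PluralRules(locale).select(shown);
  const byCategory: string | undefined = branches[category];
  const branch: string =
    exact !== undefined ? exact : byCategory !== undefined ? byCategory : branches["other"];
  // 3. "#" prints the offset-adjusted value formatted for the locale.
  return branch.replace(/#/g, new Intl.NumberFormat(locale).format(shown));
}

const noCats: string = pluralize("en", 0, { one: "one cat", other: "# cats" });
const oneCat: string = pluralize("en", 1, { one: "one cat", other: "# cats" });
const dozenCats: string = pluralize("en", 12, { "=12": "a dozen cats", other: "exactly # cats" });
const otherGuests: string = pluralize("en", 3, { "=0": "nobody", other: "# other people" }, 1);
console.log(noCats); // 0 cats
console.log(oneCat); // one cat
console.log(dozenCats); // a dozen cats
console.log(otherGuests); // 2 other people

// Languages with more plural forms select other categories.
const russianThree: string = new Intl.PluralRules("ru").select(3);
const russianFive: string = new Intl.PluralRules("ru").select(5);
console.log(russianThree); // few
console.log(russianFive); // many
```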

Select

The select helper is used to choose among several translation paths depending on an argument.
While it has many possible uses the most common one is for having gendered translations.

Entry: Your {childGender, select, male {son} female {daughter} other {child}} has won an award
Output: Your son has won an award
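
Conceptually, select behaves like a lookup with a fallback to the other branch. The sketch below uses a hypothetical selectBranch helper to show that behavior; it is not this library's API.

```typescript
type SelectBranches = { other: string; [key: string]: string };

// Hypothetical helper: returns the branch matching the argument,
// or the "other" branch when there is no match.
function selectBranch(value: string | undefined, branches: SelectBranches): string {
  if (value !== undefined && Object.prototype.hasOwnProperty.call(branches, value)) {
    return branches[value];
  }
  return branches.other;
}

const childBranches: SelectBranches = { male: "son", female: "daughter", other: "child" };
const sonSentence: string = "Your " + selectBranch("male", childBranches) + " has won an award";
const childSentence: string = "Your " + selectBranch(undefined, childBranches) + " has won an award";
console.log(sonSentence); // Your son has won an award
console.log(childSentence); // Your child has won an award
```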

Dates

This helper is used to format dates according to the current locale, using one of the default formats or the custom ones you added when configuring the app.
The default formats are:
  • short: The most compact date representation
  • medium: Abbreviated textual representation
  • long: Long textual representation
  • full: The most verbose and complete date
Entry: Your next holidays start on {holidayStart, date, full}
Output: Your next holidays start on Thursday, October 10, 2024
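
The four default formats correspond closely to the dateStyle option of the standard Intl.DateTimeFormat API. The sketch below assumes that correspondence and uses a hypothetical formatDate helper; the date value and the fixed UTC time zone are chosen only to make the example reproducible.

```typescript
// Month index 9 is October.
const holidayStart: Date = new Date(Date.UTC(2024, 9, 10));

// Hypothetical helper: formats a date with one of the four default styles.
function formatDate(
  date: Date,
  style: "short" | "medium" | "long" | "full",
  locale: string = "en-US"
): string {
  return new Intl.DateTimeFormat(locale, { dateStyle: style, timeZone: "UTC" }).format(date);
}

const shortDate: string = formatDate(holidayStart, "short");
const mediumDate: string = formatDate(holidayStart, "medium");
const longDate: string = formatDate(holidayStart, "long");
const fullDate: string = formatDate(holidayStart, "full");
console.log(shortDate); // 10/10/24
console.log(mediumDate); // Oct 10, 2024
console.log(longDate); // October 10, 2024
console.log(fullDate); // Thursday, October 10, 2024
```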

Times

This helper works just like the date helper, but it formats only the time part of a date.

Entry: Your doctor's appointment is today at {appointment, time, short}
Output: Your doctor's appointment is today at 4:30 AM
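
As with dates, the short time format is comparable to the timeStyle option of Intl.DateTimeFormat. This is a sketch under that assumption; the appointment value and the UTC time zone are made up for the example.

```typescript
const appointment: Date = new Date(Date.UTC(2024, 9, 10, 4, 30));

const usTime: string = new Intl.DateTimeFormat("en-US", {
  timeStyle: "short",
  timeZone: "UTC",
}).format(appointment);
const germanTime: string = new Intl.DateTimeFormat("de-DE", {
  timeStyle: "short",
  timeZone: "UTC",
}).format(appointment);

// Depending on the runtime, the space before "AM" may be a narrow
// no-break space rather than a regular space.
console.log(usTime); // 4:30 AM
console.log(germanTime); // 04:30
```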

Numbers

Formats a number according to the rules of the current locale.

Entry: Your account balance is {num, number}
Output: Your account balance is 0

There's also an advanced feature called Number Skeletons that allows you to customize in great detail how you want your numbers formatted.

Entry: Your account balance is {num, number, ::currency/EUR}
Output: Your account balance is €0.00

Entry: Your account balance is {num, number, ::currency/EUR sign-always}
Output: Your account balance is +€0.00

Entry: Game progress {num, number, ::percent}
Output: Game progress 0%

Entry: Game progress {num, number, ::percent .00}
Output: Game progress 0.00%

Entry: Game progress {num, number, ::percent .00 scale/100}
Output: Game progress 0.00%

Entry: Your destination is {num, number, ::unit/meter} away
Output: Your destination is 0 m away

Entry: Your destination is {num, number, ::unit/meter unit-width-full-name} away
Output: Your destination is 0 meters away

Entry: Are you sure you want to bid {num, number, ::K} over asking?
Output: Are you sure you want to bid 5K over asking?

Entry: Are you sure you want to bid {num, number, ::KK} over asking?
Output: Are you sure you want to bid 5 thousand over asking?

Entry: The chances of winning the lottery are 1 in {num, number, ::scientific/*ee}
Output: The chances of winning the lottery are 1 in 1.235E8
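
Several of the skeletons above have close counterparts among the options of the standard Intl.NumberFormat API. The mapping below is an assumed correspondence written for illustration; the skeletonOptions table and the formatWithSkeleton helper are hypothetical, and the input values 5000 and 123456789 are guesses chosen because they reproduce the outputs shown above.

```typescript
// Assumed correspondence between a few skeletons and Intl.NumberFormat options.
const skeletonOptions: { [skeleton: string]: Intl.NumberFormatOptions } = {
  "currency/EUR": { style: "currency", currency: "EUR" },
  "currency/EUR sign-always": { style: "currency", currency: "EUR", signDisplay: "always" },
  "unit/meter": { style: "unit", unit: "meter" },
  "unit/meter unit-width-full-name": { style: "unit", unit: "meter", unitDisplay: "long" },
  "K": { notation: "compact", compactDisplay: "short" },
  "KK": { notation: "compact", compactDisplay: "long" },
  "scientific": { notation: "scientific" },
};

// Hypothetical helper: formats a value with the options mapped to a skeleton.
function formatWithSkeleton(skeleton: string, value: number, locale: string = "en-US"): string {
  return new Intl.NumberFormat(locale, skeletonOptions[skeleton]).format(value);
}

console.log(formatWithSkeleton("currency/EUR", 0)); // €0.00
console.log(formatWithSkeleton("currency/EUR sign-always", 0)); // +€0.00
console.log(formatWithSkeleton("unit/meter", 0)); // 0 m
console.log(formatWithSkeleton("unit/meter unit-width-full-name", 0)); // 0 meters
console.log(formatWithSkeleton("K", 5000)); // 5K
console.log(formatWithSkeleton("KK", 5000)); // 5 thousand
console.log(formatWithSkeleton("scientific", 123456789)); // 1.235E8
```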

Number skeletons support many more options than the ones shown here.