L10n:Locale Codes
After recent discussions, MLP staff decided to use an extended scheme for locale names that follows RFC 5646 ("language tags").
This means that in addition to the previous style of ab-CD locale names, we also support simple language-only names, and even extended names for dialects.
The basic form of our new locale identifiers is <language>-<region>-<dialect>, where the region and dialect parts are optional.
Every language that does not differ between regions should use the ISO 639-1 or ISO 639-2 (2-letter or 3-letter) language code alone ("de", "eo", "pl", "cs", etc.). Languages where the region does matter should include it as a 2-letter uppercase ISO 3166 code; these locale strings look like the ones we have used until now: "es-ES", "es-AR", "pt-PT", "en-US". In some rare cases, we might need a dialect as a third part (a 3- to 8-letter, basically freeform part). We can currently imagine two such cases:
1. There is no ISO 639-2 code for a language that wants to do a localization (e.g. the Venetian Firefox team). In this case, we can use the generic ISO 639-2 identifier for the language family (Romance: roa) as the language code, and add an identifier for the specific language as the dialect (if one exists, we prefer the 3-letter SIL code). For Venetian, this gives "roa-IT-vec".
2. We have a real dialect, e.g. a Bavarian L10n, which would get "de-DE-bavarian" or something similar (the dialect identifier has to be at least 3 and at most 8 characters).
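As a rough illustration of how the three parts fit together, here is a minimal Python sketch. The function name make_locale_code is hypothetical and not part of any Mozilla tool. It also assumes that a dialect part only ever follows a region part, as in both examples above.

```python
def make_locale_code(language, region=None, dialect=None):
    """Join the parts as <language>-<region>-<dialect>.

    Hypothetical helper for illustration only. Region and dialect are
    optional; this sketch assumes a dialect is only used after a region.
    """
    if dialect and not region:
        raise ValueError("this sketch expects a region before a dialect")
    parts = [language.lower()]        # language codes are lowercase
    if region:
        parts.append(region.upper())  # region codes are uppercase
    if dialect:
        parts.append(dialect)         # SIL code or dialect name, kept as given
    return "-".join(parts)


print(make_locale_code("de"))                # language only
print(make_locale_code("es", "AR"))          # language and region
print(make_locale_code("roa", "IT", "vec"))  # family, region, SIL code
```

The helper only joins the parts; it does not check that the codes actually exist in ISO 639, ISO 3166, or the SIL list.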
To summarize, these schemes are supported:
- ab
- ab-XY
- abc-XY
- abc-XY-SIL
- abc-XY-dialect
where:
- ab/abc - from ISO 639-1/639-2
- XY - from ISO 3166
- SIL - from the SIL list
- dialect - following the rules from RFC 5646 (5 to 8 letters; or a number followed by three letters or numbers)
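The supported schemes can be approximated with a regular expression. The following Python sketch is only a shape check under stated assumptions: the names LOCALE_RE and parse_locale are hypothetical, a 3-letter language code on its own is accepted, the third part is treated as either a 3-letter SIL code or a dialect as described in the list, and nothing is looked up in the actual ISO or SIL registries.

```python
import re

# Hypothetical pattern for illustration: <language>[-<region>[-<dialect>]]
LOCALE_RE = re.compile(
    r"(?P<language>[a-z]{2,3})"        # ab / abc
    r"(?:-(?P<region>[A-Z]{2})"        # XY
    r"(?:-(?P<dialect>"
    r"[A-Za-z]{3}"                     # 3-letter SIL code
    r"|[A-Za-z]{5,8}"                  # dialect: 5 to 8 letters
    r"|[0-9][A-Za-z0-9]{3}"            # dialect: digit plus three letters or digits
    r"))?)?"
)


def parse_locale(code):
    """Return a dict of the parts, or None if the shape does not match."""
    match = LOCALE_RE.fullmatch(code)
    if match is None:
        return None
    return match.groupdict()


for code in ["de", "es-AR", "roa-IT-vec", "de-DE-bavarian", "en-us"]:
    print(code, parse_locale(code))
```

Parts that are absent come back as None, and a code such as "en-us" is rejected because the region is expected in uppercase.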