Internationalization#

This section covers how to internationalize (I18n) and localize (L10n) the PyData Sphinx Theme. For details on how to localize the configurable strings in your Sphinx project, see the User guide section on internationalization.

The PyData Sphinx Theme I18n/i10n workflows are based on GNU Gettext and pybabel is used to keep the catalogs up to date. Crowd-sourced localizations are managed on Transifex.

Note

Internationalization (or I18n) is the process of making a program or application aware of and able to support multiple languages.

Localization (L10n) is the process of translating localized programs or applications into different languages while ensuring that the translations are correct for some native language and cultural habits.

The formal description of specific set of cultural habits for some country, together with all associated translations targeted to the same native language, is called the locale for this language or country. For more information about localization and internationalization see GNU gettext concepts.

The general process for internationalizing and localizing the theme is as follows:

  1. Mark strings in the theme as localizable.

  2. Extract localizable strings to a message catalog template POT (PO template file).

  3. Generare a PO file from the POT file for a new language (locale) you want to localize (or update existing localization files).

  4. Compile the message catalogs to binary MO files.

  5. Localize the theme.

Localization files#

There are three types of files used in the localization process:

  1. PO files ( Portable Object, also known as message catalogs) associate each original, translatable string (defined in msgid) of a given program with its translation in a particular target language (defined in msgstr fields). A single PO file is dedicated to a single target language.

  2. MO (Machine Object) files are a binary version of a PO file.

  3. POT (Portable Object Template) files are similar to PO files, but with empty translations. They are used as a template for new languages.

Marking strings as localizable#

All natural language text in the theme’s components and layouts must be marked as localizable so that they can be later translated (or localized) into other languages. For example, if you add a button with the text Next page, you will need to mark this text as localizable.

HTML templates (located in the src/pydata_sphinx_theme/theme/ directory). To do so, you can use the Jinja2 trans block and/or a _() function to mark text as localizable in the corresponding

For example, to mark the text Next page as localizable, you would write:

<button type="button">
   {{- _("Next page") -}}
</button>

Any text that is marked in this way will be discoverable by pybabel and used to generate the localization files (PO files).

Once you’ve marked the text as localizable, complete the steps outlined in the Updating the localization files section of this documentation.

For more details on marking strings as localizable in jinja templates visit the Jinja2 documentation.

Tip

Remember to manually escape variables if needed.

Sometimes, it can help localizers to describe where a string comes from or whether it refers to a noun or verb, particularly if it can be difficult to find in the theme, or if the string itself is not very self-descriptive (for example, very short strings). If you immediately precede the string with a comment that starts with L10n:, the comment will be added to the PO file, and visible to localizers. For example:

{# L10n: Navigation button at the bottom of the page #}
<button type="button">
   {{- _('Next page') -}}
</button>

Updating the localization files#

When you add or change natural language text in the theme, you must update the message catalogs to include the new or updated text. Follow these steps:

  1. Edit the natural language text and ensure it is marked as localizable.

  2. Extract the strings and update the localization files (POT file):

    # note this will by default update all the localization files for all the supported locales
    tox run -e i18n-extract
    

This will update the localization files with new information about the position and text of the language you have modified.

If you only change non-localizable text (like HTML markup), the extract command will only update the positions (line numbers) of the localizable strings. Updating positions is optional - the line numbers are to inform the human translator, not to perform the localization. But it is best practice to keep the positions up to date.

If you change localizable strings, the above command will extract the new or updated strings to localization template file (POT) and perform a fuzzy match between the new or updated strings and existing localizations in the localization files. If there is a fuzzy match, a comment like #, fuzzy is added before the matched entry, this means that the localization needs to be manually reviewed and possibly updated. If after reviewing the localization you decide to keep the existing localization, you can remove the #, fuzzy comment from the entry. If there is no fuzzy match, it will add a new localization entry. You can learn more about fuzzy entries in the GNU gettext manual.

Compiling the localization files#

Gettext doesn’t parse any text files, it reads a binary format for faster performance. To compile the latest PO files in the repository run:

tox run -e i18n-compile

You can also run the extract, update and compile commands in one go:

tox run -m i18n

This will update the localization files and compile them into binary MO files in a single step. However, if there are fuzzy matches needing review, the compilation will fail, and you will need to review the localizations manually. Then compile the files again.

Adding a new language#

The list of languages with existing (possibly incomplete) localizations is available in the src/pydata_sphinx_theme/locale directory.

To add a new language, follow these steps:

  1. Identify the ISO 639-1 code for the new language.

  2. Generate a PO file based on POT file for this new language:

    # for example, to add Quechua (ISO 639-1 code: qu)
    tox -e i18n-new-locale -- qu
    
  3. Update and compile the localization files as described in the Updating the localization files and Compiling the localization files sections. Then commit the changes.

  4. Localize the theme’s into the newly added language (see Localizing the theme).

Localizing the theme#

We manage localizations on the PyData Sphinx Theme project on Transifex.

To contribute localization, follow these steps:

  1. Sign up for a Transifex account.

  2. Join the PyData Sphinx Theme project.

  3. Select the language you want to localize. If the language you are looking for is not listed, you can open an issue on GitHub to request it. Then you can open a pull request to add the new language following the steps outlined in Adding a new language.

  4. Now you are ready to start localizing the theme. If you are new to Transifex you can visit the Transifex documentation for more information.

Once you have completed your localization, the PyData Sphinx Theme maintainers will review and approve it.

Localization tips#

Localize phrases, not words#

Full sentences and clauses must always be a single localizable string. Otherwise, you can get next page localizated as suivant page instead of as page suivante, etc.

Dealing with variables and markup in localizations#

A localizable string can be a combination of a fixed string and a variable, for example, Welcome to the Spanish version of the site is a combination of the fixed parts Welcome to the and version of the site and the variable part Spanish.

{% trans language=language %}
Welcome to the {{ language }} version of the site
{% endtrans %}

Binding the variable as language=language ensures the string can be properly localized, especially as the word order may vary across locales. The above string will be extracted as Welcome to the %(language) version of the site. The translator must use %(language) verbatim while localizing the theme.

When a block contains HTML with attributes, those which don’t need to be localized should be passed as arguments. This ensures strings won’t need to be re-localized if those attributes change:

{% trans url="https://pydata.org/" %}
 Please visit <a href="{{ url }}" title="PyData website">the PyData website</a> for more information.
{% endtrans %}

References#

I18N and L10N are deep topics. Here, we only cover the bare minimum needed to fulfill basic technical tasks. You might like: