The goal of Pontoon is to enable localization of as many website messages as possible. Some messages, such as JavaScript warnings, only appear after a hard-to-reach sequence of actions. We try to minimize the number of such messages by designing Pontoon to operate in two modes:

Vanilla mode. Pontoon can open any HTML document and guess which messages are suitable for localization. This mode is useful when the content manager does not have access to the server-side document, or when the document is not generated by server-side code.

Aided mode. If the content manager has access to the code running on the server, Pontoon can receive background information on messages through hooks, which enables a more complete localization of a document. This mode is useful for websites developed and maintained in-house.

Hooks are responsible for two things: identifying messages and providing their background information, such as the original message, the translation, and suggestions from other users. But how exactly do we implement them? I’ve been playing with this question a lot lately, so I’d like to give you an overview of my conclusions. I identified four different approaches:

<span>. Perhaps the most obvious choice is to wrap every message in a span tag (or em) and use HTML5 data-* attributes for storing background information. The problem is that adding span tags to the markup can also change the style of the website, for example when existing CSS rules target span elements.

<span data-original="Original">Translation</span>
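
As an illustration only, here is a minimal sketch of how hooks in this form could be read back out of a page. It is written in Python with the standard library's html.parser, which is a choice made for this example and not something Pontoon is known to use. The names SpanHookParser and extract_span_hooks are hypothetical, and the sketch assumes hook spans are not nested and contain no inner span elements.

```python
from html.parser import HTMLParser


class SpanHookParser(HTMLParser):
    """Collect (original, translation) pairs from <span data-original="..."> hooks.

    Hypothetical helper for illustration; assumes hook spans are not nested.
    """

    def __init__(self):
        super().__init__()
        self.messages = []      # finished (original, translation) pairs
        self._original = None   # data-original of the span being read, if any
        self._text = []         # text fragments seen inside that span

    def handle_starttag(self, tag, attrs):
        # Only spans that carry a data-original attribute count as hooks.
        if tag == "span" and self._original is None:
            original = dict(attrs).get("data-original")
            if original is not None:
                self._original = original
                self._text = []

    def handle_data(self, data):
        # Text inside a hook span is the translation shown on the page.
        if self._original is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._original is not None:
            self.messages.append((self._original, "".join(self._text)))
            self._original = None


def extract_span_hooks(html):
    """Return every (original, translation) pair found in the given HTML."""
    parser = SpanHookParser()
    parser.feed(html)
    parser.close()
    return parser.messages
```

Calling extract_span_hooks on the one-line example above yields a single pair holding the original message and its translation. Spans without a data-original attribute are ignored, so ordinary markup on the page is left alone.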

<nonstandardtags>. The quickest solution to the problem above is using non-standard tags. Except that it’s not! We are Mozilla, and using non-standard tags is not an option.

<l10n data-original="Original">Translation</l10n>

<!-- Comment nodes -->. What about comment nodes? They do not affect the appearance, which is good, and they do not support data-* attributes, which is bad. We could simply ignore the latter and parse the background information out of the comment text ourselves, but that sounds fugly.

<!-- l10n data-original="Original" -->Translation<!-- /l10n -->
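
To show what that hand-rolled parsing might involve, here is a rough Python sketch. It assumes the comment syntax is exactly as in the example above: an opening comment starting with l10n followed by data-* style pairs in double quotes, and a closing comment containing /l10n. CommentHookParser, extract_comment_hooks and ATTR_RE are hypothetical names, and the regular expression is a simplification that does not handle escaped quotes.

```python
import re
from html.parser import HTMLParser

# Matches data-name="value" pairs inside the comment text (simplified).
ATTR_RE = re.compile(r'data-([\w-]+)="([^"]*)"')


class CommentHookParser(HTMLParser):
    """Collect messages delimited by <!-- l10n ... --> and <!-- /l10n -->."""

    def __init__(self):
        super().__init__()
        self.messages = []     # one dict per finished message
        self._current = None   # message being read, if any

    def handle_comment(self, data):
        body = data.strip()
        if body == "l10n" or body.startswith("l10n "):
            # Opening hook: the pseudo-attributes must be parsed by hand,
            # because comment nodes have no real attributes.
            self._current = dict(ATTR_RE.findall(body))
            self._current["translation"] = ""
        elif body == "/l10n" and self._current is not None:
            # Closing hook: the message is complete.
            self.messages.append(self._current)
            self._current = None

    def handle_data(self, data):
        # Text between the two comments is the translation on the page.
        if self._current is not None:
            self._current["translation"] += data


def extract_comment_hooks(html):
    """Return a list of dicts with the parsed attributes and the translation."""
    parser = CommentHookParser()
    parser.feed(html)
    parser.close()
    return parser.messages
```

The regular expression is where the fragility lives: every new piece of background information has to be squeezed into, and recovered from, free-form comment text.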

External file. The best option I’ve found so far is the use of an external file with all the background information stored in a structured format, e.g. JSON. There’s no need for custom parsing and it doesn’t affect the appearance. I still use comment tags for message identification, but the idea is to get rid of them and use XPath to match messages with JSON entities.

{"entities":
[
  ...
  {
    "original": "Become a Test Pilot!",
    "translation": "Wird Test Pilot!",
    "comment": "Test Pilot should not be translated.",
    "suggestions": [
      {
        "translation": "Werden Test pilot"
      },
      {
        "translation": "Werden Sie Testni Pilot"
      }
    ]
  },
  ...
]}
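
As a sketch of how such a file could be consumed, the Python below loads the entities and indexes them by their original message. The lookup key is an assumption made for this example: matching by original text stands in for the comment-tag or XPath matching described above, which is not shown here. The names ENTITIES_JSON, index_entities and lookup are hypothetical, and the `...` placeholders from the listing are left out because they are not valid JSON.

```python
import json

# The example entity from the listing above, without the "..." placeholders.
ENTITIES_JSON = """
{"entities": [
  {
    "original": "Become a Test Pilot!",
    "translation": "Wird Test Pilot!",
    "comment": "Test Pilot should not be translated.",
    "suggestions": [
      {"translation": "Werden Test pilot"},
      {"translation": "Werden Sie Testni Pilot"}
    ]
  }
]}
"""


def index_entities(raw_json):
    """Parse the external file and index its entities by original message."""
    data = json.loads(raw_json)
    return {entity["original"]: entity for entity in data["entities"]}


def lookup(index, original):
    """Return the background information for a message, or None if unknown."""
    entity = index.get(original)
    if entity is None:
        return None
    return {
        "translation": entity.get("translation"),
        "comment": entity.get("comment"),
        "suggestions": [s["translation"] for s in entity.get("suggestions", [])],
    }


if __name__ == "__main__":
    index = index_entities(ENTITIES_JSON)
    print(lookup(index, "Become a Test Pilot!"))
```

Compared with the comment-node approach, all of the parsing here is a single json.loads call, and adding a new field to an entity requires no change to the markup of the page.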

Of course the list does not stop here. I’d love to hear your suggestions! What do you think about hooks and how would you like to see them implemented?
