WordPress global usage on the web is now at 23%, and this year marked the first time that non-English downloads surpassed the number of English downloads. Major internationalization improvements coming in 4.0 will open up the platform even more for those publishing in different languages.
While discussing the upcoming language-related improvements at WordCamp Seattle this year, Andrew Nacin highlighted the fact that only 5-10% of the world speaks English. It may not be long before the majority of WordPress installations are in Mandarin, Spanish, Hindi, or Arabic.
The need for better ways to support multilingual content is already a concern for many international users and agencies. One thing that WordPress core is currently missing is the ability to easily retrieve the language in which a post or page has been written. German WordPress developer Caspar Hübinger is in the early stages of creating a proposal to add a Post Language feature to core.
Why Does WordPress Need Post Language Support?
In outlining the need for post language support, Hübinger cites WordPress download stats from the end of April, demonstrating that 3.9 had been downloaded roughly 1.35 times as often in other languages as in the default US English:
Total Core downloads: 6,589,287 (100%)
Default English: 2,807,978 (42.6%)
Others: 3,781,309 (57.4%)
(Data from April 29, 2014)
Hübinger wants to add post_language as a property of WP_Post just like post_author, post_excerpt, and the other variables.
“Offering a basic opportunity to users for them to store the language of their content along with other post meta information would provide a new level of empowerment for both, users and developers,” Hübinger contends.
His proposal is based on the premise that the language of post content serves as:
- a highly relevant piece of post meta information in general
- one of the most important parameters for plugin and theme developers to tackle the already complex field of language and translation
Many plugins, in the course of providing translation features, need to determine the language a post was written in, but they all go about it in different ways. Portability is abysmal across plugins such as WPML, Polylang, Babble, Multilingual Press, and others that provide similar functionality.
“All of those plugins, however, do much more than just determining the language of a post,” Hübinger told the Tavern. “They offer UIs for translating content and establishing language relationships between single posts — a field so complex that being built without any core method for language determination, each one of those plugins can become a major headache when a user tries to switch from one plugin to another.
“As a user you’re pretty much locked in to the solution you choose, since not only are connections between original posts and translations gone when you switch plugins, but also the very marker of which language a post is written in simply vanishes or becomes ineffective,” Hübinger explained. If WordPress had a standard way to determine the language in which a post was written, all of these plugins could potentially provide more portable functionality.
The Proposed Post Language Feature
So what would Post Language look like as a feature implemented in WordPress? In addition to providing developers with more tools to add custom language and translation features, post language would also allow users to select a language in the Publish Post meta box.
Hübinger proposes that the select box be populated with the languages previously defined through either the language packs available within the given WordPress install, or a filter. The language selection would return the ISO code for that language and store it in a database field as post meta or an extra field that would have to be added to the database table.
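To make the storage idea concrete, here is a minimal sketch of the post meta route. It is not taken from Hübinger's proposal: the `_post_language` meta key, the `post_language` form field, the `sketch_post_languages` filter, and every `sketch_*` function name are assumptions made for illustration. The language list is hard-coded where the proposal would draw on installed language packs.

```php
<?php
// Hypothetical names throughout: the '_post_language' meta key, the 'post_language'
// form field and the sketch_* functions are assumptions, not WordPress core APIs.

/**
 * Return $code if it is one of the allowed language codes, otherwise ''.
 */
function sketch_sanitize_post_language( $code, array $allowed ) {
    $code = trim( (string) $code );
    return in_array( $code, $allowed, true ) ? $code : '';
}

/**
 * Languages offered in the select box. A fixed list stands in for the
 * installed language packs; a filter lets plugins change it.
 */
function sketch_available_post_languages() {
    $languages = array( 'en-US', 'de-DE', 'es-ES', 'ar' );
    if ( function_exists( 'apply_filters' ) ) {
        $languages = apply_filters( 'sketch_post_languages', $languages );
    }
    return $languages;
}

/**
 * Store the chosen language as post meta when a post is saved.
 */
function sketch_save_post_language( $post_id ) {
    if ( ! isset( $_POST['post_language'] ) ) {
        return;
    }
    $code = sketch_sanitize_post_language(
        wp_unslash( $_POST['post_language'] ),
        sketch_available_post_languages()
    );
    if ( '' !== $code ) {
        update_post_meta( $post_id, '_post_language', $code );
    }
}

// Only wire up the hook when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
    add_action( 'save_post', 'sketch_save_post_language' );
}
```

A real save handler would also verify a nonce and the user's capabilities before writing anything; those checks are left out to keep the sketch short.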
The value for Post Language could then be used in the following ways:
- It should be made accessible through template tags.
- It should possibly affect get_bloginfo( 'language' ) and get_bloginfo( 'text-direction' ).
- OR it should be implemented via a new attribute on a per-post basis, similar to:
<article <?php post_class(); ?> <?php post_language(); ?>> // output: <article class="foo bar" lang="en-US">
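A template tag along those lines could be sketched as follows. Only the name post_language() comes from the example above; the function body, the `_post_language` meta key, and the `sketch_lang_attribute()` helper are assumptions, since the proposal does not yet specify an implementation.

```php
<?php
// Assumption: the language code is stored in a hypothetical '_post_language' meta key.

/**
 * Build a lang attribute string from a language code, or '' if the code is empty.
 */
function sketch_lang_attribute( $code ) {
    $code = trim( (string) $code );
    if ( '' === $code ) {
        return '';
    }
    // Escape the value for safe use inside an HTML attribute.
    return 'lang="' . htmlspecialchars( $code, ENT_QUOTES, 'UTF-8' ) . '"';
}

// Define the template tag only inside WordPress, and only if nothing else has.
if ( function_exists( 'get_post_meta' ) && ! function_exists( 'post_language' ) ) {
    /**
     * Echo the lang attribute for the current post in the loop.
     */
    function post_language() {
        $code = get_post_meta( get_the_ID(), '_post_language', true );
        echo sketch_lang_attribute( $code );
    }
}
```

Posts without a stored language print nothing, so the markup falls back to whatever language the surrounding document declares.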
Since not all WordPress sites would need this feature, he suggests that it be disabled by default and enabled via a constant, a filter or perhaps an admin setting under Settings > General.
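One way such an opt-in switch might look, purely as an illustration: the `SKETCH_POST_LANGUAGE` constant and the `sketch_enable_post_language` filter are invented names, and neither exists in WordPress.

```php
<?php
// Hypothetical names: SKETCH_POST_LANGUAGE and 'sketch_enable_post_language'
// are illustrations only.

/**
 * Decide whether the feature is on. It is off by default, can be switched on
 * with a constant, and a filter gets the final say when WordPress is loaded.
 */
function sketch_post_language_enabled() {
    $enabled = defined( 'SKETCH_POST_LANGUAGE' ) && SKETCH_POST_LANGUAGE;
    if ( function_exists( 'apply_filters' ) ) {
        $enabled = (bool) apply_filters( 'sketch_enable_post_language', $enabled );
    }
    return $enabled;
}
```

With this shape, a site owner could add define( 'SKETCH_POST_LANGUAGE', true ); to wp-config.php, while a plugin could return true from the filter instead.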
Hübinger mentioned his idea in a comment on Andrew Nacin’s roadmap for 4.0 internationalization improvements, but he decided to wait until 4.0 is in place before officially proposing the feature. Adding a new property to WP_Post is a major consideration and is likely to spark healthy debate.
Post Language Support Falls In Line with WordPress’ Mission to Democratize Publishing
Unlike various other CMSs, such as Drupal and TYPO3, WordPress does not provide a core feature to publish translations of original content. “You can’t even just publish single posts in more than one language per site without messing up your markup with false language attributes,” Hübinger notes. “Not a problem? Try to get a machine reading a post to you in any other language than English when its markup says it is written in English. You’ll most certainly hear the problem.”
Hübinger believes that raising awareness is key for the Post Language feature to gain momentum. “Language on a per post basis is generally associated with translation in people’s minds, and rightfully so,” he said. “Translation, though, has always been an edge case scenario for our mainly anglophone WordPress core dev team, and rightfully so as well.” Convincing the WordPress community of the case for adding Post Language to core is the first step to making it a viable possibility.
The lack of a post language field juxtaposed with the existence of post formats in core is a continual source of bewilderment for Hübinger, who comes from a multilingual culture.
“I like to say if we have a visual carnival like post formats in core, it is high time to spend some thought on a language API which potentially will affect and benefit a couple of millions more users than fancy post formats,” he said. “Nothing against post formats; I like them. They just make such good contrast when comparing the importance of core features.”
His proposal makes a compelling case for the international community and appeals to the heart of WordPress’ core mission to democratize publishing.
After all, WordPress is all about publishing content, and content inevitably has to do with language. We can’t honestly claim to ‘democratize publishing’ while we continue to ignore the relevance of linguistic aspects regarding content for WordPress users around the world.
Hübinger believes that a Post Language feature can help the project enter a higher level of maturity with one small API feature addition. “While the whole field of translating and multilingual content rightfully has been and will be outsourced into plugin territory, WordPress core needs to provide at least a basic language-per-post API for plugin authors to work with, thus preventing users from locking themselves in with one solution forever,” he said.
Hübinger readily admits that the feature is beyond his coding capabilities and hopes that other developers will join the effort to establish a path for architecture and implementation.
“I am totally open to any self-respecting developers who would like to contribute, fork the repo, set up their own one for the same idea,” he said. “This is about making WordPress better for millions of non-anglophone users, so let’s just get that language API in there in the most decent manner possible!”
Once WordPress 4.0 is released with improved multilingual support, Hübinger hopes to drum up more support and contributors to work on the project before officially proposing it to core. If you’d like to help develop the Post Language proposal further, you can find the project on GitHub.