One year ago, developers at Google and Yoast began collaborating with other contributors on a proposal to add XML sitemaps to WordPress core. The XML Sitemaps feature plugin went into testing in late January, and the feature is now on deck for inclusion in WordPress 5.5.
This week contributors merged a basic version of sitemaps that plugin developers can either build on or disable.
“This core sitemaps feature aims to provide the base required functionality for the Sitemaps protocol for core WordPress objects, then enables developers to extend this functionality with a robust and consistent set of filters,” Google engineer Pascal Birchler said in the merge announcement.
Millions of WordPress sites have already implemented sitemaps using an SEO plugin or a dedicated sitemaps plugin. Plugin authors are encouraged to re-architect their solutions to work with the core sitemaps protocol, but users do not have to worry about conflicts. Birchler said he expects many users will no longer need additional plugins to meet their sitemap needs.
“If for some reason two sitemaps are exposed on a website (one by core, one by a plugin), this does not result in any negative consequences for the site’s discoverability,” Birchler said.
Although native XML sitemaps have received a mostly favorable response from the community and WordPress’ leadership, there are some who believe this functionality would be better left to plugins. Fortunately, there’s an easy way for anyone who is concerned to turn it off. Users who don’t want sitemaps activated can change WordPress’ settings to discourage search engines from indexing the site. Developers can disable it using a filter.
The basic sitemaps implementation does not include any UI controls for further customization, such as excluding certain posts or pages. Birchler explained that this was not part of the scope of the project. The plugin ecosystem will still have plenty of latitude in addressing more complex sitemap requirements:
User-facing changes were declared a non-goal when the project was initially proposed, since simply omitting a given post from a sitemap is not a guarantee that it won’t get crawled or indexed by search engines. In the spirit of “Decisions, not options”, any logic to exclude posts from sitemaps is better handled by dedicated plugins (i.e. SEO plugins). Plugins that implement a UI for relevant areas can use the new filters to enforce their settings, for example to only query content that has not been flagged with a “noindex” option.
Performance was one of the chief technical concerns when the project was initially proposed, particularly in response to the number of URLs per page and the lastmod date in the index.xml file. Contributors landed on capping the URLs per sitemap at 2,000. The solution for the lastmod date that they implemented adds a cron task that runs twice daily, fetches the lastmod dates of each sitemap, and stores them in the options table. [Update: The lastmod date was removed during development because community feedback indicated this additional property provided no clear benefit.]
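To make the 2,000-URL cap concrete, here is a rough sketch of how a list of URLs could be split into paged sitemaps that are then listed in a sitemap index, as the Sitemaps protocol allows. This is not WordPress core's implementation (which is written in PHP); the function names, the example.com URLs, and the sitemap-posts-N.xml file names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Per-sitemap cap mentioned in the article.
MAX_URLS = 2000


def paginate(urls, max_urls=MAX_URLS):
    """Split a list of URLs into chunks of at most max_urls entries."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def build_urlset(urls):
    """Render one sitemap page: a <urlset> with one <url><loc> per entry."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(base_url, page_count):
    """Render a sitemap index pointing at each page.

    The file naming scheme here is hypothetical, not the one core uses.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for page in range(1, page_count + 1):
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-posts-{page}.xml"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A made-up site with 4,500 posts ends up with three sitemap pages.
    urls = [f"https://example.com/?p={i}" for i in range(4500)]
    pages = paginate(urls)
    print([len(page) for page in pages])
    print(build_index("https://example.com", len(pages)))
```

Under these assumptions, a site with 4,500 posts would get two full pages of 2,000 URLs and a third with the remaining 500. No lastmod element is emitted, in line with the update noted above.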
“The addition of this feature [core sitemaps] does not impact regular website visitors, but only users who access the sitemap directly,” Birchler said. “Benchmarks during development of this feature showed that sitemap generation is generally very fast even for sites with thousands of posts. Thus, no additional caching for sitemaps was put in place.”
More information about extending core sitemaps is available in the merge announcement, along with FAQs. This feature is expected to be released with WordPress 5.5 in August.
I’ve already replaced the code in one of my plugins with this new feature and created hooks to allow enabling/disabling post types and users sitemaps.