New Plugin Uses Microsoft’s Computer Vision API to Automatically Fill in Alt Text for Images

When uploading images to WordPress’ media library, there are a couple of different fields available to provide additional information. You can edit an image’s title, add a caption, provide alternative text, and add a description.

The Alt Text field displays text when an image does not display. According to Morten Rand-Hendriksen, these fields are underutilized and largely ignored. There are two primary reasons to add alt text to images.

  1. Search engines use it when they index images.
  2. It improves accessibility, especially for people with sight impairments.

Rand-Hendriksen suggests that every image should have alternative text unless it’s used only for decoration. However, adding alt text is yet another task in the publishing process.

Connecting Automatic Alternative Text to Microsoft’s Computer Vision API

Thankfully, there’s a new plugin available that fills in the alt text field automatically. It’s called Automatic Alternative Text and is developed by Jacob Peattie.

Peattie was inspired to create the plugin after responding to a thread in the WordPress support forums where a user asked if something was available that can automatically rename image files.

Once the plugin is installed and activated, you’ll need to visit Microsoft’s Cognitive Services site and register a new account to obtain API access keys. The free API keys allow up to 5,000 transactions per month and 20 transactions per minute.

Microsoft Cognitive Search Free Subscription Plan
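
To give a feel for what the 20 transactions per minute limit means in practice, here is a minimal sketch of a sliding-window throttle that a client could call before each request. It is not taken from the plugin; the class name MinuteThrottle and its parameters are hypothetical, and only the limit of 20 per minute comes from the free plan described above.

```python
import time
from collections import deque


class MinuteThrottle:
    """Hypothetical helper: allow at most `limit` calls per `window` seconds."""

    def __init__(self, limit=20, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit      # 20 per minute on the free plan
        self.window = window    # window length in seconds
        self.clock = clock      # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.stamps = deque()   # times of the calls still inside the window

    def _drop_expired(self, now):
        # Forget calls that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()

    def wait(self):
        """Block until one more call fits in the window, then record it."""
        now = self.clock()
        self._drop_expired(now)
        if len(self.stamps) >= self.limit:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.window - (now - self.stamps[0]))
            now = self.clock()
            self._drop_expired(now)
        self.stamps.append(now)
```

Calling wait() before every upload keeps a batch of images under the per-minute cap. The 5,000 per month cap would still need separate bookkeeping.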

Once you have an API key, enter it at the bottom of the Settings > Media page. This is also where you can configure how confident the API must be in its description before the plugin adds the alt text automatically.
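
The confidence setting can be pictured as a simple filter over the API’s response. The sketch below assumes the response shape the Computer Vision describe operation returns (a description object containing captions, each with text and a confidence between 0 and 1). The function name pick_alt_text, the default threshold, and the sample values are made up for illustration, and this is not the plugin’s actual code.

```python
def pick_alt_text(response, min_confidence=0.5):
    """Return the best caption if it clears the threshold, otherwise None.

    `response` is assumed to look like:
    {"description": {"captions": [{"text": "...", "confidence": 0.0-1.0}]}}
    """
    captions = response.get("description", {}).get("captions", [])
    if not captions:
        return None
    # Take the caption the service is most confident about.
    best = max(captions, key=lambda c: c.get("confidence", 0.0))
    if best.get("confidence", 0.0) < min_confidence:
        return None  # leave the alt text field empty for a human to fill in
    return best.get("text") or None


# Illustrative sample only; the confidence value is invented.
sample = {"description": {"captions": [{"text": "a cloudy sky", "confidence": 0.9}]}}
print(pick_alt_text(sample, min_confidence=0.5))
```

A higher threshold means fewer images get automatic alt text, but the text that is added is more likely to be right.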

Automatic Alternative Text uses Microsoft’s Computer Vision API, which is part of the company’s collection of Cognitive Services.

Through machine learning and algorithms, Microsoft’s API is able to identify emotions, faces, ages, genders, and more from images. You can see how it works by uploading an image to the Computer Vision demo site.

Once connected to the Computer Vision API, each image that’s uploaded to the media library is analyzed and results are displayed within seconds. I tested the plugin using a few different images and while the results are mostly accurate, there is room for improvement.
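
For readers curious what such an analysis request might look like, here is a rough sketch that builds (but does not send) a call to the describe operation. The article does not show the plugin’s own requests, so everything here is an assumption: the v3.2 path is the one used by later versions of the service, the endpoint host is a placeholder, and build_describe_request is a hypothetical helper. The Ocp-Apim-Subscription-Key header is how the service expects the API key to be passed.

```python
import json
from urllib.request import Request


def build_describe_request(endpoint, api_key, image_url):
    """Build a POST request asking the service to describe one image by URL.

    `endpoint` is the base URL of your own Computer Vision resource
    (placeholder in the example below). Nothing is sent over the network here.
    """
    url = endpoint.rstrip("/") + "/vision/v3.2/describe?maxCandidates=1"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": api_key,  # the key from your subscription
    }
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder values; pass the result to urllib.request.urlopen to send it.
req = build_describe_request(
    "https://example.invalid", "YOUR-API-KEY", "https://example.invalid/photo.jpg"
)
print(req.get_method(), req.full_url)
```

Each request built this way counts as one transaction against the plan’s limits.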

Alcohol Featured Image
photo credit: Cannery Row Brewing Company (license)

For the image above, the generated alt text is “Person sitting at night.” It’s tough to determine the time of day, but most bars are open at night. It also looks like the image was taken by someone sitting at the bar. Therefore, I think the alt text is somewhat accurate.

Tweetstorm Featured Image
Tormenta explosiva – Explosive storm

For the image above, the API generated the alt text “A cloudy sky.” I think this is one of the easier photos to get right, and although it’s specifically a storm cloud, “cloudy sky” is an accurate description.

typewriter

For the image above, the API generated the alt text “A black remote control sitting on a table.” This is a great example of why cognitive services cannot be completely relied upon. It’s a red typewriter, not a remote control, and it’s sitting on a black couch, not a table.

It’s Not Perfect but It Provides a Great Head Start

While my test is limited to a few images, I’m impressed by how accurate the service is in identifying key components and using them as the alt text. It’s not perfect, and I don’t think it ever will be, but it gives users an excellent head start with filling in the alt text field.

One drawback to Automatic Alternative Text is that you can’t submit images that already exist in the media library to the API. Providing an easy way to do this might be enough to motivate users to check their media library and add missing alt text to images.

Outside of the confidence threshold, there are no settings to configure and the plugin works as expected on WordPress 4.7 Alpha. It’s available for free from the WordPress plugin directory and if you decide to test this plugin on your sites, please come back and let us know how accurate the API is for your images.

12 responses to “New Plugin Uses Microsoft’s Computer Vision API to Automatically Fill in Alt Text for Images”

  1. Oh hey, my plugin’s on WP Tavern! Thanks Jeff!

    I think you’ve definitely hit the nail on the head regarding not completely relying on cognitive services, but I hope at the very least this plugin helps people think more about using alt text in a way that’s helpful to low vision users, and I’m glad that you were able to get some accurate descriptions.

    Regarding adding text to existing images in the media library, that’s definitely something I want to tackle in future versions. At first this will probably take the form of a button on individual images to receive alt text. Regarding bulk updating previous images, however, there are many challenges:

    1. What does the interface look like? Does the processing happen in the background?
    2. How do I process images without hitting request limits?
    3. Updating alt text in the Media Library won’t update images already added to posts/pages. How do I make this clear? Do I attempt to update content?

    As you can imagine, these are tricky problems, and a bit too much for a 1.0, but I’m game to tackle them in a future version.

    The very next version, however, will be dedicated solely to fixing that “Confidence” typo on the settings screen :|.

    • Fortunately, there’s an even simpler way to add alternative text to your images. Manually add the appropriate text to the alternative text field when adding the image to your post. Humans will always be the best at determining the contents of an image, as well as determining how to describe that image’s contents. Despite all of this, I’m glad this plugin exists. Appropriate alternative text for images requiring it is probably the lowest hanging fruit when it comes to accessibility, and alternative text is probably the most neglected part of making a website accessible. I should go propose this as a feature project.

  2. In the short term, while the results are less accurate than what a human might provide, there is a risk of people relying on this type of technology and discounting the need for human intervention.

    While an AI may never be as good as someone highly skilled at writing alt text, I believe that it will one day be as good as or better than the average human.
