Customers who have been following the Pressable Status blog received no reassurance this week regarding the current outage. The Pressable team is currently working around the clock to resolve the issues causing customer websites to go down. The status blog states: “We do not currently have and will not likely provide an ETA in this situation. The best thing to do is to keep checking the current status at the bottom of this post.”
The current outage comes on the heels of last week’s outage, for which CEO Vid Luther apologized on the company’s blog. Some customers reported 24+ hours of downtime. Pressable has been flooded with help desk requests, angry tweets, and emails. For the past two weeks the company has been hemorrhaging customers faster than it can repair the servers.
What’s Happening at Pressable?
Recent communications on the Pressable blog have left customers confused about the root of the incidents. The status post cites a litany of compounding problems, including issues with caching servers, internal bandwidth limitations on database servers, limits on the rate at which servers can be added, and an isolated cluster that was causing trouble for the others.
I spoke with Vid Luther to get a better understanding of what is happening behind the scenes at Pressable. From the outside, it appears that the company lacks the infrastructure to accommodate its current customer load, but Luther said it’s much more than that:
“The answer to this is complicated, it depends on your understanding of technology, business, and the WordPress eco-system. We are not lacking in terms of hardware or network capacity; we are short on the number of employees we have in comparison to the number of customers we have. Our entire team consists of 5 people, most people are usually amazed to learn about what we’ve accomplished as such a small team. But, when you have such a disparity in terms of employee to customer ratio, communication in a time of crisis like this suffers.”
Over the past several weeks, the company has had all hands on deck to fix the problems, but customers have commented on the lack of transparency and Luther’s silence during the incidents.
“I would like to apologize for not having a better communication strategy. Hopefully, others can learn from this, and plan for it accordingly,” Luther told the Tavern.
“But, having a great communication plan doesn’t work for very long, eventually, you have to fix the problem for good,” he said. “That is what we’ve been working on. Over the past 12 months, we’ve had issues, and we’re tired of apologizing. I thought it would be best for us to deliver the solution instead of saying sorry once again.”
A Long-Standing Problem with Infrastructure
Customers have pointed out that while their websites have gone down, the Pressable site remains intact. “This is because our website along with several thousand others, are already in our new infrastructure,” Luther explained.
“The new infrastructure has much better underpinnings, not just from a raw horse power perspective, but it’s been designed with situations like this in mind. I would say it’s probably one of the more advanced configurations out in the WordPress hosting market.”
In Luther’s post to the company blog regarding the previous outage, he mentions that the company anticipated this kind of problem last summer.
“Fortunately, this is something that wasn’t completely unanticipated, we had identified this as a potential issue last summer, and had been working on upgrading our systems over the next two months.”
What happened to halt the migration to the new infrastructure? Luther attributes it to an error in judgment.
“The current situation is one of several scenarios we identified last summer, and then we ranked them in order of impact to customers, and probability of it actually happening. But, as you know Murphy’s law applies to all situations and people, and it applies here. We anticipated an event like this, and we designed a solution to address it, we were so busy building the new solution, we didn’t think about putting some safe guards on the old infrastructure. This was an error in judgement. I am to blame for it.
“The root of the issue here is that our old infrastructure had a very large impact radius, and we didn’t migrate people fast enough after we had identified it.”
Luther recognizes that the recent outages have had an impact on the business, as many customers are looking for alternative hosting solutions. He said that the team has ideas to help mitigate the losses once the situation is stable, but they aren’t ready to share those at this time.
“First we want to make the current system stable again, then we’ll work with the affected customers and do what’s right by them,” he said.
The five-person Pressable team is currently stretched thin and working overtime. Luther encourages customers to remember that there are human beings working tirelessly behind the servers and technology.
“We’re exhausted, we’ve got pregnant wives, parents who’ve suffered multiple strokes, and some of us are still reeling from a divorce, we’re human, we’re juggling too many things at once, and we know we shouldn’t be, but we don’t know how to just stop. The tweets, the comments, and general treatment by customers and competitors has been a brutal reminder of what it is to be a human. Could we have done things differently? Absolutely.”
The hosting business and the technology and infrastructure behind it are complex. Last year, WP Engine, a much larger company that received $15 million in funding in 2014, had to address critics following a damaging exposé of its customer support. Eventually, every successful host will encounter the challenge of keeping pace with its own growth. Engineering customer happiness following unreliable service is an equally challenging endeavor.
Pressable is cooking up strategies for regaining consumer confidence following the recent incidents, but the first order of business is to resolve the issues surrounding the current outage. This morning the company opened a room on its HipChat account to add another line of communication. For now, customers have no choice but to ride out the storm and watch the Pressable Status blog for updates.