Over the weekend, Bluehost experienced a severe, widespread network issue that caused customer sites to go down. The incident began Friday evening and continued into the night. As WP Tavern is hosted on Bluehost, we were watching the situation with keen interest, finally clocking the downtime at 12 hours before our site was back up.
We are seeing problems w/ network flapping which could be caused by denial of service or other network issues in our switching fabric. (1/5)
— Bluehost (@bluehost) December 10, 2016
The Bluehost Twitter and Facebook accounts kept customers updated as network engineers worked to resolve the issue. Shortly after midnight, Bluehost said it had identified a network loop within a portion of the network. Staff worked to restore services “while making sure we do not reintroduce the loop into the network.”
Approximately 10 hours into the downtime, Bluehost updated customers who were still down, citing “a packet filtering problem” in its core routing layer for which the team had created a fix. Within a couple more hours, most of the company’s customers were back online.
We have identified a packet filtering problem in our core routing layer. We have worked closely with our vendor to develop a global fix- 1/2
— Bluehost (@bluehost) December 10, 2016
Bluehost’s earliest communications about the downtime indicated a DDoS attack may have caused the incident, though the company no longer considers that a likely cause.
“It doesn’t appear to be a DDoS but we are conducting a full investigation,” Bluehost head of product Brady Nord told the Tavern after the incident. His team worked around the clock to identify and resolve issues until customer sites came back up.
“Many of our dedicated and VPS customers were affected to some degree for approximately 12 hours,” Nord said. “We made every attempt to keep our customers informed during the event as information became available because we understand our customers depend on our products and services.”
Nord would not share further details about the cause of the outage but said the company plans to complete a detailed post mortem to prevent future outages.
“With any significant event that affects our customer base, we conduct an extensive examination after the event to ensure we understand the root cause and develop a course of action to improve our systems and procedures,” Nord said.
Bluehost is one of the hosts listed on WordPress’ recommended hosting page, and Nord said roughly two-thirds of the company’s customer base uses WordPress.
“The incident last night mainly impacted our dedicated and VPS customers which is a lower density section of the platform,” Nord said.
Bluehost has not yet published the results of its investigation, but support staff have replied to customer inquiries with a fairly definitive assessment, attributing the outage to a spanning tree issue on the company’s core routing layer.
Sure we can tell you what happened. We discovered a spanning tree issue on our core routing layer which caused network degradation
— Bluehost Support (@bluehostsupport) December 10, 2016
Spanning tree protocol misconfigurations can cause network problems similar to what Bluehost experienced, but the results of the investigation should confirm whether this was the root of the problem that took customer sites down over the weekend.
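For readers unfamiliar with the term, spanning tree protocol exists to keep redundant links between switches from forming a loop, since looping Ethernet frames can circulate indefinitely and saturate a network. The Python sketch below is a simplified illustration of that idea only. It is not Bluehost's configuration and not the real protocol, which elects a root bridge and compares path costs. The switch names and the `spanning_tree` helper are hypothetical, chosen for the example.

```python
from collections import deque

def spanning_tree(links, root):
    """Split an undirected switch topology into loop-free tree links
    and the redundant links that would have to be blocked.

    Simplified model: a breadth-first search from the root switch.
    Real STP elects the root and weighs path costs; this does neither.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    visited = {root}
    tree = set()
    queue = deque([root])
    while queue:
        switch = queue.popleft()
        for peer in sorted(neighbors.get(switch, [])):
            if peer not in visited:
                visited.add(peer)
                tree.add(frozenset((switch, peer)))
                queue.append(peer)

    # Any link left out of the tree would close a loop if it forwarded traffic.
    blocked = {frozenset(link) for link in links} - tree
    return tree, blocked

# Hypothetical three-switch triangle: every switch connects to the other two.
links = [("core1", "core2"), ("core1", "dist1"), ("core2", "dist1")]
tree, blocked = spanning_tree(links, root="core1")
print("forwarding:", sorted(sorted(link) for link in tree))
print("blocked:", sorted(sorted(link) for link in blocked))
```

In this toy topology, one of the three links has to stay blocked. If a misconfiguration left all three forwarding, the triangle would become a loop of the kind Bluehost described.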
Sorry for your hosting troubles.
Question: are you reevaluating whether to keep hosting with Bluehost, or rethinking your hosting strategy there?
Thx and good luck!