Should we be worried?

Hi all – @ralphlasry’s speculation here is on point. As I announced, we’re now taking a much tighter policy of making sure our status page is updated whenever we suspect something is going on that affects multiple apps, as well as always erring on the side of paging engineers. Nearly all of our recent incidents have left the vast majority of our user base unaffected.

I realize this leads to the opposite problem: our status page looks really noisy. But we’d rather post an update that turns out to be unnecessary than have people see a problem with their apps and find no status page update. We’re working to clean up some of the noise by isolating some of the incidents. For example, our response to a Dedicated box going down currently shows up on the status page; we’re updating our tooling to hide that, since it’s not relevant to anyone but the specific affected customer. We’re also getting more consistent at retroactively indicating the severity once we’ve gotten to the bottom of an issue. This should improve over time, but we decided it was better to over-correct initially and get very fast and transparent about responding to problems.

We do plan to continue posting public postmortems for large-scale incidents that have a meaningful impact on our overall uptime across the majority of our user base.

Starting earlier this week, we froze production deployments for Thanksgiving / Black Friday, and won’t resume them until Monday. The one exception is time-critical fixes in response to production issues, which follow a very strict internal approval process; we have not had to make an exception so far.
