August 6, 2024 outage postmortem

Can we get an official note from Amazon that it was indeed their folly and that they didn’t inform well in advance, didn’t give reminders, didn’t honour the setting etc?

Otherwise, it is very hard to believe that it was AWS who goofed up here.

5 Likes

Bubble must be a major AWS customer with tens of millions in revenue.
It’s strange to learn that communication between the two companies is not that good.

1 Like

At the very least, Bubble should waive all WU costs for the days since yesterday.
We need to rebuild and re-run many backend workflows to fix issues caused by this.

2 Likes

Bubble is a small fish in the AWS ecosystem.

3 Likes

If you can’t trust Bubble when they say something, nothing will help. Without trust you have nothing.

I trust Bubble. If I didn’t, I would not be using the platform to build apps that I must trust Bubble with.

6 Likes

It must still be tens of millions a year for AWS, the kind of account you give to a dedicated customer manager who handles the relationship and ensures communication.

1 Like

Let’s not question the questioning by bringing the angle of trust into a business transaction.

People have lost the trust of their clients and customers, and we can’t even ask for an official statement from the third party accepting that it was their slip?

Tell me, will it have more credibility if I show my customers a statement from Amazon or from Bubble?

Trust has its place. When they say they tried their best to fix it in the shortest time possible, we trust them. When they say they had no better solution than to wipe out three hours of people’s work, we trust them. But if they are saying that Amazon slipped on so many fronts (not honouring the auto-upgrade setting; not informing them about the change in advance; not giving reminders about the change), then in my view we paying customers, who have trusted Bubble with our businesses and livelihoods, are within our rights to ask for a final statement confirming that it was indeed Amazon who slipped here.

The issue is pretty big in my view, and Bubble is also a large enough customer for AWS to realise the gravity of it.

Otherwise the first post here from Payam could just read “Outage that happened was not because of us, but AWS. Not our fault”. There was no need for him to add the details. Say he said that, then would people not be asking “Please tell us more details about it as it is not clear what went wrong, why it went wrong”?

There are so many things that Bubble says and we question them because we all are humans and we make mistakes in understanding, reading, assessing things.

Maybe AWS did send notifications, and the person who was supposed to receive them didn’t get to see them for some reason? They are still in dialogue with AWS on this, and I am sure there will be more clarity and more issues identified as the dialogue proceeds.

Also consider this. A few days back there was an outage in Windows-based systems worldwide. Did people not trust Microsoft when they said that this was an issue with CrowdStrike? Did that mean CrowdStrike did not have to issue a statement that it was their fault?

We definitely deserve to know the final statement.

2 Likes

Yes, of course. I guess even a customer that brings in 200k a year of business would get a relationship manager, and Bubble would surely be a much bigger account than that.

I don’t want to pile on here, but I am also in the boat where we lost a ton of work, so I hope that processes are being reviewed to prevent data loss in the future. I am surprised that no email has gone out from Bubble. Not everyone looks at the forum all the time. Not everyone would have seen the data loss, and not everyone is necessarily aware of its severity.

All, a couple of things:

  • The process is 100% being reviewed. Full technical postmortem reports, etc
  • We are connecting with impacted users directly because we agree not everyone is checking the community Forum for the latest updates

Also, I would like to emphasize that attacking other users for their comments and/or points of view about this issue is not okay.

7 Likes

Thank you for the detail and postmortem!

Thanks for this update. Seems like it was not your fault… kinda akin to Bubble updating everyone’s apps to the latest version without any warning. Must be frustrating.

I’m just happy that your team is on it so rapidly and able to resolve these things, because I know that if I were responsible for it myself, it would be 10000x more stressful.

Warmest regards

4 Likes

Hi all,

Just wanted to follow up here and close the loop. We had a very productive and fruitful post-mortem with the AWS team (including their database team) on Friday. Last week’s issue was caused by a release that was scoped as minor but actually had a breaking change in it. AWS’ minor releases are supposed to be completely transparent and not have any breaking behavior, and the vast majority of the time, that is true — minor releases are done all the time transparently and behind the scenes. Meanwhile, their releases classified as major are known to have breaking changes, and the two follow very different change management procedures so that breaking changes don’t create impact.

AWS fully acknowledged that there was a breaking change in the release, and it should not have been classified as a minor release. AWS is a valued strategic partner of ours, and they make mistakes just like we do. Moving forward, both teams have some short-term and long-term plans to make sure this doesn’t happen again. In the short term, all minor upgrades will be flagged in our regular biweekly session. In the long term, we are going to work more closely with AWS, particularly their database team, on testing, timelines, and rollouts.

As I said in my original post, delivering a stable product to our customers is ultimately Bubble’s responsibility. Our top priority is to learn from this incident and make sure we don’t repeat this mistake.

Best,
Payam

21 Likes