Platform Stability Update – March 31, 2025

Hi all,

I want to share an update and provide transparency around the recent outages and instability we’ve seen, including incidents earlier today.

The second half of March has been rough and we apologize for the impact that has caused. We’ve had multiple disruptions, and they stem from two main issues:

1. Dependency on a third-party CDN (unpkg)

Many Bubble plugins depend on code hosted on unpkg, a third-party CDN that experienced a major outage on March 19 and again this morning. Because these dependencies can block page load, an unpkg outage can cause downtime for apps with those plugins installed.

While this issue is outside our direct control, we’re looking into ways to mitigate it. That includes giving plugin authors better tools to avoid blocking dependencies on third-party CDNs and encouraging widely-used plugins to migrate to using those tools.
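The "avoid blocking on a third-party CDN" idea can be sketched outside the browser as a bounded wait followed by a fallback. This is only an illustration of the pattern, not Bubble's plugin tooling: the function names, the example URLs, and the idea of a self-hosted mirror are all assumptions made for the sketch.

```python
import urllib.request
from typing import Callable, Sequence, Tuple


def fetch_url(url: str, timeout: float) -> bytes:
    """Download a resource, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def load_with_fallback(
    urls: Sequence[str],
    timeout: float = 2.0,
    fetch: Callable[[str, float], bytes] = fetch_url,
) -> Tuple[str, bytes]:
    """Try each source in order and return (url, body) from the first that answers.

    A slow or unreachable host costs at most `timeout` seconds before the
    next source is tried, so one outage cannot stall the caller indefinitely.
    """
    errors = []
    for url in urls:
        try:
            return url, fetch(url, timeout)
        except OSError as exc:  # covers URLError, HTTPError, and timeouts
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Hypothetical usage: a third-party CDN first, then a self-hosted copy.
# load_with_fallback([
#     "https://unpkg.com/example-lib",
#     "https://assets.example.com/example-lib.js",
# ])
```

The fetch function is injectable so the fallback order can be exercised without network access.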

2. Database shard crashes

Two of our eight main database shards have experienced repeated performance degradation and crashes over the last two weeks, leading to downtime and poor app performance for affected users. These issues are caused by long-running queries that exhaust database resources and trigger cascading failures.

We’re addressing this on several fronts:

  • Long-term: We’re implementing automatic cancellation of long-running queries. This depends on completing a major internal project: migrating off a legacy stored procedure language. We’ve been working on this since early last year and are nearly done — 47 of 54 procedures have been migrated, including the most complex. We’re targeting end of April for full completion.

  • Short-term: Given the urgency over the last two weeks, in parallel, we’re fast-tracking partial query cancellation based on the procedures we’ve already migrated. It’s trickier to implement this prior to completing the full project, but we’re optimistic we’ll have a working solution in the next few days.

  • Ongoing mitigation: We’re addressing problematic queries individually as they arise. This hands-on approach has helped us maintain some stability, and we believe a change we made this morning will improve things further.

  • Manual intervention: Our team is monitoring the situation around the clock and manually restarting affected databases to restore functionality when needed.
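To make "automatic cancellation of long-running queries" concrete, here is a minimal sketch that uses SQLite as a stand-in. It does not reflect Bubble's database or implementation; `run_with_deadline` and `QueryTimeout` are hypothetical names, and the time budget is arbitrary. The point is only that a query exceeding its budget is aborted rather than left to exhaust resources.

```python
import sqlite3
import time


class QueryTimeout(Exception):
    """Raised when a query is cancelled for exceeding its time budget."""


def run_with_deadline(conn, sql, params=(), max_seconds=1.0):
    """Run `sql` and abort it if it is still executing after `max_seconds`."""
    deadline = time.monotonic() + max_seconds

    def over_budget():
        # A non-zero return value tells SQLite to abort the running statement.
        return 1 if time.monotonic() > deadline else 0

    # Check the deadline every 1000 SQLite virtual-machine instructions.
    conn.set_progress_handler(over_budget, 1000)
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.OperationalError as exc:
        if time.monotonic() > deadline:
            raise QueryTimeout(f"cancelled after {max_seconds}s") from exc
        raise  # an unrelated SQL error, not a timeout
    finally:
        conn.set_progress_handler(None, 1)  # remove the handler


# A deliberately expensive query: count half a billion generated rows.
SLOW_QUERY = """
WITH RECURSIVE n(i) AS (
    SELECT 1 UNION ALL SELECT i + 1 FROM n WHERE i < 500000000
)
SELECT count(*) FROM n
"""
```

After a cancellation the connection stays usable, so one runaway query does not take down the queries that follow it.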

We know you rely on us to keep your apps — and your businesses — running smoothly. Our Platform engineering team, the largest team at the company, is fully focused on scalability, reliability, and performance. Every improvement they deliver benefits all our users automatically — no effort needed on your end.

Finally, as a reminder, for mission-critical use-cases, I encourage talking to our Sales team about our Enterprise offerings, especially dedicated hosting, by reaching out to sales@bubble.io or submitting a request here.

Thanks for your patience as we work through this.

—Josh

26 Likes

Thanks for the update @josh! One thing I admire about Bubble is the transparency that’s given to users like myself :raising_hands:

I think one place for improvement would be communication during platform outages or degradation. The messages on the status page are pretty vague right now. For incidents like the unpkg outage, if updates were communicated with steps app developers could take to resolve the problem on their own, it would probably benefit everyone.

9 Likes

agreed. nothing is hidden and no trust is lost. thanks josh

2 Likes

TLDR; :folded_hands:

2 Likes

Thank you, Josh, for all of your and the engineering team’s efforts to improve reliability. Wouldn’t localising shared clusters in various regions help improve reliability? Also, maybe you should prioritise dedicated hosting at a reasonable price point too. Many of us would embrace that.

2 Likes

^ :folded_hands::raising_hands:

Please create some sort of alert and email when processes fail or are cancelled. A couple weeks ago I spent many hours trying to figure out why an action didn’t complete as expected only to finally discover that the operation itself timed out.

Similarly today, I didn’t know there was a Bubble outage and spent a lot of time trying to find the problem, and then a bunch of time crafting a bug report, only to finally realize that the flakiness I was seeing was due to Bubble itself.

Thankfully this was on DEV. But on Live, this could be disastrous.

4 Likes

Not sure how feasible this is, but it would be great to add some actions/conditionals that allow triggers based on Bubble server performance, etc.

As a crude example: “if server-outage is yes then (navigate to page X)”. At least then we could direct users to a polite notice page or something.

(Edit: I guess the catch is that it wouldn’t even get that far unless this check worked at a lower level, so maybe an option in settings for “page to redirect to when server is down…” would be better.)

3 Likes

Agreed – we’re in the middle of switching incident response tools (from PagerDuty to incident.io), in part because it’ll give us a more integrated workflow for real-time updates in emergencies. That will make it easier to build a process around this that doesn’t distract too much from the engineer actively working on fixing the problem.

Also agreed. We’re in the midst of an exploratory project to see if we can migrate our entire Logs / monitoring stack into a Bubble app, which would allow us to iterate much faster on those features without slowing down the other engineering work we’re doing.

Definitely on our long-term roadmap to have user-defined error pages (hard to get too much more granular than “here’s a page to show when your app is down”, for the reason you mention). Probably not coming in the near-term, though.

11 Likes

You can use the Bubble status page API to implement this already :slight_smile: NQU Secure shows alerts when Bubble is reporting an issue, so people know that weird things might happen.
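As a rough sketch of that idea: the code below assumes the status page serves a Statuspage-style JSON summary containing `status.indicator` and `status.description`. That payload shape, the placeholder endpoint URL, and the names `banner_text` and `fetch_status` are assumptions for illustration, so check what the status page actually returns before relying on any of it.

```python
import json
import urllib.request
from typing import Optional

# Assumed indicator value meaning "everything is fine" in a Statuspage-style summary.
HEALTHY = "none"


def banner_text(payload: dict) -> Optional[str]:
    """Return a warning to show users, or None when the platform reports healthy."""
    status = payload.get("status") or {}
    if status.get("indicator") == HEALTHY:
        return None
    # A missing or unrecognised payload is treated as "unknown", not as healthy.
    description = status.get("description") or "status unknown"
    return f"The platform is reporting issues ({description}). Some actions may fail."


def fetch_status(url: str, timeout: float = 3.0) -> dict:
    """Fetch and decode the status summary; `url` is whatever endpoint the page exposes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical usage (placeholder URL, not a real endpoint):
# message = banner_text(fetch_status("https://status.example.com/api/v2/status.json"))
# if message:
#     print(message)
```

Keeping the parsing separate from the network call makes it easy to decide how to handle an unreachable status page, for example by showing nothing rather than a false alarm.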

6 Likes

@josh why would you leave critical database tech debt unfinished and start work on mobile native and AI?

I think it’s important to note that there are different teams focusing on some of these areas. Balance is important, but there’s nothing wrong with working on future features while addressing present issues and updates.

3 Likes

Different projects require different subject matter experts. Allocation of appropriate resources to meet different organizational goals is important for long term stability and growth.

I lead my organization’s planning and development department. I won’t tell my project team to work on marketing when I have a marketing team, and I won’t task either team with working on my digitalization projects.

On that topic, broad criticisms like this have been popping up very often recently…and they are very cringy to read sometimes. Especially from Bubblers I have come to respect.

I don’t agree with the prioritisation of some things either (cough… better tools for WU control…cough…native JavaScript support…cough), but Bubble has been doing a lot of work on stability and clearing tech debt. It just does not get the same buzz as mobile and AI. Sad but true in all dev work.

3 Likes

Good to see transparency on this and thanks for the informative update.

Emotions run high when there are production issues.

This transparency, which shows that you @josh are in control of the situation, calms everyone down. So cheers for that.

And you can get lots of “advice” on how you “should” run the business. :rofl:

4 Likes

Thanks for the update!

2 Likes

Keeping fingers crossed for the tech team!

Thanks for your update @josh. To us, as users, it’s very important to have that information.

Ooh, this would be cool to integrate into the Codeless Love Powerup extension.

2 Likes