[Ongoing Issue] Problem with Logs -> Capacity charts

Hi all,

Our capacity, workflows, and page view charts are currently down. We’re working to bring them back up, but we don’t have an ETA: we’re hitting an undocumented limit with a downstream service provider, and we’re working with their team to resolve it. I’m creating this forum thread to track the status of the issue and will post updates here as we have them.

Thanks for letting us know, Josh!

As long as AWS doesn’t go down, I can deal with some analytics not showing! :slight_smile:

Rate limit hit. Sounds like growing pains to me, as Bubble’s user base grows as a whole.

Thanks for investigating! (Constant capacity monitoring is actually semi-important to our app.)

Thanks for everyone’s patience. We have a path forward to get this back up and running again, but unfortunately:

  • It requires rebuilding parts of our code, so it will likely be at least a day or two until we’re back online
  • We’re going to lose the historical data: no way of porting it over to the new system

Really sorry for the disruption here

Hi Josh, thanks for the update.
Question: is this issue affecting other workflows?
I’m getting a bug notice in my logs about an API Connector workflow not running:

"Workflow error - Sorry, we ran into a temporary bug and can’t complete your request. We’ll fix it as soon as we can; please try again in a bit!"

Just trying to figure out if it’s just me breaking my own things or not!
Cheers~

Josh, can you please clarify what data has been lost? We ran a test of our platform (which took 2 weeks to plan) during the time Bubble had this outage. Is there a way we can get the lost data back? It is critical that we get that data: we have lost 1,000 man hours of test time (a 100 person event for an hour) if we cannot get it back.

Hello @chris1111, please contact support directly at support@bubble.io.

Hi @ryanbeaucher, that’s likely unrelated to this. That said, that error message generally means there’s a bug on our end, not your end, so please feel free to file a bug report and we can investigate.

@chris1111 – answered you in a DM. Short version: we’ll likely be able to recover historical data on a manual, one-off basis, and will consider doing it in special circumstances like these, but won’t do it for people generally.

Hi all, update on the situation. The good news is if you check your apps, you’ll see that the charts are working again! However, there are a few caveats:

  • We’re still confirming that the approach we took to fix it solves the underlying backend issue that caused the initial outage. We believe it does, but there’s some hopefully-small probability that we’ll need to disable the charts again in the next few days
  • You won’t see data prior to when the charts stopped working. As mentioned above, we do still have that data, but in a manual, hard-to-access form. If for some reason it’s extremely important for you to look at the historical data, please email support@bubble.io with an explanation for why you need access, and we’ll help out on a case-by-case basis.
  • We had to change the library we were using to render the charts. The new charts aren’t quite as feature-rich as the old ones were: they don’t currently support real-time updating, and you can’t hover over them to read specific datapoints. We wanted to get something out the door ASAP, so we went with this solution, but we’d be open to hearing feedback on the new charts and gradually improving them over time.

Again, sorry for all the issues here!

Given that the subscription model is capacity-based, I rely heavily on the real-time charts to see the effects of my coding on available resources.

Working on your site affects overall site capacity?!? That’s news to me. There’s so much to learn in bubbleland

Not coding specifically, but testing. Some user actions trigger many workflows, so it’s critical to see the impact on resources so you can decide whether the process is OK, needs to be rewritten, should be split up into smaller steps, or perhaps should have some items moved into backend workflows. Hitting resource limits is very nasty, as workflows can get terminated mid-cycle.

Is anyone seeing very different reporting than usual with the new charts?

Previously, with the old charts, our app was running at about 2-3% of capacity most of the time, with occasional spikes up into the 20-30% range at busy times.

Now, since the new charts are showing, it’s pretty much off the scale for the past 3 days!! (we haven’t had any increase in traffic or usage)

The other chart doesn’t show any period of max capacity for the same period

Are the new charts measuring the same thing as the old ones? Or are they not working correctly yet? Has anyone else seen similar reporting?

Mine is showing lower CPU spikes and higher workflows. Unfortunately, I have not kept more detailed records to see whether my workflows are somehow actually executing faster. Wouldn’t that be neat…

Interesting that you’re seeing such spikes. I cannot imagine why the charts/analytics would impact your app.

It seems that the detailed capacity section is up; can you look further into what is causing the spike? Perhaps a silly question, but have you added any heavy plugins or APIs recently?

No, we haven’t made any changes over the period shown, and there’s nothing out of the ordinary in our logs or scheduler.

I guess I’ll just have to keep an eye on it over the coming days…

@adamhholmes Mind opening a bug report with us and sharing which app it is? It’s possible there’s a bug with the new charts and there’s something wrong with the data they’re displaying, rather than an actual change to your app. It shouldn’t be showing > 100% CPU usage, especially if your app is running normally and not having an outage.

Hi there,

Yes, it’s happening to me as well.

Same issue.

What is interesting is that if I look at the 24-hour view, the CPU usage goes off the charts towards the right of the graph.

But if I look at just the last 6 hours, I don’t exceed 50%.