I have a serious rant to make on the capacity front.
Like many other threads have highlighted (list of threads at the end of this post), capacity errors have increased significantly in the last month or so. I am on a Professional plan with an extra unit of capacity added to it, and at most we have about 50 users online at a time. Nothing significant has changed on our application recently, apart from my adding a lot of optimisations because of the WU business and buying another unit of capacity (basically upping my capacity by 33%).
In spite of that, I am hitting capacity issues much worse nowadays. Today we were at max capacity for about 240 minutes (4 hours!!).
I have my conspiracy theories here.
Theory 1: The "calculation of WU" for each action is itself eating into our capacity, perhaps because it uses the CPU allocated to us and therefore counts as if we were consuming that capacity.
Theory 2: The definition of capacity has probably changed internally and is now based on WU limits rather than the earlier capacity definition.
Theory 3: Bubble has reduced capacity limits without telling us. I suspect this is a cheap tactic to nudge all of us from the capacity-based legacy plans onto the new WU-based pricing plans.
Theory 4: There is some bug that reduced capacity for our application instead of increasing it when we added a unit.
On top of this, both tabs under the Logs page, "App metrics" and "Server logs", are so heavily limited in functionality that we cannot debug anything with them.
I had a thread and an idea on the server-log limitations here (no filters, no download, an unusable logs screen, insufficient information, etc.).
Other than that, the App metrics tab has umpteen issues as well. For example:
There is no way to debug "other" and "undefined", and most of the time those are large contributors.
Chart building is really, really slow. Here's a video showing the loader on chart building running for as long as half an hour. I don't even know whether it would run longer, because in that much time I would invariably have closed my laptop, or the editor would have crashed, or some such thing (yes, the editor does crash that frequently nowadays).
Here's another video showing how app-metrics chart creation keeps loading and loading forever.
First, the pie charts load slowly. Then, if we click on any part of a pie chart, the editor navigates to that workflow, and when we come back to Logs the pie chart is gone and we have to repeat all the steps.
It is not clear what to do when capacity suddenly maxes out. Logs, Scheduler, etc. start responding very slowly too. And there is no clear way to see how many people are online right now, who those people are, or which workflows/queries/page-loads currently running are eating up the capacity. One just has to wait for users to get frustrated and leave the app so that it can start behaving normally again.
If a particular page has consumed capacity, then which user accessed that page? When was it accessed? Was it SEO crawling? Was it a DoS attack? How do we see its pattern of access with respect to time, volume, and IP?
Page loads are not even shown in the logs. What to do then?
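For illustration, this is the kind of triage that raw access logs would make trivial. Everything here is hypothetical: Bubble does not expose per-request logs today, so the rows below are an assumed export format (timestamp, IP, path, user agent), not real Bubble data. Grouping requests per IP per minute and bucketing by user agent would immediately separate an SEO crawl or DoS burst from normal users:

```python
from collections import Counter
from datetime import datetime

# Hypothetical access-log rows (timestamp, IP, path, user agent).
# Bubble does not offer such an export today; this assumes one existed.
rows = [
    ("2023-06-01T10:00:01", "66.249.66.1", "/product/1", "Googlebot/2.1"),
    ("2023-06-01T10:00:02", "66.249.66.1", "/product/2", "Googlebot/2.1"),
    ("2023-06-01T10:00:02", "198.51.100.7", "/product/1", "Mozilla/5.0"),
    ("2023-06-01T10:00:03", "66.249.66.1", "/product/3", "Googlebot/2.1"),
]

CRAWLER_TOKENS = ("bot", "crawl", "spider")

def classify(user_agent: str) -> str:
    # Crude crawler detection by user-agent keyword.
    ua = user_agent.lower()
    return "crawler" if any(t in ua for t in CRAWLER_TOKENS) else "human"

# Requests per (IP, minute): a high count from one IP points at a crawl or DoS.
per_ip_minute = Counter(
    (ip, datetime.fromisoformat(ts).strftime("%Y-%m-%d %H:%M"))
    for ts, ip, _, _ in rows
)
per_class = Counter(classify(ua) for _, _, _, ua in rows)

print(per_ip_minute.most_common(1))  # noisiest IP within a single minute
print(per_class)
```

Twenty lines of stdlib Python answer the "who hit this page, when, and was it a bot" question; the point is that Bubble has this data and we don't.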
The capacity business is very, very opaque. Since they are able to quantify capacity ("the app used 80% of capacity", etc.), they should be able to tell us numerically what caused more capacity to be consumed. Capacity usage suddenly jumping without anything real happening on the app is just absurd. Something has to go really wrong for capacity to go from 10% used to 90% used suddenly.
What happens when server capacity is hit while jobs are running? That is: what happens to front-end tasks, to backend API flows, to new requests, to old requests, etc.?
We don't even seem to have the ability to tell Bubble to run some requests at low priority if required. Our capacity utilisation is pretty low most of the time, so upgrading capacity for momentary spikes doesn't make sense. They need to give us ways to run workflows such that they don't affect capacity at high-traffic times.
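In the absence of a native low-priority queue, the only workaround is to stagger non-urgent triggers ourselves from outside the app. A minimal sketch of that idea, with the actual network call abstracted away (in practice `trigger_workflow` would POST to a Bubble Workflow API endpoint of the form `/api/1.1/wf/<workflow_name>` on your app's domain; the function and delay value here are illustrative assumptions, not a Bubble feature):

```python
import time

def dispatch_low_priority(items, trigger_workflow, delay_s=2.0):
    """Fire non-urgent workflow triggers one at a time with a fixed gap,
    instead of all at once, to flatten momentary capacity spikes."""
    for item in items:
        trigger_workflow(item)  # e.g. POST /api/1.1/wf/<workflow_name>
        time.sleep(delay_s)

# Usage sketch: collect the calls locally instead of hitting the network.
sent = []
dispatch_low_priority(["a", "b", "c"], sent.append, delay_s=0.01)
print(sent)  # ['a', 'b', 'c']
```

This only spreads load in time; it does nothing about priority, which is exactly why a server-side knob from Bubble is needed.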
I don't know what to do when some workflows fail due to a capacity issue. How do I easily identify those and have them run again? Is there a way in backend workflows to catch them and retry?
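The closest workaround I can think of is a "mark and sweep" pattern: have each job set a flag on its record as its last step, then periodically re-trigger anything that never got marked (e.g. because the workflow died when capacity maxed out). The sketch below assumes that pattern; the `processed` field, the record shape, and the fetch/trigger callables are all hypothetical stand-ins for calls to Bubble's Data API (`/api/1.1/obj/...`) and Workflow API (`/api/1.1/wf/...`):

```python
def retry_unprocessed(fetch_jobs, trigger_workflow):
    """Re-trigger every job whose record was never marked processed,
    e.g. because its workflow died mid-run when capacity maxed out."""
    retried = []
    for job in fetch_jobs():                 # e.g. GET /api/1.1/obj/job
        if not job.get("processed"):         # flag set as the job's last step
            trigger_workflow(job["_id"])     # e.g. POST /api/1.1/wf/process_job
            retried.append(job["_id"])
    return retried

# Usage sketch with in-memory stand-ins for the two API calls:
jobs = [{"_id": "1", "processed": True}, {"_id": "2", "processed": False}]
print(retry_unprocessed(lambda: jobs, lambda job_id: None))  # ['2']
```

This only works if the flag update is genuinely the workflow's final action; a job that fails after setting the flag would be missed, which is why first-class retry support from Bubble would be far better.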
We can't find out how many times a given query ran and how much capacity it consumed. The chart says a workflow took up capacity, but not whether that is because the workflow ran too many times or because it is a heavy one. Bubble needs to give us that information before we can make any optimisation plan.
The "Scheduled workflows" tab itself stops responding when capacity is maxed out.
We can't find out whether a particular user ran some heavy queries/workflows that caused a capacity spike.
For those who are going to tell me to file a bug report: please don't. I have filed a lot of bugs and am tired. I probably have the highest number of open bugs with Bubble right now.
I suppose some will tell me I should switch to the new pricing plan to avoid capacity issues. But I can't. My WU usage is quite high at the moment (Bubble is largely to blame, as many things can't be done optimally within the current Bubble framework). I will keep optimising, and that is exactly what Bubble gave the 18-month window for. Should those of us who use this time to optimise be getting second-class treatment? It is a different matter that I am optimising so that once the 18-month window is over, I will have bought myself some more time to switch to traditional code. This Bubble business can't be long-term anymore.
Here's a list of threads on this topic from other users, so it is definitely not just some hallucination I am having out of frustration with the pricing update: