Transparency in WU Charges: Let’s Unite for More Clarity!

some other tests

created data structure
thing > linked thing

if any field from linked thing is displayed on the page then the whole of linked thing is downloaded to the client - not just the fields that are used on the page

ie things > linked thing’s name

will download all of linked thing’s data fields

interestingly the “deleted” field only shows for the records that existed when that field existed and it is not returned for new records created after that field was deleted

doing a nested search on the page returns all the data from all the nested levels

ie

thing > related thing > other things

will return all data from all 3 levels for every item returned in the search at each level. it does not see through the data and return only the relevant part, ie the end result of other things

instead it does a search for each level’s data type and returns 3 search results to the page, including all the fields for each data type. it could do those searches server side and return just 1 list to the page, since the client only wants the “other” things and has no interaction with the other 2 levels
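A toy Python model of the difference described above. The records, field names and both functions are made up for illustration; this is not how Bubble is implemented, only a sketch of "return every level" versus "return just the last level".

```python
# Toy data: thing -> related thing -> other things (all names hypothetical).
THINGS = {"t1": {"id": "t1", "name": "Thing 1", "notes": "x" * 50, "related": "r1"}}
RELATED = {"r1": {"id": "r1", "label": "Related 1", "blob": "y" * 50, "others": ["o1", "o2"]}}
OTHERS = {
    "o1": {"id": "o1", "title": "Other 1"},
    "o2": {"id": "o2", "title": "Other 2"},
}


def nested_search_all_levels(thing_id):
    """What the dev tools appear to show: one result set per level, all fields."""
    thing = THINGS[thing_id]
    related = RELATED[thing["related"]]
    others = [OTHERS[o] for o in related["others"]]
    return {"things": [thing], "related": [related], "others": others}


def nested_search_last_level_only(thing_id):
    """The alternative: walk the chain server side, send only the final list."""
    related = RELATED[THINGS[thing_id]["related"]]
    return {"others": [OTHERS[o] for o in related["others"]]}


def payload_size(payload):
    """Rough size proxy: total characters across every returned field value."""
    return sum(len(str(v)) for rows in payload.values() for row in rows for v in row.values())


full = nested_search_all_levels("t1")
lean = nested_search_last_level_only("t1")
print(payload_size(full), payload_size(lean))
```

The gap grows with the number and weight of fields on the two intermediate types, since the lean version never sends them.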

no wonder bubble apps are so slow to load lol


I did another test where I siloed data on the main object

ie data (pages data thing)

which has these linked datas to hold further fields

data_properties
data_counts

then I added the result of these linked datas into a text on the page and hid that text until a button click revealed it

the result was that on page load no server searches were done (because pages thing is stored client side - at least bubble docs got that right lol)

then when I click each button a data lookup is performed for that linked data’s fields. so by segmenting the fields into silos I effectively branch out the server requests and reduce the initial data sent to the page on page load.

the downside here is that I effectively do 3 lookups to fetch all the fields
the upside is that the initial page load is much faster and repeating groups are much faster, since only minimal fields are returned for the “searched data” and the other fields are only returned when viewing the selected item


so it seems to me that the optimal way to set up the database structure is a hub and spoke method?

lightweight searchable data linked to one or several heavyweight detail datas

ie
contact
contact_detail
contact_preferences
contact_interests
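A minimal Python sketch of that hub and spoke idea, under the assumption that list views only touch the hub and a spoke is fetched when a single item is opened. The class names, fields and the `fetch_count` counter are hypothetical and only there to make the trade-off visible; none of this is Bubble's API.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Hub: light, searchable fields plus ids pointing at the spokes."""
    id: str
    name: str
    detail_id: str
    preferences_id: str


@dataclass
class ContactDetail:
    """Spoke: heavy fields that are only needed when one contact is opened."""
    id: str
    bio: str


# Stand-in for the database; fetch_count tracks how many spoke lookups ran.
DETAILS = {"d1": ContactDetail("d1", "long bio " * 100)}
fetch_count = 0


def list_contacts(contacts):
    """List view: uses hub fields only, so no spoke is fetched."""
    return [c.name for c in contacts]


def open_contact(contact):
    """Detail view: one extra lookup, paid only for the selected contact."""
    global fetch_count
    fetch_count += 1
    return DETAILS[contact.detail_id]


contacts = [Contact("c1", "Ada", "d1", "p1")]
names = list_contacts(contacts)
detail = open_contact(contacts[0])
```

The cost is the extra lookup per spoke; the gain is that the list never carries the heavy `bio` field.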


This cannot be possible… If what @georgecollier and @mitchbaylis are saying is indeed true, we are facing a very serious situation. I want to believe there has been some misunderstanding on their part, because I’ve never seen anything like this—a platform where the official documentation and support recommend practices that, instead of optimizing, end up harming performance and increasing costs.

Given these facts, which I sincerely hope are a misunderstanding, it’s hard to accept that the official documentation recommends “best practices” that, in reality, worsen performance and elevate the consumption of Work Units (WUs). Trust in Bubble’s documentation and support is now seriously compromised. Even more concerning is that these “errors” always seem to lead to increased WU consumption and, consequently, higher costs for users—never the opposite. This raises doubts: is Bubble turning a blind eye to these issues and focusing solely on profit instead of the reliability and quality of the systems being developed?

If we cannot trust the official guidelines, whom should we rely on to develop safely and efficiently?

If this is not a misunderstanding, I ask Bubble’s support (@josh , @emmanuel , @petter , @steven.harrington ):

  • What urgent measures will be taken to correct the official documentation and protect the community from misguided guidance that increases costs?
  • Will the support team be trained to understand and accurately explain Bubble’s internal workings, thereby preventing incorrect information from being passed on to users?
  • When can we expect transparent and reliable documentation that thoroughly and precisely covers the real impact of each practice and WU consumption?

The idea that we, the users, need to “discover” these flaws in the documentation and correct our practices on our own is absurd. It’s hard not to think that Bubble is prioritizing profits over ensuring the reliability and quality of the applications created by its community. The community invests time, money, and trust in Bubble, expecting the provided information to be correct and to help us build applications efficiently. If Bubble wishes to maintain the community’s trust, it’s essential to address this situation promptly and restore the credibility of its support and documentation.

Yes, look up “satellite” data types. This is well-known.


There was a big topic about this a couple weeks ago. The only fix is to bring it up to support so they can delete it on their end. Really huge security flaw I agree.

Thank you, @mitchbaylis , for conducting these tests and sharing the results. It’s essential that the Bubble team revises the documentation to include these details, ensuring users can optimize their applications with confidence. I also emphasize the urgent need for more education in optimization and security, so the community can develop with clarity and assurance.


It’s frustrating that Bubble still hasn’t provided an adequate solution for the issue of residual data in deleted fields. If security were truly a priority, the least they could do would be to add a warning in the editor, alerting developers to remove associated data before deleting a field. This simple step would prevent sensitive data from remaining accessible, enhancing application security and building trust in the platform. But then again, perhaps it’s more interesting to keep support busy with manual deletions… after all, more fields returned mean more WUs consumed – and more money in Bubble’s pocket.


They won’t do anything about it until there’s a massive leak of a big app with a Youtube video breakdown with millions of views.

Thanks for the message and info @mitchbaylis , super interesting and helpful to know.

Petter Amlie talks about this in his book “The Ultimate Guide to Bubble Performance”. They are called satellite data types, and he has also done at least one long YouTube video with JJ Englert from No Code Alliance that goes over the topic.


there is a problem with satellite data types though given how bubble processes the data.

bubble documentation tells us not to do nested searches/filters

so setting up a data structure like
contact
contact_detail
contact_preference

is nonsensical because it forces the user to apply either an advanced filter or a reciprocal data lookup, or duplication of key data onto multiple data levels to get performant searches.

ie
search contacts (where phone is empty) filtered to contact_details bio is empty

you could do that several ways:

  1. store the bio on the contact data - but bio is potentially a heavy field, and the user may search empty bios very infrequently
  2. create a boolean on the contact for empty bio - then requires a bulk update and a workflow to update this second field
  3. search the contact_details for empty bios and return a list of contact_detail>contacts and then apply the filters on the contact data level to filter the contacts for phone
  4. search the contacts filtered by phone, then apply adv filter to filter contact_detail for bio empty
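To make the last two approaches concrete, here is a small Python sketch over in-memory lists. The sample records and both function names are invented for illustration, and plain list comprehensions stand in for Bubble searches and advanced filters, so this shows the shape of the logic only, not the WU cost.

```python
# Hypothetical sample data: each contact points at one contact_detail.
CONTACTS = [
    {"id": "c1", "phone": "", "detail": "d1"},
    {"id": "c2", "phone": "555", "detail": "d2"},
    {"id": "c3", "phone": "", "detail": "d3"},
]
DETAILS = {
    "d1": {"id": "d1", "contact": "c1", "bio": ""},
    "d2": {"id": "d2", "contact": "c2", "bio": ""},
    "d3": {"id": "d3", "contact": "c3", "bio": "hello"},
}


def via_details_first():
    """Search details for empty bio, hop to their contacts, then filter on phone."""
    contact_ids = {d["contact"] for d in DETAILS.values() if d["bio"] == ""}
    return [c["id"] for c in CONTACTS if c["id"] in contact_ids and c["phone"] == ""]


def via_contacts_first():
    """Search contacts with empty phone, then filter each on its detail's bio."""
    phoneless = [c for c in CONTACTS if c["phone"] == ""]
    return [c["id"] for c in phoneless if DETAILS[c["detail"]]["bio"] == ""]


print(via_details_first(), via_contacts_first())
```

Both routes give the same contacts; what differs is which list gets scanned first and how many records have to be loaded before the second filter can run.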

so satellite data types have their own issues in that they complicate searches considerably and potentially cost a lot more in workload usage. on top of that, all the data for all the satellites is returned, not just the data used in the logic.

whilst I use satellites in my data quite a lot, my understanding up until yesterday was that they would be “look-through” data, in that I could add references to nested data and only the called field would be returned, but that is not correct.

bubble returns all the fields for all the nested levels, so if you do something like

search of contacts with empty phones, filtered to detail has empty bio and where the list of social_accounts contains facebook then bubble will return the contact, the detail and all the social accounts along with all the fields on all of those data types… making satellite data actually more intensive on WU costs than a simple flat data structure anytime that you do a nested search like this.

not only does a satellite data structure force you to do a lot of filters on large lists but it also forces you to filter at multiple levels - further increasing the WU cost.

in short there is no one way that is the clear or correct way to set up the data structures to optimize for WU costs. instead there’s a list of options and the end user (often not a developer) is expected to be able to pick the best one for their particular situation. And more importantly, expected to be able to see far enough into the future to foresee all possible interactions of all fields they are organizing into their database structures. Otherwise they are locked into moving the fields later and running expensive bulk updates, or doing expensive data searches.

It seems there are several really critical and fundamental issues with bubble data structure, searches and WU costs and optimization methods and no real solution or best practice for any of it.


You’re overthinking this hard. Bubble recommends avoiding deeply nested searches, meaning you shouldn’t go more than 1 or 2 levels down, which should fit the overwhelming majority of use cases using satellite data types.

It’s okay to go Thing’s Child Thing, it’s not gonna blow up your app.

The documentation is basically there to avoid Thing A’s B’s C’s D’s E.

You are also misunderstanding the “look-through” aspect. If Thing A is connected to Thing B which is a massive data type, you don’t download Thing B just by searching for Thing A. That’s the point. Thing A just contains a lookup id to Thing B. You only incur the cost of “the entire Thing B” when you actually need it.


On a satellite data type you can have a field of type text where you store, as a single text, a comma-separated list of the values you want from the other data types

For example, Name, address, height, related data value…this way you can have a satellite data type that only returns the values necessary.

Of course this requires use of some workload units to keep things in sync while they change, but that is likely less than the cost of the searches of multiple data types.
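A rough Python sketch of that summary-text idea, assuming the sync happens in the same step as the change to the source record. `SUMMARY_FIELDS`, `build_summary` and `update_source` are hypothetical names; in Bubble the equivalent would be a workflow step that rewrites the text field.

```python
# Which source fields get copied into the single summary text (an assumption).
SUMMARY_FIELDS = ("name", "address", "height")


def build_summary(source):
    """Join the chosen values into the single text stored on the satellite."""
    return ", ".join(str(source[f]) for f in SUMMARY_FIELDS)


def update_source(source, satellite, **changes):
    """Apply a change to the source and re-sync the satellite in the same step."""
    source.update(changes)
    satellite["summary"] = build_summary(source)


person = {"name": "Ada", "address": "1 Main St", "height": 170, "notes": "heavy field"}
satellite = {"summary": build_summary(person)}
update_source(person, satellite, address="2 Side Rd")
print(satellite["summary"])
```

One caveat with a comma-separated text: a value that itself contains a comma (many addresses do) makes the text ambiguous to split, so a different separator may be safer.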

I understand the comments above but maybe you can help clarify this with an actual use case.

I have an event
on that event I put a list of acts that happen at that event
for each act there are a list of times that the act happens
for each time there are a list of positions of people that perform that act at that time at that event.

Now if I wanted to get a list of all the positions at the event that’s easy enough - I just store the event on the position and do a search.
But if I want to get a list of all the positions at the event where the act has been accepted then I need to store the event act approved on the positions - ok fine.
But if I want to get a list of all the positions at the event, where act is approved, and time is also approved - now I have to store the time approved on positions.

Now, user comes back and says I want to see all the positions that are circus acts.
so now I have to store the event act type on the position. Well actually the event act itself is a unique pairing of a specific act at a specific event (since it needs approvals and other fields per event). So I need to get the event act’s act type and store it on the position.

But now the user wants to filter times and event acts by act type as well… so now I have to store the type on position, event act, time.

AND I have to add workflows to keep all these extra fields updated whenever the main source’s field changes just so I can do performant 1-2 level searches.

OR I do a nested lookup
search for positions constrained to event, where event act is in search of event acts constrained to approved and where act on event act is in search of acts constrained to type = circus

3 searches with hundreds of datas returned in each and then a heavy “is in this list and this list and this list” filter.
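Written out as a Python sketch, the nested lookup looks roughly like this. The sample rows and the `circus_positions` function are invented for illustration, and set membership checks stand in for the "is in search of" constraints; this is a model of the logic, not of how Bubble executes it.

```python
# Hypothetical sample data for the event > event act > time > position setup.
ACTS = [{"id": "a1", "type": "circus"}, {"id": "a2", "type": "music"}]
EVENT_ACTS = [
    {"id": "ea1", "event": "e1", "act": "a1", "approved": True},
    {"id": "ea2", "event": "e1", "act": "a2", "approved": True},
    {"id": "ea3", "event": "e1", "act": "a1", "approved": False},
]
POSITIONS = [
    {"id": "p1", "event": "e1", "event_act": "ea1"},
    {"id": "p2", "event": "e1", "event_act": "ea2"},
    {"id": "p3", "event": "e1", "event_act": "ea3"},
    {"id": "p4", "event": "e2", "event_act": "ea1"},
]


def circus_positions(event_id):
    """Three searches, innermost first, each feeding the next as an 'is in' list."""
    # Search 1: acts constrained to type = circus.
    circus_acts = {a["id"] for a in ACTS if a["type"] == "circus"}
    # Search 2: event acts constrained to approved and act in search 1.
    approved = {ea["id"] for ea in EVENT_ACTS if ea["approved"] and ea["act"] in circus_acts}
    # Search 3: positions constrained to the event and event act in search 2.
    return [p["id"] for p in POSITIONS if p["event"] == event_id and p["event_act"] in approved]


print(circus_positions("e1"))
```

Each intermediate set has to be built in full before the next search can use it, which is where the "hundreds of datas returned in each" cost comes from.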

In the chrome dev tools bubble returns to the client:
list of positions
list of associated times
list of the times’ associated event acts
list of the event acts’ associated acts
and all associated fields for all 4 types.

my point here is that it doesn’t matter which way you build the database in bubble you are either tasked with:

  1. creating duplicate fields on satellite datas to facilitate performant searches and then trying to maintain them
  2. doing heavily nested searches to access data on linked records and returning all the non-relevant data for each step in that linked chain (all fields are returned to client not just requested fields, I found this out yesterday when checking the received data in dev tools).

I originally shared a simple example to try and explain the point clearly - but I can see how that example looks like overcomplication. The example here is a real example and one of the simpler examples for that applications database structure.

I understand the idea of thing A having a lookup of thing B - it just holds the id of thing B until you need it.
the issue is that when you request thing B it downloads the whole of thing B. And if you do thing A > thing B > thing C > thing D then the whole of ABCD is downloaded even though you might not be using anything other than the nested id field (1 field each) on thing B and C.

Yes, do this. It sucks and it’s not really “correct” from a general computer science perspective but it’s the only way. Tradeoffs.


Looks like Bubble is just waiting for that viral video of a major data leak before finally taking action, right?
If security isn’t a real priority, maybe Bubble should stop promoting that it “helps users build better apps.”

If Petter Amlie has already covered these concepts in his book “The Ultimate Guide to Bubble Performance” and even created videos with JJ Englert on satellite data types, why doesn’t Bubble incorporate this content to better educate its users? At the very least, they could add these concepts to the official documentation to help the community optimize their applications.

But maybe the explanation is simple: the more users understand optimization, the fewer WUs they consume – and the less profit Bubble makes. The formula is clear – less optimization means more WUs; more WUs mean more revenue for Bubble. But what about the financial health of the businesses that rely on Bubble? It seems like the commitment to education and quality takes a back seat when profit is at stake.


Yes, it does create perverse incentives.

Under the old system they were heavily incentivized to optimize the behind the scenes stuff but not anymore.


Would you like to alpha test this then, which will be the best one-off app audit tool that exists for Bubble whilst also being free?

Context aware redirect detection…

And context aware privacy rules (i.e don’t flag stuff that’s meant to be public)…

I wonder what cool things that ‘fix issue’ button might do :grin:


Exactly! It really seems like Bubble isn’t following the forum, or at least isn’t giving proper attention to such an important topic.

Given that Bubble doesn’t seem in a hurry to educate users correctly on how to optimize their applications, it’s hard to understand why no support team member has engaged with this post yet. Could it be that the financial incentive of the Work Units model is actually outweighing the interest in platform improvements and optimization for users?