Sorry, Need to Revisit: BEST Way to Retrieve Data (Speed Concerns)

Ok, so I have been using Bubble for quite some time now, and what I am seeing right now has turned most of what I’ve learned about Bubble’s data retrieval speed on its head.

From different threads, I’ve come to understand that retrieving data through related data types is faster than searching the data type alone.

For example, I have ‘Analytics’ as a data type. It has the fields share date and ID (a text value representing the thing from another data type that it belongs to), plus fields for Type and Sub Type, which are option sets.

The ID of the analytics is the ID of the Job listing, so I can match the analytic to the listing it belongs to. I set it up like this because I will use Analytics for all my listings (blog post listing, retailer listing, event listing, etc.), so I needed a catch-all way of making a connection.

Screen Shot 2020-06-28 at 1.56.26 PM

I have a job listing data type. It has a data field which is a list of ‘Analytics’.

Screen Shot 2020-06-28 at 1.57.57 PM

My previous thought was that, on the Job Listings page, it would always be faster to use the ‘parent groups analytics list’ to retrieve the ‘Analytics’ data associated with that Job Listing than to search for ‘Analytics’ and filter by the Job Listing.

So my previous assumptions were that this search would be faster

I thought it would be faster than this search below because the search below is having to filter ‘all analytics’ while the search above is using only the analytics that are part of the field list of analytics on the ‘job listings’ data type.

When I test this using the chart element as well as a regular repeating group for visualizations of speed differences, the difference is dramatic.

The first search is the chart on the left, while the second search is on the right.

Notice from this gif that both charts are empty on page load…then after selecting the ID of the Job Posting the chart on the left loads almost instantly while the chart on the right takes 25 seconds longer than the one on the left to load the same data. The only difference is the chart on the right uses the list of analytics that is a field on the parent groups data type, while the chart on the left is filtering all analytics using the parent groups ID.

Here you can see the job listing has a field that is a list of ‘analytics’. Loading through this field is about 100 times slower.

I’ve seen a lot of people make the same claims I always made about the performance of data retrieval being faster when having a related list field on a data type rather than filtering a data type to find all that belong to another data type.

Is there a best way for a specific use case? When would it be better to relate the data types and when would it be better to filter one data type or is it always the latter over the former?


The problem is that the “grouped by” feature is not optimized.
Please check the network tab of your browser console to understand the behavior.

Let me know if you have any questions, please.

Have a good one!

I really don’t know how to make any sense of the info in the console.

So, does the ‘grouped by’ feature not being optimized imply that using the list of things as a field on a data type normally gives the best performance, but because ‘group by’ is not optimized, that does not hold when using the grouped by function?

Sorry. A quick question.
How many “Analytics” records do you have for a “Job-Listing” when you were testing it?

Analytics has 1,560 records…all have the same ID, however, they have about 9 different Sub Types.

I was planning on making more test data with different IDs to see how the performance varies.

Imagine that Thing A has a field with 50 Things B.
When you use filter, the system initializes all 50 Things B - and then filters these.
That’s why it works slower.
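A rough way to picture this is the toy Python model below. It is only a sketch and says nothing about Bubble's actual internals: the `things_b` table, the `index` dictionary, and both function names are made up for illustration. It counts how many records each approach has to load before it can return the matches.

```python
# Toy model only: this is NOT how Bubble stores or queries data.
from collections import defaultdict

# Hypothetical "Thing B" table: 1,000 records, 50 of which belong to Thing A "a1".
things_b = {
    f"b{i}": {
        "id": f"b{i}",
        "parent_id": "a1" if i < 50 else "other",
        "sub_type": f"t{i % 5}",
    }
    for i in range(1000)
}

# Thing A holds its 50 Things B as a list field.
thing_a = {"id": "a1", "list_of_b": [f"b{i}" for i in range(50)]}

# Hypothetical index standing in for whatever the database keeps on searched fields.
index = defaultdict(list)
for rec in things_b.values():
    index[(rec["parent_id"], rec["sub_type"])].append(rec["id"])


def via_list_then_filter(a, sub_type):
    """Load every record in the list field, then filter the loaded records."""
    loaded = [things_b[b_id] for b_id in a["list_of_b"]]
    matches = [r for r in loaded if r["sub_type"] == sub_type]
    return matches, len(loaded)


def via_constrained_search(parent_id, sub_type):
    """Look up matching ids in the index, then load only those records."""
    loaded = [things_b[b_id] for b_id in index[(parent_id, sub_type)]]
    return loaded, len(loaded)


list_matches, list_loaded = via_list_then_filter(thing_a, "t0")
search_matches, search_loaded = via_constrained_search("a1", "t0")
print(list_loaded, search_loaded)
```

Both calls return the same matches, but in this model the list-then-filter route loads all 50 records first, while the constrained search loads only the ones that match.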

Regarding the group by feature:
Sorry that I’ve confused you. At first, I thought you were talking about another case.

Sorry but this confuses me more right now. What I am experiencing with the :group by feature is that I have a Thing A with a field of 50 Things B.

When I use the :group by feature and I am calling ‘parent groups Thing A list of Things B’ this is working dramatically slower than if I just ‘do a search for Things B: filter by ID’ (where the ID is a text value equal to ‘Parent Group Thing A ID’)

This is why I am confused about the :group by feature. It seems to turn on its head the whole idea that ‘Parent Group Thing A list of Thing B’ is faster than filtering through all of the Thing B to locate those that have a field that somehow relates to Thing A.

Because from my understanding, having a list of Thing B as a field on Thing A means that when I do ‘Parent Group Thing A list of Thing B’ I am only asking the DB to retrieve the list of Thing B that is on the parent group Thing A and the DB would not need to query the entire database of Thing B.

For example, if I have two pieces of paper, one is List of all fruit and the second piece of paper has a list of all citrus fruit. I feel like when I utilize the parent groups Thing A list of Thing B, it is the same as saying look at the piece of paper with list of all citrus fruit. While if I were to ‘do a search for Thing B : filter by’ that would be like, look at piece of paper one with all fruit and find all the fruits that are citrus.

Aha. I see the confusion.

Have a look at this one, please (there is a video):

In short, when you do a search for things with constraints, the system doesn’t take each record and compare it to the constraints (for example, id = dropdown’s value).

Let me know if you have any questions, please.

Thanks for the reply and the links. The thread you linked helped me to make more sense of things and the different options available and when one would be better than the other.

A reply from @NigelG makes things a bit easier for me to understand:

If you can move something from a filter to a search constraint … do it. Anything you can do on the main search constraint to reduce the load coming into your filter is a good thing. It is “cheap” searching.

And another response from @vladlarin

Searching through 10M records is not the same as fetching 10M records.
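To make the two quotes above concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in database. It is an analogy under my own assumptions, not Bubble's engine: the `analytics` table, its columns, and the `job-1` / `sub-0` values are hypothetical, and the 1,560 rows with 9 sub types only mirror the test data described earlier in the thread.

```python
import sqlite3

# In-memory stand-in database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analytics (id INTEGER PRIMARY KEY, listing_id TEXT, sub_type TEXT)"
)
conn.execute("CREATE INDEX idx_listing_sub ON analytics (listing_id, sub_type)")

# 1,560 rows for one listing, spread over 9 sub types.
rows = [(i, "job-1", f"sub-{i % 9}") for i in range(1560)]
conn.executemany("INSERT INTO analytics VALUES (?, ?, ?)", rows)

# "Filter" style: fetch every row for the listing, then narrow it down afterwards.
fetched = conn.execute(
    "SELECT id, sub_type FROM analytics WHERE listing_id = ?", ("job-1",)
).fetchall()
filtered = [r for r in fetched if r[1] == "sub-0"]

# "Constraint" style: let the database narrow it down before anything is fetched.
constrained = conn.execute(
    "SELECT id, sub_type FROM analytics WHERE listing_id = ? AND sub_type = ?",
    ("job-1", "sub-0"),
).fetchall()

print(len(fetched), len(filtered), len(constrained))
```

Both routes end with the same rows, but the filter route pulls all 1,560 rows out of the database first, while the constrained query only ever returns the matching ones.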

Also, the Bubble manual indicates that for any record with a list of things, the list will max out at 10,000 entries, so the best approach is the one that I have seen faster results with so far.

I am searching for ‘Analytics’ and putting two constraints on the search: 1. the parent group’s job listing ID; 2. the analytics ‘sub type’. As suggested by Nigel, this would reduce the load coming in, since my constraints are on the search and not the filter.

What this looks like in the D.B. is one thing with no lists

Screen Shot 2020-06-30 at 3.08.06 PM

Which would mean that it would never max out at 10,000 entries as a list. So if, hypothetically, a single Job Listing got 20,000 Facebook shares, the DB would not ‘max out’ the way it would if I had instead created a list of ‘Analytics’ on the ‘Job Posting’ data type like below:

Screen Shot 2020-06-30 at 3.09.54 PM

I believe I am now on the correct path to do more searches of larger ‘lists’ for my dashboards.

For example, if I wanted to create a dashboard for a ‘Retailer’ to see all of their ‘Orders’, I shouldn’t put a ‘list of orders’ onto the ‘Retailer’ data type. Instead, I should create a ‘Joining Table’ called ‘Order Join’ with a field ‘Order’ and a field ‘Retailer’. Then, when I want to display all orders for a retailer, I can search the ‘Order Join’ and constrain that search by the ‘Retailer’ field.

So once I have my results in a RG of ‘Order Join’ I can click on it and refer to ‘This Order Join’s Order’ to pull up the actual order and its associated details.
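Here is how I picture that ‘Order Join’ setup, sketched with Python's built-in `sqlite3` as a stand-in. This is only my mental model, not how Bubble stores things: the table names, column names, sample rows, and the helper functions `order_joins_for` and `order_for` are all made up for illustration.

```python
import sqlite3

# In-memory stand-in database; all names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL);
    CREATE TABLE order_join (
        id INTEGER PRIMARY KEY,
        order_id TEXT,
        retailer_id TEXT
    );
    CREATE INDEX idx_join_retailer ON order_join (retailer_id);
    """
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("o1", 10.0), ("o2", 25.5), ("o3", 7.25)],
)
conn.executemany(
    "INSERT INTO order_join (order_id, retailer_id) VALUES (?, ?)",
    [("o1", "r1"), ("o2", "r1"), ("o3", "r2")],
)


def order_joins_for(retailer_id):
    """Search 'Order Join' constrained by the 'Retailer' field."""
    return conn.execute(
        "SELECT id, order_id FROM order_join WHERE retailer_id = ? ORDER BY id",
        (retailer_id,),
    ).fetchall()


def order_for(join_row):
    """Follow one join row to its order, like 'This Order Join's Order'."""
    return conn.execute(
        "SELECT id, total FROM orders WHERE id = ?", (join_row[1],)
    ).fetchone()


joins = order_joins_for("r1")
first_order = order_for(joins[0])
print(joins, first_order)
```

The retailer never carries a list of orders here, so there is no list field to outgrow. The order details are only loaded for the join row that gets clicked.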

Given that searching returns all of the related data, would you say it would be better not to actually relate the ‘Order’ field to an ‘Order’ data type, and instead just use a text field to reference the ‘Order’ data type’s ID?

In a workflow to display the ‘Order’ in a group of the data type ‘Order’, I’d set it up like below:

And if I wanted all of the Order’s associated details in the RG of ‘Order Join’, I could do something similar by performing a search for ‘Orders’ to display the associated details in each cell of the ‘Order Join’ RG.

Or would the fact that each RG is doing a search to populate each cell with the details of the order cause large lag in returning results?