Can Bubble’s system handle me uploading millions of rows of product data?
Can Bubble’s internal query system handle searching on such a large dataset?
If you’re ready to pay enough, yes.
So a user trying to filter on a Bubble-hosted DB is not going to have issues with slow response times and timeouts?
Why are people needing to use Algolia?
It should work okay without Algolia.
I had the same question before, so I decided to test it.
There are 1,868,420 records:
The following preview page shows the total and the first four records immediately:
Note: The app is on the free plan.
The most important thing is to structure the database in the right way.
For instance, alongside a Product table you should create an additional Product Details table that holds fields such as the full description, the list of images, etc.
The idea is to keep the main table light so that it doesn't preload/load slowly.
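A rough sketch of that light/heavy split, written in Python only to illustrate the shape of the data (Bubble itself is configured visually). The field names `name`, `price`, `full_description`, and `image_urls`, the `details_id` reference, and the `split_product` helper are all assumptions made for the example, not something taken from the test app.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Product:
    """Light record: only what a list or search page needs."""
    id: str
    name: str
    price: float
    details_id: str  # reference to the heavy record (hypothetical field)


@dataclass
class ProductDetails:
    """Heavy record: loaded only when a single product is opened."""
    id: str
    full_description: str
    image_urls: list[str] = field(default_factory=list)


def split_product(raw: dict) -> tuple[Product, ProductDetails]:
    """Split one imported row into a light record and a heavy record."""
    details = ProductDetails(
        id=f"{raw['id']}-details",
        full_description=raw.get("full_description", ""),
        image_urls=list(raw.get("image_urls", [])),
    )
    product = Product(
        id=raw["id"],
        name=raw["name"],
        price=raw["price"],
        details_id=details.id,
    )
    return product, details
```

Lists and searches would then touch only `Product`, and `ProductDetails` would be fetched by reference on the detail page.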
If you want to use Algolia, be ready for the following:
- Custom pagination
- More workflows for displaying data because you need to run it within a workflow’s action
- Handle things on the Algolia side
- Additional privacy configuration
Nice example and some good points raised on setup.
Interested in what the capacity logs look like when working with this amount of data, in terms of performance capacity, and whether it maxes out quickly when doing too many searches. Cheers
That is a good question!
I don’t have an answer for this. It might be a good idea to check it out.
I’ll let you know if I have an update on this.
Thanks - it would indeed be interesting to know the impact of this number of data entries.
The selected answer definitely provides useful info, but I think in general it gives the wrong impression. If you were to take that same database with 1 million+ URLs in it and try to search for the ones that start with “goog”, it would take a looooooooong time for “google.com” to pop up.
Someone correct me if I’m wrong, but native search in Bubble simply does not work well on datasets with more than 1 thousand to 4 thousand entries.
I can also confirm (here 5 months later). In fact, a dataset with 1 million entries will simply not return a result as it will time out.
To get “instant search” you have to create a sort of “shadow” index in a search engine such as Algolia, Typesense, or Elasticsearch, where each record includes the fields you want to search plus a reference to the Bubble ID of that record in your app. Then use the native Algolia integration (or roll your own, which is fairly easy; the same goes for Typesense or Elastic) to search against the shadow index. The only thing the search engine has to return is a list of Bubble IDs corresponding to the matched records. Your app then displays those records as their actual corresponding Bubble things in a repeating group.
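Here is a minimal sketch of that data flow in Python. `ShadowIndex` is a hypothetical in-memory stand-in, not the Algolia, Typesense, or Elasticsearch client, and its linear prefix scan only imitates what a real engine does with a proper index. The point is the contract: the index stores searchable text keyed by Bubble ID, returns IDs only, and the app resolves them to full records.

```python
class ShadowIndex:
    """In-memory stand-in for an external search index (hypothetical)."""

    def __init__(self):
        self._docs = {}  # bubble_id -> lowercased searchable text

    def upsert(self, bubble_id, text):
        """Add or overwrite the searchable text for one Bubble record."""
        self._docs[bubble_id] = text.lower()

    def search_prefix(self, query, limit=10):
        """Return only the Bubble IDs whose text starts with the query."""
        q = query.lower()
        hits = [bid for bid, text in self._docs.items() if text.startswith(q)]
        return sorted(hits)[:limit]


def load_things(app_records, bubble_ids):
    """Resolve IDs back to full app records, skipping any that no longer exist."""
    return [app_records[bid] for bid in bubble_ids if bid in app_records]
```

In use, the app would call `search_prefix("goog")` against the index and pass the returned IDs to something like `load_things` to fill the repeating group.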
Since my last post I’ve built custom integrations like this in Algolia and Typesense, and they work pretty well. The annoying part is ensuring the shadow index is always synchronized with your app’s data. Even so, backend workflows that update the shadow index any time a record changes work well most of the time (at least on a professional plan).
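The sync step can be sketched roughly as below, in Python, with a plain dict standing in for the external index. The event shape (`type`, `bubble_id`, `fields`) and the `url` field are assumptions for illustration, not what Bubble's backend workflows actually send. The `find_drift` check is an extra idea for the cases where an update gets missed, not part of the setup described above.

```python
def apply_change(index, event):
    """Mirror one record change into the shadow index (a plain dict here)."""
    bubble_id = event["bubble_id"]
    if event["type"] == "deleted":
        index.pop(bubble_id, None)  # ignore deletes for unknown IDs
    else:  # "created" or "modified"
        index[bubble_id] = event["fields"]["url"].lower()


def find_drift(index, app_records):
    """Return Bubble IDs whose index entry is missing, stale, or orphaned."""
    expected = {bid: rec["url"].lower() for bid, rec in app_records.items()}
    all_ids = expected.keys() | index.keys()
    return sorted(bid for bid in all_ids if expected.get(bid) != index.get(bid))
```

Running a check like `find_drift` on a schedule would show which records need re-indexing after a missed update.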
I think it depends on what kind of search – client or server side. I’ve found server side searching (stuff in the regular “Search for” box) to be very fast on my largest dataset (14,000 entries). But if it is client side (most filters and all advanced filters), then it bogs down.
Interesting. So does that good server side performance include searching for records by partial string match?
If you mean a “contains” operator in Bubble, I don’t use one in my app (I use “contains keywords”). But I just put together a simple “contains” search on my dataset, running it on a website URL field I have in each entry.
As a practice I always use a “submit” button workflow to copy an input’s value to a custom state, and have the RG filter based on the custom state. It’s a much more efficient use of capacity to have the entire search sent through at once, rather than Bubble having to keep filtering and updating an RG in real time as users type in terms. Under this custom state setup it returned results in about 1 second. When I just had the RG filter directly on the input field as I typed, results came back in about 2 seconds.
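To make the capacity argument concrete, here is a small Python simulation of the two setups. It assumes the live setup fires one search per keystroke, which is a simplification of how Bubble actually re-evaluates a repeating group. `SearchBox` and `count_searches` are made-up names for the example.

```python
class SearchBox:
    """Toy model of an input driving a repeating group's search."""

    def __init__(self, run_search, live=False):
        self.run_search = run_search
        self.live = live            # True: RG filters directly on the input
        self.input_value = ""
        self.submitted_term = ""    # plays the role of the custom state
        self.results = []

    def type_char(self, ch):
        self.input_value += ch
        if self.live:
            # Assumption: every keystroke triggers a fresh search.
            self.results = self.run_search(self.input_value)

    def submit(self):
        # Copy the input into the "custom state", then search once.
        self.submitted_term = self.input_value
        self.results = self.run_search(self.submitted_term)


def count_searches(text, live):
    """Count how many searches are fired while entering `text`."""
    calls = []

    def fake_search(term):
        calls.append(term)
        return []

    box = SearchBox(fake_search, live=live)
    for ch in text:
        box.type_char(ch)
    if not live:
        box.submit()
    return len(calls)
```

Under that assumption, typing the six-character term "google" fires six searches in the live setup and one with the submit button.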
That’s good information. Thanks for sharing.