Bubble DB Challenge - Searching By Tags

I’m currently doing some testing to determine the best way to achieve something in Bubble (something which is a very common requirement, and should be very simple but, due to some of the limitations of Bubble’s database query functionality, is way more challenging than it should be), and I’d be interested to hear how other Bubble devs have approached this in the past, or any creative ways of achieving it.


The Situation

There’s a Datatype of ‘Tag’ (with approx 100 Tags in the database)

There’s a Datatype of ‘Content’ (with 15,000 entries in the database)

Each Content item can have one or more Tags assigned to it


The Requirement

The requirement is simple:

A user should be able to search for Content items by Tag, by selecting one or more Tags from a multi select dropdown.

There are 2 required use-cases:

  1. Return ALL Content Items that contain 1 or more of the selected Tags
  2. Return ONLY Content Items that contain ALL of the selected Tags

In both cases, the Content Item’s Name AND the name of all associated Tags should be displayed on the page.


The Problem

This is an extremely common requirement in any kind of listing app (whether it’s blog posts, products, or any other kind of item that can have associated tags/categories etc.) and yet doing this in Bubble can be very difficult. I’ve seen a lot of apps that try do do this and end up being extremely slow and/or costly in WU.

Doing this in a regular SQL database is both very simple AND fast, and fully scalable (I’ll share some examples later), but because Bubble only provides very basic DB search functionality, I’ve never come across a truly satisfactory way to do this in Bubble.


The Solution

Over the years I’ve seen a number of discussions, on the forum, and elsewhere, about various ways to do this, and I’ve built this type of thing into multiple apps myself with various different approaches - but mostly on fairly small datasets (where the challenges are less noticeable).

If the datasets are small, then there’s not much issue - everything can be done client-side and there’s no real problem. But on a larger database things get more complicated.


So, I’m currently testing multiple approaches for this (about 8 different methods so far) and none of them are truly satisfactory (some are completely un-usable at scale - 2 are working very well, but have some limitations).


So, before concluding my testing and sharing my findings, I’d be interested to know how (if at all) other Bubble devs have approached this challenge (what should be a very simple thing to do).

The solution must:

  1. Be truly scalable (i.e. work just as well on a database of 100k Content Items as with 100)
  2. Be reasonably performant (i.e. not take 15 seconds+ to return anything)
  3. Be reasonably WU efficient
  4. Preferably allow the User to select any number of Tags to use in the search.
  5. Preferably also display the total count of matching Content items on the page
  6. Use the Bubble database

I look forward to seeing some solutions :slight_smile:

7 Likes

Easiest problem ever , Design database as content - text , content-tag (list of text) , filter by content-tag list is empty or intersection list intersect with content-tag list count >0 , to be optimized first in basic search content is in content_list ( state of list of texts that holds only content ) .

User’s to be able to see and select thus you need to use regex to categorize when user selects content1 and tag11 then content 2 , later changes his mind and selects content1 and tag12 using regex to put to relevant box .

Same proccess you construct content_list similar to this way.

Hence , 1) Db design
2) Regex design - consturcting the intersect list & content list
3) Search implementation

I have implemented this in my work and it is efficient , however Adam I respect your work on WU’s & optimization . Love to see your approach

2 Likes

That sounds like a lot of advanced filters and regex to be running on 15,000 items in the DB which you don’t want to have to download entirely :slight_smile:

EIGHT? So this is what you’ve been doing since you stopped posting those challenges :sweat_smile:

4 Likes

1 advanced filter after handling basic search first + regex on the clientside while selecting

1 Like

Nice challenge @adamhholmes :slight_smile: and a genuine real world Bubble problem.

And I’ve never found a satisfactory solution. Messing around with concatenated fields and regex is kind of absurd in the year 2024. Building indexes ugghhh.

One time a while back >200k items - I used Algolia. Today I think I would Supabase it. #curious to see what lands. Point 6 noted :slight_smile:

I can already see the smart guys replying :rofl:

4 Likes

I’m looking forward your benchmark results on contains keyword(s) :slight_smile:

2 Likes

My 10 minute solution for the second problem:


etc up to the max number of tags you want to support. The limitation is editor maintainability, and having a limit on the number of tags based on how many constraints you include.

For the user, in performance terms, this is equivalent to an expression like Tags contains all of (List of tags). Of course, ignore empty constraints will get rid of most constraints when they normally only select a handful of tags.

Would I do it in a production app? Maybe, I certainly wouldn’t tell anyone about it or I’d hide in a corner :sweat_smile: :

I think the first problem is much harder (all content that contains at least one selected tag).

Yeah, I’ve found this is abysmal on searches with lots of text (e.g blog posts) to the point of not being useable.

3 Likes

I was just thinking about the tags being a single text field, not running it on the whole blog post.

2 Likes

İs this actually work ? I have tried constructing the selected in every scenario possible but it actually doesn’t work as intended

İs this performant ? I have 73.600 tags in my case

1 Like

Ah of course, good shout. Hadn’t thought about that as I never read too much into how contains keyword(s) actually works.

I think it doesn’t work any of this type of scenarios

I’m sure you can see how it can be used on one use case once you understand how the operator works

1 Like

If only you could programmatically add constraints…

is Data API allowed? :wink:

Yeah… anything requiring the use of advanced filtering is inherently NOT scalable (as it requires the full data to be returned from the database before it’s filtered).

This is the best I’ve come up with so far - and also a very similar approach for the first problem…

Although they both fail on the requirement to allow ANY number of tags to be selected by the User (although in practice that’s probably not a huge limitation) - and, as you point out, could also be tricky to manage from a dev point of view in the editor) - but they both perform the search in a very fast (by Bubble standards), and very WU efficient way.

So far (in my testing) the best (possibly the only) way to do it.

I’ve seen this method mentioned before and, although I haven’t tested it yet, if done correctly (it’s easy to do this wrong) it should work just as well for returning items that contain Any tag in a performant and WU efficient way.

I don’t think there’s any way to use it for returning items containing ALL tags though?..

3 Likes

Data API will solve the maintainability issue. Make a data API query, use :format as text to add constraint for each tag, get the list of unique IDs from API connector and then Do a search for Content where unique ID is in the data API response’s list of unique IDs.

Slightly more WU by virtue of one more search but infinite tags and very maintainable.

Now to have a go at problem 1…

1 Like

Is this works if data element Product has tags and contents and when user selects content1 and nothing else returns all with content1 and if it selects content2 and tag2 , returns all products with content1 and with tag2 ?

Quite the opposite: contains keyword(s) will get you the posts with ALLthe tags. doesn't contains keyword(s) can give you the posts with ANY tag, assuming you know the list fo all the existing tags.

There big problem is that the operator is english language dependant and will take into account plurals and discard “stop” words (see my previous link to my example post) according to english language.

The second problem is that I expect this operator to consume more resources than a normal contains, but I never had a proper benchmark on it.

EDIT:
doesn't contains keyword(s) works only if every post has a fixed number of tags, so this will only solve case 1

1 Like

I’ll include both of these in my testing.