Prepping The Database For a Large Group Chat


I am a beginner Bubbler (about 1 month of experience now) and am working on an app that will have a group chat functionality. I would like it to be ready to handle groups of at least 50-100 users each and be ready to handle conversations with thousands of messages.

I’ve purchased some paid messaging templates (and checked out some free ones as well) to get some ideas on how to tackle this, and am noticing a pattern: the general approach to storing group chats in the data tab is to create “lists” of messages and then output them using a repeating group.

I’m just not completely sure if this approach is scalable, or meant for much smaller group chats.
Do you think this is the most scalable approach?
If not, how would you go about structuring the data for larger groups?

Thank you in advance!

I haven’t built a chat app… but I share your concern with the approach of storing a discussion’s messages in a list field in the discussion datatype.

I would do it the other way around… in the messages datatype, have a field that connects it with a specific discussion.
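To make the shape of that idea concrete outside of Bubble's editor, here is a minimal Python sketch. It is not Bubble code; the `Discussion` and `Message` classes, their field names, and the `messages_for` helper are all hypothetical and only illustrate a message carrying a single reference back to its discussion.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Discussion:
    id: int
    title: str  # note: no list of messages stored here


@dataclass
class Message:
    id: int
    discussion_id: int  # the single field that ties a message to its discussion
    author: str
    body: str
    created_at: datetime


def messages_for(discussion: Discussion, all_messages: list[Message]) -> list[Message]:
    """Return one discussion's messages, most recent first."""
    matching = [m for m in all_messages if m.discussion_id == discussion.id]
    return sorted(matching, key=lambda m: m.created_at, reverse=True)


# Hypothetical sample data for illustration only.
general = Discussion(id=1, title="General")
random_chat = Discussion(id=2, title="Random")
all_messages = [
    Message(1, 1, "amy", "hello", datetime(2024, 1, 1, 9, 0)),
    Message(2, 2, "bob", "off topic", datetime(2024, 1, 1, 9, 5)),
    Message(3, 1, "bob", "hi amy", datetime(2024, 1, 1, 9, 10)),
]
general_messages = messages_for(general, all_messages)
```

In this shape a discussion never grows as messages are added; finding its messages is a lookup on the message side.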


“List of Messageses” :rofl:

But, in all seriousness, if a Message is a Thing (a custom datatype) there’s no harm in a “Discussion” having a field that’s a List of type “Message” on it.

The converse (a Message has a List of type “Discussion” on it) is only useful if a Message might belong to multiple Discussions… which strikes me as unlikely.

Again, in all seriousness, Scalar Things should be named in the singular (e.g., the Thing is a “Message”).

If you then have a field of multiple “Messages” on it (that is, a List of Messages), it’s OK to call that field “Messages”, but it might be better to name that field “Message List”.


Consider what happens if we now construct a List of Discussions. If a Discussion has a field called “Messages”, we now see the List’s (of Discussions) Messages as a List of Discussion’s Messageses.

If, instead, a Discussion has a “Message List” on it, we will now see the List’s Messages as the List’s “Message Lists”… which is much more helpful.

Ah, but what about nouns that can be both singular and plural :slight_smile:

@keith is right, of course, about the list vs point to “parent”.

The point here is really not about data, but about the presentation of that data. Big lists are absolutely fine (but there is a limit of 10,000 - that is a long ol’ convo). No, really, absolutely fine.

What is not fine is what happens when you come to a list that long and want to display … “my conversation history on this discussion with Amy last week”. That gets really slow, and arguably it is easier to do it by having pointers back the other way.

But…at scale that is also pretty bad unless you can really narrow down the search at the database. No filters, and absolutely no advanced filters.

The question of “how big a list is too big” generates quite a lot of discussion. Some say 100 (and that is not a bad place); personally, I start to worry at around 500. It may depend on the “weight” of the data in the list.

So…as a way to scale this…

Recent Messages (list of messages) - remove items from list when count > 100
Older Messages (list of messages) - and add them here

It is likely that 99% of the time you will be displaying messages in your app most recent first. So structure your data to what you do 99% of the time. Not someone wanting to go back in time.

Then you can have a “show more” button at the bottom of the list, which may have a slight delay, but then people are probably aware that older stuff takes time.
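The Recent/Older split and the “show more” button can be sketched in plain Python. This is only an illustration of the bookkeeping, not Bubble workflow code; the `Discussion` class, its method names, and the `RECENT_LIMIT` constant are hypothetical, and messages are plain strings to keep it short.

```python
from dataclasses import dataclass, field

RECENT_LIMIT = 100  # the "count > 100" threshold from the post above


@dataclass
class Discussion:
    recent_messages: list[str] = field(default_factory=list)  # oldest first
    older_messages: list[str] = field(default_factory=list)   # oldest first

    def add_message(self, message: str) -> None:
        """Append a message, moving overflow from Recent to Older."""
        self.recent_messages.append(message)
        while len(self.recent_messages) > RECENT_LIMIT:
            oldest = self.recent_messages.pop(0)
            self.older_messages.append(oldest)

    def newest_first(self, include_older: bool = False) -> list[str]:
        """What the chat view shows; include_older models "show more"."""
        shown = self.recent_messages[::-1]
        if include_older:
            shown += self.older_messages[::-1]
        return shown


chat = Discussion()
for n in range(250):
    chat.add_message(f"m{n}")

default_view = chat.newest_first()
full_view = chat.newest_first(include_older=True)
```

The default view only ever touches the short list, so the common case stays cheap; the longer list is read only when someone asks for history.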


Yeah, I thought that was funny too when I saw it, but to cut the template devs some slack, they are based in Europe and I figure English isn’t their primary language.

Thank you for the naming convention info and tip! I’ll keep that in mind.

Thanks for your input! You put into words my suspicion.

Thank you for your input on how to make the “list of messages” approach scalable. What do you think about the idea of not using lists at all, and instead giving each message a group id and taking an “output all messages with this group id, sorted by most recent” approach? Does that seem less efficient to you?
(Also if you happen to have a youtube channel about bubble, send me the link and I will subscribe)

I suggest thinking about what kind of sorting/filtering you plan, and then making sure you understand how that functionality will work on a large set of messages.

With a list field, my understanding is that if you want to filter it, that is happening client side, meaning Bubble is downloading lots of records (in the background) to the browser and then filtering it there to find enough results to fill your repeating group. In that situation I would worry about discussions where thousands of messages are being downloaded to a user’s phone, which would be a slow and laggy experience.

If you linked the other way (each message is connected to a discussion, a many-to-one relationship), the filtering can happen server side via a regular “do a search for” (very fast) and it will only send to the browser the results.
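As a rough analogy for that server-side search, here is a small sketch using Python's built-in `sqlite3` module. SQLite only stands in for "the server" here; the thread does not describe how Bubble runs searches internally, and the `message` table, its columns, and the `latest_messages` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    "id INTEGER PRIMARY KEY, discussion_id INTEGER, body TEXT, created_at INTEGER)"
)

# Two discussions with 5,000 messages each (made-up sample data).
rows = [(d, f"msg {n} in discussion {d}", n) for d in (1, 2) for n in range(5000)]
conn.executemany(
    "INSERT INTO message (discussion_id, body, created_at) VALUES (?, ?, ?)", rows
)


def latest_messages(conn: sqlite3.Connection, discussion_id: int, page_size: int = 20) -> list[str]:
    """Filter, sort, and limit in the database; only one page is returned."""
    cur = conn.execute(
        "SELECT body FROM message "
        "WHERE discussion_id = ? ORDER BY created_at DESC LIMIT ?",
        (discussion_id, page_size),
    )
    return [body for (body,) in cur]


page = latest_messages(conn, discussion_id=1)
```

The caller receives 20 rows regardless of how long the discussion is, which is the property that matters for a phone on a slow connection.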

(PS: If you haven’t yet read this book for info on how Bubble works behind the scenes, I highly recommend it: The Ultimate Guide to Bubble Performance - the new edition is out (now 210 pages!))

Thank you for this info! Very very helpful, and I just purchased that book!

But you will be searching the WHOLE database each time. Which will just get slower and slower.

Not saying it is a bad idea (it is what I would do) but there are alternatives using controlled lists.

Yes, but I wonder where that line is in terms of when we’d see any identifiable slowdown from a purely server-side search. From what I’ve read, pure server-side searches are very fast/scalable, given server power, indexing, and other stuff databases do to handle huge amounts of data. But if it’s a client-side search, different story as we know… it will start bogging down noticeably with small data sets.
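On the “searching the WHOLE database” point: in a conventional SQL database, whether a lookup like this scans every row depends on indexing. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` to compare the plan before and after adding an index. It says nothing about how Bubble indexes its own data, which this thread does not establish; the table, index name, and `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    "id INTEGER PRIMARY KEY, discussion_id INTEGER, body TEXT, created_at INTEGER)"
)

QUERY = (
    "SELECT body FROM message "
    "WHERE discussion_id = ? ORDER BY created_at DESC LIMIT 20"
)


def plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's description of how it would run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


plan_without_index = plan(conn)

# A composite index covering both the filter and the sort order.
conn.execute(
    "CREATE INDEX idx_message_discussion_created "
    "ON message (discussion_id, created_at)"
)
plan_with_index = plan(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Without the index the plan describes a scan of the table; with it, the plan names the index, so the work no longer grows with every message ever written.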

If a list wasn’t too long and the user will be searching/sorting it – there’s a good argument to be made to go ahead and let Bubble download all the entries to the browser and then filter client side from there. It will be snappier. But if it’s a very long list, then I think it will be problematic for the reasons discussed.

Yeah, it is not simple. I do think that Bubble should implement “views” so that you can bring back subsets of data in lists to work on.