Prepping The Database For a Large Group Chat

marcusgonzales · January 26, 2022, 8:27pm

Hello,

I am a beginner Bubbler (about 1 month of experience now) and am working on an app that will have a group chat functionality. I would like it to be ready to handle groups of at least 50-100 users each and be ready to handle conversations with thousands of messages.

I’ve purchased some paid messaging templates (and checked out some free ones as well) to get some ideas on how to tackle this, and I’m noticing a pattern: the general approach to storing group chats in the data tab is to create “lists” of messages and then output them using a repeating group.

I’m just not completely sure if this approach is scalable, or meant for much smaller group chats.
Do you think this is the most scalable approach?
If not, how would you go about structuring the data for larger groups?

Thank you in advance!

ed727 · January 27, 2022, 1:29am

I haven’t built a chat app… but I share your concern with the approach of storing a discussion’s messages in a list field in the discussion datatype.

I would do it the other way around… in the messages datatype, have a field that connects it with a specific discussion.
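To make the two shapes concrete, here is a minimal sketch in Python. It is only an illustration of the data relationships, not Bubble code, and every name in it (`DiscussionWithList`, `message_list`, `discussion_id`, the helper functions) is made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


# Shape A: the Discussion owns a list field of message ids.
@dataclass
class DiscussionWithList:
    id: str
    message_list: List[str] = field(default_factory=list)


# Shape B: each Message carries one field pointing at its Discussion.
@dataclass
class Message:
    id: str
    body: str
    discussion_id: str


def messages_from_list(discussion, messages_by_id):
    """Shape A: walk the list field and look each message up."""
    return [messages_by_id[mid] for mid in discussion.message_list]


def messages_from_search(discussion_id, all_messages):
    """Shape B: search messages with a constraint on their discussion field."""
    return [m for m in all_messages if m.discussion_id == discussion_id]


all_messages = [
    Message("m1", "hello", "d1"),
    Message("m2", "hi there", "d1"),
    Message("m3", "other chat", "d2"),
]
messages_by_id = {m.id: m for m in all_messages}
d1 = DiscussionWithList("d1", ["m1", "m2"])
```

Both helpers return the same two messages for discussion `d1`; the difference is where the relationship is stored, which determines what has to be loaded before anything can be filtered.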

keith · January 27, 2022, 7:06am

“List of Messageses”

But, in all seriousness, if a Message is a Thing (a custom datatype) there’s no harm in a “Discussion” having a field that’s a List of type “Message” on it.

The converse (a Message has a List of type “Discussion” on it) is only useful if a Message might belong to multiple Discussions… which strikes me as unlikely.

keith · January 27, 2022, 7:12am

Again, in all seriousness, Scalar Things should be named in the singular (e.g., the Thing is a “Message”).

If you then have a field of multiple “Messages” on it (that is, a List of Messages), it’s OK to call that field “Messages”, but it might be better to name that field “Message List”.

keith · January 27, 2022, 7:44am

Consider what happens if we now construct a List of Discussions. If a Discussion has a field called “Messages”, we now see the List’s (of Discussions) Messages as a List of Discussion’s Messageses.

If, instead, a Discussion has a “Message List” on it, we will now see the List’s Messages as the List’s “Message Lists”… which is much more helpful.

NigelG · January 27, 2022, 10:28am

Ah, but what about nouns that can be both singular and plural?

@keith is right, of course, about the list vs point to “parent”.

The point here is really not about data, but about the presentation of that data. Big lists are absolutely fine (but there is a limit of 10,000 - that is a long ol’ convo). No, really, absolutely fine.

What is not fine is what happens when you come to a list that long and want to display … “my conversation history on this discussion with Amy last week”. That gets really slow, and arguably it is easier to do it by having pointers back the other way.

But…at scale that is also pretty bad unless you can really narrow down the search at the database. No filters, and absolutely no advanced filters.

The question of “how big a list is too big” generates quite a lot of discussion. Some say 100 (and that is not a bad place); personally, I start to worry at around 500. It may depend on the “weight” of the data in the list.

So…as a way to scale this…

Recent Messages (list of messages) - remove items from list when count > 100
Older Messages (list of messages) - and add them here

It is likely that 99% of the time you will be displaying messages in your app most recent first. So structure your data to what you do 99% of the time. Not someone wanting to go back in time.

Then you can have a “show more” button at the bottom of the list, which may have a slight delay, but then people are probably aware that older stuff takes time.
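The Recent/Older split above can be sketched roughly like this. It is a plain-Python illustration of the idea rather than a Bubble workflow; the threshold of 100 comes from the post, while `add_message`, `show_more`, and the page size of 50 are hypothetical choices for the example.

```python
from dataclasses import dataclass, field
from typing import List

RECENT_LIMIT = 100  # threshold taken from the post above


@dataclass
class Discussion:
    recent_messages: List[str] = field(default_factory=list)  # oldest first
    older_messages: List[str] = field(default_factory=list)   # oldest first


def add_message(discussion, message_id, limit=RECENT_LIMIT):
    """Append to the recent list, spilling the oldest items into the older list."""
    discussion.recent_messages.append(message_id)
    while len(discussion.recent_messages) > limit:
        oldest = discussion.recent_messages.pop(0)
        discussion.older_messages.append(oldest)


def show_more(discussion, already_shown, page_size=50):
    """Return the next page of older messages, most recent first."""
    end = len(discussion.older_messages) - already_shown
    if end <= 0:
        return []  # nothing older left to show
    start = max(0, end - page_size)
    return list(reversed(discussion.older_messages[start:end]))


chat = Discussion()
for i in range(250):
    add_message(chat, f"m{i}")
```

After 250 messages, the recent list holds the newest 100 and the other 150 sit in the older list, which is only touched when someone clicks “show more”.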

marcusgonzales · January 27, 2022, 10:37pm

Yeah, I thought that was funny too when I saw it, but to cut the template devs some slack, they are based in Europe and I figure English isn’t their primary language.

marcusgonzales · January 27, 2022, 10:38pm

Thank you for the naming convention info and tip! I’ll keep that in mind.

marcusgonzales · January 27, 2022, 10:42pm

Thanks for your input! You put into words my suspicion.

marcusgonzales · January 27, 2022, 11:04pm

Thank you for your input on how to make the “list of messages” approach scalable. What do you think about the idea of not even using lists, and just assigning each message a group id and using an “output all messages with group id, sort by most recent” approach? Does that seem less efficient to you?
(Also, if you happen to have a YouTube channel about Bubble, send me the link and I will subscribe.)

ed727 · January 28, 2022, 2:26pm

I suggest thinking about what kind of sorting/filtering you plan, and then making sure you understand how that functionality will work on a large set of messages.

With a list field, my understanding is that if you want to filter it, that is happening client side, meaning Bubble is downloading lots of records (in the background) to the browser and then filtering it there to find enough results to fill your repeating group. In that situation I would worry about discussions where thousands of messages are being downloaded to a user’s phone, which would be a slow and laggy experience.

If you linked the other way (each message is connected to a discussion, a many-to-one relationship), the filtering can happen server side via a regular “do a search for” (very fast) and it will only send to the browser the results.
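A rough way to picture the difference, using SQLite from Python as a stand-in database. This is not how Bubble is implemented internally (the thread does not say), and the table layout, column names, and both helper functions are assumptions made up for the illustration.

```python
import sqlite3

AUTHORS = ["amy", "bob", "cat", "dan"]


def build_demo_db(n_discussions=20, per_discussion=500):
    """Create an in-memory table of messages spread across several discussions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message ("
        "id INTEGER PRIMARY KEY, discussion_id INTEGER NOT NULL, "
        "author TEXT NOT NULL, created_at INTEGER NOT NULL, body TEXT NOT NULL)"
    )
    rows = [
        (d, AUTHORS[t % len(AUTHORS)], t, f"message {t} in discussion {d}")
        for d in range(n_discussions)
        for t in range(per_discussion)
    ]
    conn.executemany(
        "INSERT INTO message (discussion_id, author, created_at, body) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn


def server_side_page(conn, discussion_id, author, page_size=30):
    """The database filters, sorts, and trims; only page_size rows are returned."""
    return conn.execute(
        "SELECT id, created_at, body FROM message "
        "WHERE discussion_id = ? AND author = ? "
        "ORDER BY created_at DESC LIMIT ?",
        (discussion_id, author, page_size),
    ).fetchall()


def client_side_page(conn, discussion_id, author, page_size=30):
    """The whole discussion is fetched first, then filtered by the 'browser'."""
    downloaded = conn.execute(
        "SELECT id, author, created_at, body FROM message WHERE discussion_id = ?",
        (discussion_id,),
    ).fetchall()
    mine = [row for row in downloaded if row[1] == author]
    mine.sort(key=lambda row: row[2], reverse=True)
    page = [(row[0], row[2], row[3]) for row in mine[:page_size]]
    return page, len(downloaded)


conn = build_demo_db()
server_page = server_side_page(conn, 3, "amy")
client_page, rows_transferred = client_side_page(conn, 3, "amy")
```

Both calls produce the same 30 messages, but the client-side version had to pull all 500 messages of the discussion across first, and that cost grows with the length of the conversation.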

(PS: If you haven’t yet read this book for info on how Bubble works behind the scenes, I highly recommend it: The Ultimate Guide to Bubble Performance - the new edition is out (now 210 pages!))

marcusgonzales · January 28, 2022, 7:20pm

Thank you for this info! Very very helpful, and I just purchased that book!

NigelG · January 30, 2022, 10:12am

But you will be searching the WHOLE database each time. Which will just get slower and slower.

Not saying it is a bad idea (it is what I would do) but there are alternatives using controlled lists.

ed727 · January 30, 2022, 9:37pm

Yes, but I wonder where that line is in terms of when we’d see any identifiable slowdown from a purely server-side search. From what I’ve read, pure server-side searches are very fast/scalable, given server power, indexing, and other stuff databases do to handle huge amounts of data. But if it’s a client-side search, different story as we know… it will start bogging down noticeably with small data sets.
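On the indexing point, a generic relational database can show how it plans to run a query. The sketch below again uses SQLite from Python as a stand-in; whether and how Bubble indexes a given field is not something this thread establishes, and the table, index name, and query here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    "id INTEGER PRIMARY KEY, discussion_id INTEGER NOT NULL, "
    "created_at INTEGER NOT NULL, body TEXT NOT NULL)"
)

QUERY = (
    "SELECT id, body FROM message "
    "WHERE discussion_id = ? ORDER BY created_at DESC LIMIT 30"
)


def query_plan(connection):
    """Return SQLite's description of how it would execute QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)).fetchall()
    return " | ".join(row[3] for row in rows)  # row[3] is the detail text


# Without an index, the only option is to scan every message in the table.
plan_without_index = query_plan(conn)

# With an index on (discussion_id, created_at), the database can jump
# straight to one discussion's messages, already ordered by time.
conn.execute(
    "CREATE INDEX idx_message_discussion_created "
    "ON message (discussion_id, created_at)"
)
plan_with_index = query_plan(conn)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan varies by SQLite version, but the first plan is a full table scan and the second names the index. That is the general reason a constrained server-side search does not have to get proportionally slower as the table grows.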

If a list isn’t too long and the user will be searching/sorting it, there’s a good argument to be made to go ahead and let Bubble download all the entries to the browser and then filter client side from there. It will be snappier. But if it’s a very long list, then I think it will be problematic for the reasons discussed.

NigelG · January 31, 2022, 11:07am

Yeah, it is not simple. I do think that Bubble should implement “views” so that you can bring back subsets of data in lists to work on.
