Creating a list longer than 10,000 records

Hi there,

I am creating a social platform and am wondering what the best way is to save likes to the database.

One way would be to store likes as a “list of Users” on each “post”. However, the problem arises when the list hits 10,000 records, as no list in Bubble can be longer than that.

Another option would be to just create a Thing called “Likes” with the fields “Liked by - User” & “Liked Post” and then perform searches all the time. However, as the database grows, I believe these searches could become very heavy if the server needs to go through all “Likes” just to find the ones that apply to this specific post.

Can anyone point me in the right direction?

Thanks in advance,
Manuel

Hi there, @maze… I’m sure someone will correct me if I am wrong, but I believe the prevailing wisdom these days says that lists of over 100 items are not ideal from a performance perspective. Also, the same wisdom says you shouldn’t shy away from performing searches through lots of things to get what you need.

The above being said, I think it’s more about what you are trying to accomplish that will define the “right” direction for you. I mean, if you went with the option of creating a new data type called Likes, and you are going to search through that data type to get a count of the likes for a particular post, then you might decide to create a like count field on the post and increment the count each time a post is liked. So, again, what you are trying to accomplish is almost as important as how you accomplish it because the former will often dictate the latter.
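To make the "like count field" idea concrete, here is a minimal Python sketch (not Bubble code). The names `Post`, `Like`, `LikeStore` and `like_count` are made up for illustration; the point is only that the counter is incremented whenever a new like is recorded, so showing the count never requires a search.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    like_count: int = 0  # denormalized counter, bumped on every new like


@dataclass(frozen=True)
class Like:
    user_id: str
    post_id: str


class LikeStore:
    """Hypothetical in-memory stand-in for the database."""

    def __init__(self):
        self.posts = {}
        self.likes = set()

    def add_post(self, post_id):
        self.posts[post_id] = Post(post_id)

    def like(self, user_id, post_id):
        like = Like(user_id, post_id)
        if like in self.likes:
            return False  # this user already liked this post
        self.likes.add(like)
        self.posts[post_id].like_count += 1
        return True


store = LikeStore()
store.add_post("post-1")
store.like("alice", "post-1")
store.like("bob", "post-1")
store.like("alice", "post-1")  # duplicate, ignored
print(store.posts["post-1"].like_count)  # 2
```

The trade-off is that the counter and the like records have to be kept in step (for example, decrementing on unlike).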

Anyway, just some food for thought, and I hope it helps.

Best…
Mike


Hi @mikeloc .

Thanks for your answer.

Indeed, I already have a “Like Count” field. My question is more about displaying all the users who have liked the post in a repeating group (RG).

Well, admittedly, I am not a social media user, and I can’t understand for the life of me how it would be valuable for anyone to sift through a repeating group of thousands of users who have liked a post. So, not overlapping even the tiniest bit with your target audience, I will shut up now. :slight_smile:


As far as I understand it (and I might be wrong about this), there’s nothing inherently wrong with having large lists, even thousands of items (up to the hard limit of 10,000)…

The issue is more to do with loading those lists unnecessarily, causing slow performance.

For example, if you have a List field directly on the User datatype for ‘likes’, and a user has 9000 likes on that list, then any time you’re loading that User data (for example on a profile page) you’re loading that list of 9000 items (although, as far as I understand, that list is just a list of strings of the unique IDs of the ‘likes’ - not any of the actual Like data, unless you’re referring to the data of the likes).

That means you’re loading a lot of data unnecessarily, slowing down the load speed of the Profile page for no reason.

And if you’re loading a repeating group of 100 Users, and each User has 9000 likes on their like list, that’s 900,000 likes being loaded on the page, for no reason at all, which won’t be good for performance.
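The arithmetic behind that figure, written out in Python (the variable names are just for illustration):

```python
users_in_repeating_group = 100
likes_per_user = 9_000

# Every User in the repeating group brings its whole like list along.
ids_loaded = users_in_repeating_group * likes_per_user
print(ids_loaded)  # 900000
```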

So having large lists on datatypes that you’ll be loading a lot for other reasons is a bad idea, and as @mikeloc points out, for most practical purposes, lists of more than 100 (or even a few dozen) are not recommended if they exist on datatypes you’ll be loading a lot.

However, I can’t see any issue with having large lists on datatypes that you won’t be loading unless you need them…

So, an alternative method to consider (one which I use quite a lot - although I’ve never really tested it for performance with very large lists), would be to create a new datatype, just to contain the list.

For example, in your case you could create a datatype called User_Like_List, which has a field for User and a list field for likes.

As you’ll never be loading that datatype unless you’re specifically looking for a User’s likes, there’s no reason why having 10,000 items on that list should ever be an issue, and in theory it should be easier/quicker to access them, than searching the database for likes connected to the User.

If you need more than 10,000, just create another User_Like_List for the User.

You could add the User_Like_Lists to a list field on the User datatype. That way, even if a User has 100,000 likes, the only thing you’ll have on the User datatype is a list of 10 unique IDs for the User_Like_Lists (which won’t be a problem at all).

If you need to access the likes for a User, you can just refer to the User’s Like_Lists: each item’s likes.

Or, you might not store the lists on the User datatype at all. Just search for the User_Like_Lists for the User (which would return a list of only 10 items), then access each item’s likes from there.
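Here is a rough Python sketch of that chunked-list structure, just to show the shape of the data (it is not Bubble code). `UserLikeList`, `add_like`, `all_likes` and `MAX_LIST_LENGTH` are hypothetical names, and the 10,000 cap is the limit discussed in this thread.

```python
from dataclasses import dataclass, field

MAX_LIST_LENGTH = 10_000  # per-list limit discussed in this thread


@dataclass
class UserLikeList:
    user_id: str
    likes: list = field(default_factory=list)  # up to MAX_LIST_LENGTH like IDs


@dataclass
class User:
    user_id: str
    like_lists: list = field(default_factory=list)  # a handful of UserLikeList items


def add_like(user, like_id):
    """Append to the newest list, starting a new one when it is full."""
    if not user.like_lists or len(user.like_lists[-1].likes) >= MAX_LIST_LENGTH:
        user.like_lists.append(UserLikeList(user.user_id))
    user.like_lists[-1].likes.append(like_id)


def all_likes(user):
    """Walk every list in order: the 'each item's likes' lookup."""
    for like_list in user.like_lists:
        yield from like_list.likes


user = User("alice")
for n in range(25_000):
    add_like(user, f"like-{n}")

print(len(user.like_lists))             # 3
print(sum(1 for _ in all_likes(user)))  # 25000
```

Loading the User only touches the short `like_lists` field; the long lists are read only when `all_likes` is actually called.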

As I said, I’ve never actually tested this approach with very large lists (only a few hundred to a thousand), so I can’t say for certain how it performs at scale compared to just having individual likes and searching for 100,000 of them from the database.

But, at least on the smaller scale it works well to keep each datatype light, avoiding loading unnecessary data, whilst making it easy to access any of the ‘nested’ data I need in a simple and efficient way.


Thanks for the extensive answer! :raised_hands:

As you pointed out it is very important to keep long lists separate from the main Datatype (if they appear in searches and RGs).

I think I am going to create a new thing called “Like” and attach a field “post”. Then I am going to run a search for “like” - where “post = current post” which should return all likes of that post.
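A small Python sketch of what that search amounts to, for illustration only. How Bubble runs searches internally isn't stated in this thread, so the lookup table below is an assumption about how a database could avoid scanning every “Like”; the function names are made up.

```python
from collections import defaultdict

# Each Like has just the fields needed: who liked, and which post.
likes = [
    {"user": "alice", "post": "post-1"},
    {"user": "bob", "post": "post-2"},
    {"user": "carol", "post": "post-1"},
]


def likes_for_post_scan(all_likes, post_id):
    """Naive search: checks every Like, so cost grows with the whole table."""
    return [like for like in all_likes if like["post"] == post_id]


# Assumed alternative: a lookup keyed by post, built once and then reused.
likes_by_post = defaultdict(list)
for like in likes:
    likes_by_post[like["post"]].append(like)


def likes_for_post_indexed(post_id):
    """Lookup by key: cost depends on this post's likes, not on all likes."""
    return likes_by_post.get(post_id, [])


scan_result = likes_for_post_scan(likes, "post-1")
indexed_result = likes_for_post_indexed("post-1")
print([like["user"] for like in indexed_result])  # ['alice', 'carol']
```

Both functions return the same likes; the difference is only in how much data has to be examined to find them.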

That way I do not need to keep a list updated and also do not have to deal with the 10,000 limit. I'm looking forward to seeing how fast Bubble can find things in the middle of a lot of “likes”.

PS: I also tried to keep the data type to a minimum (just added 1 field) to make the search faster…

Again thanks for your help @mikeloc & @adamhholmes :slight_smile:


For anyone reading this today, another solution to your problem (wanting to know who liked a post, and also have a count higher than 10,000 at the same time) is a text used as a list. Inside a text field called likes, you can store the user IDs of all the likers, delimited by commas:

user1, user2, user3, …

Then you just use really simple regex operations to know how many people liked it. Every time someone likes the post, you append their user ID (unique ID) to the text. This is not bad for performance, since regex is really lightweight.

*Note: I don’t know how long a text can be in Bubble, so take this with a grain of salt :slight_smile:
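A minimal Python sketch of the text-as-a-list idea, in case it helps (in Bubble you would use its own text and regex operators instead). `add_liker` and `count_likers` are made-up helper names, and the pattern assumes user IDs contain no commas or whitespace.

```python
import re


def add_liker(likes_text, user_id):
    """Append user_id to the comma-delimited text unless it is already there."""
    likers = [u.strip() for u in likes_text.split(",") if u.strip()]
    if user_id in likers:
        return likes_text
    likers.append(user_id)
    return ", ".join(likers)


def count_likers(likes_text):
    """Count IDs with a simple regex: runs of non-comma, non-space characters."""
    return len(re.findall(r"[^,\s]+", likes_text))


likes = ""
for uid in ["user1", "user2", "user3", "user2"]:
    likes = add_liker(likes, uid)

print(likes)                # user1, user2, user3
print(count_likers(likes))  # 3
```

Note that the duplicate check here re-reads the whole text on every like, which is fine for a sketch but gets slower as the text grows.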


Fun Fact:

I was doing some performance testing on a plugin recently, and I found that list fields on a Thing can now hold up to 100,000 items. At least, this is true of numeric and text lists. I suspect it’s also true for other data types (including Things), but it’s difficult and slow to create such a large list of Things.


What’s your opinion on the regex solution in terms of performance?

@jonah.deleseleuc using regex or just :split doesn’t take long at all in the page. Getting the data to the page (for long texts) will generally take much longer than processing it once it arrives. (Splitting even a very long string would be only a couple ms.)
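If you want to get a feel for the processing cost yourself, here is a small timing sketch. It is Python rather than browser JavaScript, so treat it as a stand-in only: the actual numbers on a Bubble page will differ, and the 100,000 made-up IDs are just a convenient test size.

```python
import time

# Build a long comma-delimited text of hypothetical user IDs.
likes_text = ",".join(f"user{n}" for n in range(100_000))

start = time.perf_counter()
likers = likes_text.split(",")
elapsed_ms = (time.perf_counter() - start) * 1000

print(len(likers))  # 100000
print(f"split took {elapsed_ms:.2f} ms")
```

This only measures the split itself; it says nothing about how long the text takes to reach the page, which is the part noted above as the slower one.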

But as you note, surely there’s some limit to the length of text fields (I guess that’s fairly easily testable to find the limit, but I don’t personally know what it is).
