Stuck on this: How to access thousands of DB records and put into JSON?

Hey all, I have an app where my customers have list of contacts (500 to 5,000 records). I want to allow them to send an email blast to them and to do this I need to convert the contacts to JSON and then send to Postmark (could be another service)

Currently I use the Toolbox plugin which loads that records and builds the JSON on page load, which seems to work alright for under 1,000 contacts, but it takes a long time for larger lists and uses a lot of capacity.

Any ideas on the most efficient way to find thousands of DB records, convert them to JSON and then send the JSON(s) in an API call?


Are you doing this on the server side, or in the page? And, in doing this, are you doing it in a single workflow, or using "workflow on a list", or using recursive workflows?

(Because there's no reason you can't do this in a single workflow, which would be the most obvious optimization, if you hadn't done that yet.)

Even though Run JavaScript isn't the most efficient thing in the world (because it uses eval()), the real performance bottleneck here is sourcing the Contacts (the Search that compiles the list of Contacts).

I have more to say about this based on some recent experiments.

And… actually, as I think about this, Run JS could be a very poor-performing way to do what you're trying to do, since you don't really have proper ways to deal with Lists and Things like you do inside of a plugin.

But, if you can answer my previous questions, I can provide more guidance.

I am flexible with all of it. The end goal is just that my User can click 'Send Email to All Contacts' and my app sends JSONs of the contact data to an API. The JSON array is limited to 500 contacts per call, which might make it tricky and would require up to 10 API calls for 5,000 contacts.

I can do it server side or on page, and I can do it any way that makes sense from a tech standpoint.
Doing it on the page might have the slight benefit of telling the user how many Contacts the email will send to before they click Send, but that's not a necessity.

Right now, I run the following workflows based on how many contacts are loaded on the page. It takes a lot of capacity to load 2,000+ contacts on the page, but I think the actual JSON creation isn't too heavy an operation.
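For what it's worth, the 500-per-call batching can be sketched in plain JavaScript (e.g., inside a Toolbox Run JavaScript action). This is just an illustrative sketch; `chunkContacts` is a made-up name, not a Bubble or Postmark API:

```javascript
// Split a contact list into batches of up to 500 (the per-call limit
// mentioned above), so 5,000 contacts become 10 API calls.
function chunkContacts(contacts, batchSize = 500) {
  const batches = [];
  for (let i = 0; i < contacts.length; i += batchSize) {
    batches.push(contacts.slice(i, i + batchSize));
  }
  return batches;
}

// Example: 1,200 contacts -> 3 batches (500, 500, 200)
const demo = Array.from({ length: 1200 }, (_, i) => ({ email: `c${i}@example.com` }));
const batches = chunkContacts(demo);
```

Each batch could then be handed to a separate scheduled API workflow, or sent directly from the page.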

There is a way to convert a massive list of Bubble Things to JSON Array.

Check out the attached video, please:
//s3.amazonaws.com/appforest_uf/f1596821530700x807607990335294600/things_to_json.mp4

Note: Ideally, you need to create a pagination workflow for huge lists.

1,000 records - takes ~3 seconds
5,000 - takes ~20 seconds
10,000 - takes ~45 seconds
100,000 - takes ~1 min 41 seconds

It will take more time if you set more columns.


Interesting, this may help in loading all the contact data. But I would still need to take those JSON values and reformat them to fit the Postmark API JSON structure. I'm sure there's a way, I just need to find it.

Using that way, you can change the output keys' names, but it doesn't support nested objects.
I can publish this plugin as a private one for now. Please DM your email, and I'll provide you access.


Hey @gf_wolfer, how many fields does each JSON object have? I'm assuming it might be like:

{"name": "Joe Smith", "email": "joe@example.com"}

but is there more to it? (I'm asking because I want to create the exact scenario here.)

Off the top of my head, I think Postmark requires an array that can have all the Template fields you set in your Postmark Template, so you can send a bulk email with custom info in it for every Contact (unique name, User links, etc.).

So more like:

{"templatemodelid": "123", "FromName": "Keith's Email", "name": "Joe Smith", "email": "joe@example.com", "messagebody": "Hello Joe, this is the email message"},
{"templatemodelid": "123", "FromName": "Keith's Email", "name": "Frank Doe", "email": "frank@example.com", "messagebody": "Hello Frank, this is the email message"}
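Building that array from a list of contacts is basically a single map in JavaScript. A rough sketch (the field names just mirror the example above, not necessarily Postmark's exact schema, and `buildMessages` is an illustrative name):

```javascript
// Map each contact into one message object of the shape shown above.
// Check Postmark's template docs for the real required field names.
function buildMessages(contacts, templateId, fromName) {
  return contacts.map((c) => ({
    templatemodelid: templateId,
    FromName: fromName,
    name: c.name,
    email: c.email,
    // Greets each contact by first name, per the example
    messagebody: `Hello ${c.name.split(" ")[0]}, this is the email message`,
  }));
}

const msgs = buildMessages(
  [{ name: "Joe Smith", email: "joe@example.com" }],
  "123",
  "Keith's Email"
);
```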

Ah, gotcha. Would you mind sharing the script you run in your Run JS with me?

(If you don't want to post it here, you can DM it to me.)

BTW, I find on a Personal plan that I can load and process 5,000 contact-like objects in about 8-9 seconds (in a test app that isn't, of course, doing lots of other things!).

How long is "a long time"?

It's actually not the time to load so much as the load on the app's capacity. This is live in production; overall the app runs well, but you can see where I was ONE user going to this page to send an email (infrequent usage, but still). If multiple users use that feature at once, it times out.

I am using the Toolbox plugin to convert a Data Type List to a List of Texts. The magic is in this:
(screenshot: Screenshot_080720_032031_PM)

But now that I just looked it up, I am seeing errors from the inspector. So something recently broke it :scream:. It was working for 5 or so months before this. Edit: it seems my test account for this somehow wasn't working; it's still working for my Users.

The xMESSAGEhere is some Find and Replace work that I then run when the workflow happens (there was a reason for it that I can't remember right now).
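To illustrate the idea (not the exact Toolbox expression): each contact is rendered as a JSON fragment containing a placeholder token, the fragments are joined with commas into an array, and the placeholder gets swapped for the real message at send time. A hypothetical sketch with `xMESSAGEhere` as the token:

```javascript
// Join per-contact JSON fragments into one array string, then replace
// the placeholder token with the actual message body at send time.
function assemblePayload(fragments, message) {
  const joined = "[" + fragments.join(",") + "]";
  // split/join replaces every occurrence, not just the first
  return joined.split("xMESSAGEhere").join(message);
}

const fragments = [
  '{"email":"joe@example.com","messagebody":"xMESSAGEhere"}',
];
const payload = assemblePayload(fragments, "Hello!");
```

Note that a naive text replace like this breaks if the message itself contains characters that need JSON escaping, which may be why the Find and Replace step exists.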

List Item Expression?

Anyway, since you're doing this stuff in the page, note that the only thing using capacity here is any Searches that compile the information needed to ultimately feed your scheduled API workflows that cause your emails to be sent.

And there's really no way around having to get the info on (what I assume) are your users' Contacts.

(And I'm sure you know this, but just to be very clear: nothing that you do in the page (like running JavaScript, doing find and replace on text strings that are already in your page, using in-page plugins, etc.) consumes any capacity at all. Capacity is only server-side CPU usage.

Search operations, of course, are not "in the page": they require interaction with the server and consume capacity. And of course your eventual API workflows that cause the emails to be sent are server-side, but these are probably not very CPU-hungry, as they're just pinging Postmark.)

Now, it's true that Searches that return a large number of items can noticeably consume capacity, as your log screenshot shows.

Here's an example from my own test app where I'm loading large numbers of Things (in this case, an object called a Fave Thing, which has a Name text field with about 100 characters and a Price number field on it). You can see a big difference between a Search where we grab the first 5,000 items versus a Search where we grab the first 30,000 items (and timewise, loading is similarly scaled: 5,000 items load in around 8-12 seconds typically, but the 30,000 items can take anywhere from about 80 to 120 seconds):

So the only thing you can really do to improve the performance of your page is to improve the source Search or Searches and (to a lesser extent) control the amount of data that gets fetched to the page.

A couple of things to think about there (I'm not sure whether any of these will help, but I suspect they might):

  1. Is the info you need to send your emails sourced from just a single object (e.g., a Contact)? Or do you have to do multiple large searches for different objects? (I'm guessing from your graph that there's really just one type of thing being fetched. But if you're having to Search across multiple Things to assemble this information, it would probably be better to consolidate those data fields onto a single Thing. Again, I don't think this is an issue for you, but FYI.)

  2. Does that object have a bunch of stuff on it that you don't need for sending these emails? We obviously need the recipient name and their email address. But does the Contact object have a bunch of other stuff, like a geographic address, an image, a notes field that might have several K of text on it, list-type fields with a bunch of other stuff, etc.?

Note that this shouldn't really be a problem, but it could be. (In most cases, if we don't reference a Thing's field, we don't fetch the value in that field and we don't cause a database interaction.) BUT if the Contact object is complex and has many fields, you may get better performance by, instead of Searching for and returning the Thing, Searching for just the Thing's fields that you need. Instead of (for example):

Search for Contacts

Try Search for Contact's Name and Search for Contact's Email, and then you've got parallel lists of texts (what this does is destructure the Thing on the server). This would usually be something I wouldn't advise, but my testing shows that, done in quick succession, this takes pretty much the same time as Searching for the Thing itself and then destructuring in the page.
But I've only tested this on very large numbers of Things where the Thing is simple (as in my example above, where I've got a Fave Thing with only two fields on it). If the Thing is more complex and has a lot more fields, the "destructure on the server" approach may be faster. It's not clear to me whether this uses more capacity or not.
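If you do go the parallel-lists route, re-assembling contact-like objects in the page is just a zip over the two lists. A minimal sketch (illustrative names, and it assumes the two Searches return items in the same order):

```javascript
// Zip two parallel lists of texts (e.g., from two field-level Searches)
// back into one list of objects in the page.
function zipToObjects(names, emails) {
  return names.map((name, i) => ({ name, email: emails[i] }));
}

const zipped = zipToObjects(
  ["Joe Smith", "Frank Doe"],
  ["joe@example.com", "frank@example.com"]
);
```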

It's worth experimenting with. Here's something I tried. I've got these Fave Thing objects, and I tried the destructure-on-server approach (where we simultaneously Search for Fave Thing's Name and Search for Fave Thing's Price) three times, and the Search-directly-for-Things approach three times, over 20,000 items. Here are the results:

destructured 1: 39 s to return Names + 9 add'l ms to return Prices
destructured 2: 40 s + 9 add'l ms
destructured 3: 50 s + 15 add'l ms

search for Things 1: 30 s to return Fave Things + 8 ms to destructure into Names + 7 ms to destructure into Prices
search for Things 2: 31 s + 7 ms + 6 ms
search for Things 3: 33 s + 15 ms + 6 ms

Directly searching for the Things (and then destructuring in the page) does look like it was generally faster. However, it's possible that the capacity impact was greater. Here's what I see in my capacity chart:

Kind of interesting, right? (Because of the granularity here, it's hard to tell if the simultaneous searches for field values are using less instantaneous capacity but spread out over a longer timeframe, or if indeed they impact capacity slightly less.)

  3. One thing for sure: since loading (doing the Search for) Contacts is so impactful to capacity, you should not do this automatically on Page Load. What if your user visits the page accidentally? What if they are just futzing around and have no intention of actually sending the email now?

You don't need to fetch the Contacts until the user commits to sending the email, right? So only load it then. Don't waste capacity that you don't absolutely have to use.

Another suggestion, to make this experimentation easier: you might want to install List Shifter and use it as your combination data loader and storage. Yes, you can use a headless repeating group or a custom state, but List Shifter is just a lot more convenient (and it also throws an Initialized/Updated event when it's finished loading, which makes benchmarking more reliable). Additionally, even if you don't want to use List Shifter, this plugin includes my Debug Buddy action, which allows you to benchmark the time between different events (this is how I'm able to tell you above how long such-and-such an operation takes).

  4. I'm actually working on a server-side version of List Shifter's Process List action, which would give another route for doing stuff like this that might be lower-impact capacity-wise. (Even on Hobby accounts, it seems to be able to import a list of about 13,500, and you get a few thousand more on a Personal plan.) For 5,000 Things, it takes about 20 seconds to receive the input list and then about 6.8 seconds to do a basic process on the list (like transform that list into something JSON-like). For 10,000 Things, time to execution is about 50 seconds, and about 12 seconds to run a simple process on those 10K items.

(Which just shows us again how stingy Bubble is with compute, right? The same processing part takes just milliseconds on a decent PC in the browser.) But peak capacity impact is quite a bit lower:


:point_up: the part on the right is me running a 5K followed by 10K Server Side Process List action.

Anyway, the point of that plugin is that you could do iterative stuff in backend workflows and, even though they are slower, for things like "Hey, take this list of Contacts, process it, and then pass it off to my email sending service," it's fine.

At the end of the day, @gf_wolfer, I guess my point is that even if your on-page stuff is written sub-optimally, that doesn't impact capacity, and even a slow, poorly-written script will execute orders of magnitude faster than the time it takes to actually return the results of a search for 1,000, 5,000, or 10,000 items.

So I think the ultimate solution is "buy more capacity"… But as I have been fond of asking, "Exactly how much capacity do I have to buy before my Bubble compute feels like it's running on a Raspberry Pi versus a Casio digital watch circa 1986?" :woman_shrugging:


As usual, your level of detail and curiosity about these types of questions is amazing. Thanks for the deep dive into this.

To answer a few questions:
Yes, this is the List Item Expression.

Yes, the main annoyance is having to load 500-5,000 Contact data records before performing this operation, so thanks for the detail on different ways of evaluating the load time. I think being able to do it server side is eventually what I want, so that the user experience is seamless and the app only loads the Contact data when an Email Blast is actually 'sent'.


This topic was automatically closed after 70 days. New replies are no longer allowed.