Advice/guidance needed for report building

Hi Bubblers!

Coming to you all with a request for advice on building out reports for my app.

TL;DR - complicated data structure that needs to be summed up, and I’m running into “app too busy” errors.

Preface: I’m building a management tool for race teams. Starting out with a parts inventory.
The relevant Things are: Part, Part Movement, Part Category, Part Prices, and Applications.
Here’s a simplified diagram of the Things and their links:

Any constructive criticism is most welcome about this setup!

Here’s the logic for the setup (there’s also a rough code sketch of the same structure right after this list):

  • Each part can be in multiple categories
  • Each part can be applicable to one or more manufacturers, models, or car types
    (for example, a windshield wiper might be applicable across the board, whereas a steering wheel will be specific to a single manufacturer, model and car type)
  • Each part can have multiple prices across time (a price for a part might go up or down), or depending on its condition (for example, a used engine with 10k miles will have a different price than a brand new engine or one with 5k miles, but they remain the same part)
  • To track inventory as well as services done on a car, any new entry is done as a Part Movement.
    • Ordered a new part and added it to inventory? Add a new part movement with +1 for Qty. Choose a price linked to that part and the values copy to the part movement (to keep historical records)
    • Used a part in a car service? Add a new part movement with -1 for Qty. Choose a price linked to that part and the values copy to the part movement (to keep historical records)
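
To make the structure concrete, here’s a rough sketch of the same Things as plain types. This is just an illustration: the field names are made up, and the real app uses Bubble Things with links between them, but the shape is the same.

```typescript
// Rough sketch of the data model. Field names are illustrative only.

type Condition = "New" | "Used";

interface Part {
  uniqueId: string;
  name: string;
  categories: string[];              // a part can be in multiple categories
  manufacturer: string;              // the part's own manufacturer (many-to-one)
  applicableManufacturers: string[]; // car manufacturers it can be used with (many-to-many)
  applicableModels: string[];
  applicableCarTypes: string[];
}

interface PartPrice {
  uniqueId: string;
  partId: string;
  condition: Condition; // e.g. brand new vs. used with 10k miles
  cost: number;
  retail: number;
  effectiveDate: Date;  // a price for a part might go up or down over time
}

interface PartMovement {
  uniqueId: string;
  partId: string;
  qty: number;          // +1 when added to inventory, -1 when used in a service
  condition: Condition;
  cost: number;         // copied from the chosen PartPrice (historical record)
  retail: number;       // copied from the chosen PartPrice (historical record)
  createdAt: Date;
}
```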

This is where we get to the reporting!
First, here’s a sample of a report:

We need to see a summary of the total cost and retail value of all parts.
Let’s keep in mind that we’ll need to filter the report as well (i.e. Generating a report that contains parts for only a specific vehicle).

For each part, we need to search for its Part Movements.

Drilling down further, for each Part Movement we need to calculate the following for new, used and total:

Step 1

  • Qty * Cost (Total Cost)
  • Qty * Retail (Total Retail)

Now that we have the Cost and Retail for each Part Movement, we move back up to the Part level.

Step 2
For each Part, we need to sum the following across all of its Part Movements (there’s a code sketch of this roll-up right after the list):

  • Total Qty Used
  • Total Qty New
  • Total Qty
  • Avg Cost
  • Avg Retail
  • Total Used Cost (sum of filtered Step 1)
  • Total Used Retail (sum of filtered Step 1)
  • Total New Cost (sum of filtered Step 1)
  • Total New Retail (sum of filtered Step 1)
  • Total Cost (sum of Step 1)
  • Total Retail (sum of Step 1)
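
Just to make the math unambiguous, here’s the Step 1 / Step 2 roll-up for a single Part expressed as a plain function. It’s only a sketch (in the app this is all Bubble searches and sums), and it assumes “Avg Cost” / “Avg Retail” mean the simple average unit cost/retail across the part’s movements:

```typescript
// Per-part roll-up: Step 1 (per-movement totals) then Step 2 (per-part sums).

interface Movement {
  qty: number;
  condition: "New" | "Used";
  cost: number;   // unit cost copied onto the movement
  retail: number; // unit retail copied onto the movement
}

interface LineItemTotals {
  totalQtyNew: number;
  totalQtyUsed: number;
  totalQty: number;
  avgCost: number;
  avgRetail: number;
  totalNewCost: number;
  totalNewRetail: number;
  totalUsedCost: number;
  totalUsedRetail: number;
  totalCost: number;
  totalRetail: number;
}

function rollUpPart(movements: Movement[]): LineItemTotals {
  // Step 1: Qty * Cost and Qty * Retail for each movement
  const withTotals = movements.map(m => ({
    ...m,
    totalCost: m.qty * m.cost,
    totalRetail: m.qty * m.retail,
  }));

  const sum = (xs: number[]) => xs.reduce((a, b) => a + b, 0);
  const newOnes = withTotals.filter(m => m.condition === "New");
  const usedOnes = withTotals.filter(m => m.condition === "Used");
  const count = withTotals.length;

  // Step 2: sums (and averages) across all of the part's movements
  return {
    totalQtyNew: sum(newOnes.map(m => m.qty)),
    totalQtyUsed: sum(usedOnes.map(m => m.qty)),
    totalQty: sum(withTotals.map(m => m.qty)),
    avgCost: count ? sum(withTotals.map(m => m.cost)) / count : 0,
    avgRetail: count ? sum(withTotals.map(m => m.retail)) / count : 0,
    totalNewCost: sum(newOnes.map(m => m.totalCost)),
    totalNewRetail: sum(newOnes.map(m => m.totalRetail)),
    totalUsedCost: sum(usedOnes.map(m => m.totalCost)),
    totalUsedRetail: sum(usedOnes.map(m => m.totalRetail)),
    totalCost: sum(withTotals.map(m => m.totalCost)),
    totalRetail: sum(withTotals.map(m => m.totalRetail)),
  };
}
```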

My main concern at this point is that I haven’t found a way to catch if the workflows below fail and deal with it accordingly.

My gut also tells me there’s a much better way to do this, but I can’t for the life of me figure it out.

I did test generating these values and some of the totals as each Thing is created or modified using backend triggers.
These triggers seem to be only somewhat reliable: it’s happened that triggers just didn’t run, and the report ended up being off since the “total” fields were blank. It seems safer to generate the values on request.

If any of you have any advice, it would be much appreciated, and thank you for reading this whole novel! Hopefully this proves useful for others as well.

Here’s what I’ve tried so far, after hours of testing and failing.

  • Test: Repeating group of parts that sums and filters as required directly on the front end.
    Text elements with many “search for” expressions and advanced filters, to account for filtering by items in a list that can’t be done with a regular search filter (i.e. the user selects a filter by Manuf 1 and Manuf 2, and we need an advanced filter to intersect those values with the values in the Part’s Applicable Manufacturers).
    Result: Way too slow, page starts timing out and getting Chrome errors about unresponsiveness when a report contains anything over 100 parts.
    While this always ended up displaying the data, the fields became unmanageable very quickly, and it’s terrible UX to have Chrome say the page is unresponsive.

  • Test: Create a “Report” and “Report Line items” Thing. Two backend workflows that are passed the relevant filters as parameters.

    The workflow does a search for all the Part Movements (filtered by all, new parts only, or used parts only), then takes “each Part Movement’s Part”, keeps the unique elements, and filters them by the parameters provided. Again, we need to use some advanced filters here to account for intersecting lists, which obviously slows things down.

    The result of this search is the list of Parts we need, which is then “set list” on the Report Thing.
    Now we trigger a second, self-iterating, workflow to run Step 1 and Step 2 above for each Part.
    To this workflow, we pass the Report’s “Parts” first item, and a Search for all of this Part’s Part Movements.

    This second workflow creates a new Line Item for the report, and sets all the relevant fields using the provided parameters.

    (this screenshot is actually from the test below, so in this test I was setting all the fields in one step, but the screenshot only shows a couple of fields being set)
    In a second step, the workflow removes this part from the Report’s Parts field (to prepare the next iteration for “Report’s Parts:first item”), and adds this line item to the Report’s Line Items field.
    Finally, once the Report’s Part field is empty and all our line items are calculated, we have one more step which sums up all the line items to the Grand Totals on the Report.
    Result: This has been working, as long as only one report is launched at a time. As soon as I try to generate two reports simultaneously, I get an “app too busy” error. This is also fairly time consuming (~15 minutes for 500 parts) but at 2 seconds per Part that’s not unreasonable. I could conceivably create a queue for report generating, to ensure that no two reports are run at the same time across users.

  • Test: Same as the second test, but instead of having a single backend workflow set all the line item totals at once, I split it up into multiple workflows as per the screenshot above. This slows the generation down a bit more, but seems to lead to fewer “app too busy” errors.
    Result: I was able to run two reports simultaneously sometimes, but I still run into “app too busy” errors on occasion when I do.

Honestly, I didn’t read past the first couple of lines of your post, but I presume you may benefit from something I came across in my Stripe dashboard today.

When looking at ‘reports’ in Stripe, they have a nice little loader with a message saying that building larger reports may take a few minutes.

I’m just going out on a limb without reading all the details and saying you may want to try putting up a loader screen until the report is ready.

Thanks! Stripe was definitely an inspiration for the Things setup, and I do have a loader, but the issue is not the time it takes for the report to generate. It’s that the backend workflows either fail, or the page displays a browser error saying the page is unresponsive if I try to generate everything on the front end. It eventually does load if I click “wait for page”, but that’s very poor UX, and I can’t really tell users to just ignore browser errors like that, whether a loader screen is present or not.

I would do it in a backend workflow. If the data allows for it, I’d set it up as a recursive backend workflow.

I usually have a data type I call ‘processor’, with a ‘processed’ yes/no field. When I send data to the backend to be processed, and the user needs to see a page element change once it’s done (i.e. it’s not a background process they can just leave the page on, but a data process that the user journey depends on completing), I first create a new ‘processor’. The backend workflow has a ‘processor’ parameter, so when scheduling the backend workflow I send the ‘processor’ along. On the page, I have a custom workflow triggered by a ‘when data changes’ event, and the thing that event watches is the ‘processor’.

In the backend workflow, when it’s recursive, once the recursion is done I make changes to the ‘processor’ and flip ‘processed’ from no to yes… then on the page the custom workflow triggers and does what I want it to, which is usually changing a view or navigating to an external site.
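
In rough code terms, the shape of it is something like this; all the names are made up, since in Bubble these would be Things, fields and workflows rather than functions:

```typescript
// Sketch of the "processor" pattern: the page watches a 'processed' flag
// that the recursive backend workflow flips when it finishes.

interface Processor {
  id: string;
  processed: boolean; // the yes/no field the page watches
}

// Backend side: recursive worker that flips the flag after the last item.
async function processItems(processor: Processor, remaining: string[]): Promise<void> {
  if (remaining.length === 0) {
    processor.processed = true; // "make changes to processor: no -> yes"
    return;
  }
  const [current, ...rest] = remaining;
  await handleOneItem(current);        // whatever the per-item work is
  await processItems(processor, rest); // reschedule itself with the rest
}

// Page side: stand-in for the "when data changes" custom workflow trigger.
function watchProcessor(processor: Processor, onDone: () => void): void {
  const timer = setInterval(() => {
    if (processor.processed) {
      clearInterval(timer);
      onDone(); // e.g. change a view or navigate to an external site
    }
  }, 1000);
}

async function handleOneItem(id: string): Promise<void> {
  console.log(`processing ${id}`); // placeholder for the real per-item work
}
```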

Maybe the concept could be applied to your issue.


Interesting, I’ve used a similar concept of flipping a “done” field that sounds like what you’re doing, but I didn’t think of using it to check for failed workflows. I am using backend workflows currently, and what’s more, they’re split into 8 different ones to try and break up the process and avoid timeouts.

Bubble doesn’t seem to provide an option to capture timeouts or “app too busy” errors, but I guess I could have a workflow check whether a particular report isn’t set to completed within 60 minutes and then retry it, or set that field to an error state instead so the user can deal with it after the fact.
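
Something like this hypothetical check, just to illustrate the idea (the field names below aren’t real fields in my app):

```typescript
// Hypothetical watchdog: run on a schedule and flag any report that has
// been "running" for more than an hour, so the user can retry it.

interface ReportStatus {
  id: string;
  startedAt: Date;
  status: "running" | "completed" | "error";
}

const SIXTY_MINUTES = 60 * 60 * 1000;

function flagStaleReports(reports: ReportStatus[], now: Date = new Date()): ReportStatus[] {
  return reports.map(r => {
    const stale =
      r.status === "running" &&
      now.getTime() - r.startedAt.getTime() > SIXTY_MINUTES;
    // mark as error so the user can deal with it, rather than retrying blindly
    return stale ? { ...r, status: "error" as const } : r;
  });
}
```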


I came across a potential bug while testing all this out and calling a PDF API through backend workflow - made a new post in the relevant topic here.

Any progress on your challenge?

From my own tests, I noticed that Bubble is very, very slow when you have to deal with lots of data and do computations and filtering on that data.

Using an external DB means that 1-2 seconds are added to process the call in my case. Plus you can only do around 200 items at once.

So any experience you gather while solving this is most welcome!

Yea! What surprised me a bit was that I’m not even dealing with that many records. 400-500? It’s a good bit, but a far cry from the thousands or even millions of “things” some Bubble users seem to deal with.

The biggest thing slowing it down seems to be the advanced filters, which makes sense (they have to download all the records, 2-3k in this case, and then filter them down). With my data structure (a lot of many-to-many relationships) I don’t think I had much choice.

I did run into a bug here that I struggled with for a while, which was related to deleting parameters from the back-end workflow (BEWF) without clearing them first (see my previous post).
With that fixed though (by recreating all the relevant workflows from scratch), I managed to get any single workflow far below the limit where it would time out, and it’s now especially snappy on the front end.

Most importantly though, if I have to pass data to workflows, I only pass unique IDs and then search for that unique ID as granularly as possible. This is what has made it the fastest on the front end. If I tried to pass the entire “Part” rather than just a unique ID, the delay went from a fraction of a second to 15-20 seconds. Dangerously close to a timeout. I also noticed that a repeating group (set to 1px by 1px) was seemingly the fastest way to pull up all the records I need and reuse them.

That being said, the recursive nature of the workflows means the report generation is far from instantaneous, but we can deal with that by showing the user a % completion, or sending by email, etc.
To give you an idea: for a full report with 504 parts and 547 part movements, the total time to generate the report (504 workflow runs) was roughly 7 minutes:


Not fast, but within reason I think given the data structure.
I’m sure that if I could have a “flatter” data structure, this could be improved significantly!

Ready for a long read? Here are my workflows:

Detailed Workflows

First, the user sets up his report with a report builder:


Note that most of these filters are a list of things, so we’ll be forced to use some advanced filters since we have many-to-many relationships (a part may be applicable to multiple models, and a model can be applicable to multiple parts).

The report count above the button gives a quick indication of how many parts the report will have, as a “double check” that our filters are what we’re looking for. Doing a search, even with advanced filters, but adding :count at the end, seems to be super quick.

Front-end Workflow
Once we click Generate Report, it launches the following workflow:

In the first step, we use the same search expression we used for the :count (but without the :count, of course). Here it is below:


In my case, since the Part Movements track whether the quantity added to inventory was New or Used, we first search for Part Movements, with constraints for Qty > 0 and Condition = the value from the “Type” dropdown the user selected in the “builder”. Since a Part Movement can only have one Part, we take that first search’s results, list all the Parts (each item’s Part), and keep only the unique ones. We then filter these Parts further to apply the other filters selected by the user. The new/used/all filter works on Part Movements, while the remaining filters work on the Parts themselves.

Note about the filters

(This example might be a bit confusing because a part can have a Manufacturer - the manufacturer of the part - but also an Applicable Manufacturer, i.e. the manufacturer of the car that the part can be used with. The Part Manufacturer is a many-to-one link - each part can only have one Manufacturer - while the Applicable Manufacturer is many-to-many - a part can have multiple applications, and each application can be applied to multiple parts.)

Again, this is the same expression as the count that’s displayed to the user, without the :count at the end. Instead of the count though, we keep only the unique IDs of these filtered parts. This is super important, because displaying the Part itself slows it down significantly - probably because it’s a lot more data for Bubble to load than just the unique IDs as text.
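
Written out as plain code, the shape of that expression is roughly the following. It’s only a sketch of the logic (in Bubble it’s a single search with an advanced :filtered at the end), and here an empty filter list simply means “no constraint”:

```typescript
// Rough equivalent of the search + advanced filters, returning unique IDs.

type Condition = "New" | "Used";

interface PartMovement {
  partId: string;
  qty: number;
  condition: Condition;
}

interface Part {
  uniqueId: string;
  applicableManufacturers: string[];
  applicableModels: string[];
}

interface ReportFilters {
  type?: Condition;        // the "Type" dropdown (new / used, or empty for all)
  manufacturers: string[]; // e.g. ["Manuf 1", "Manuf 2"]
  models: string[];
}

// An empty filter list means "no constraint"; otherwise the lists must intersect.
const intersects = (values: string[], filter: string[]) =>
  filter.length === 0 || values.some(v => filter.includes(v));

function partsForReport(
  movements: PartMovement[],
  parts: Map<string, Part>,
  filters: ReportFilters,
): string[] {
  // "Search for Part Movements" with Qty > 0 and Condition = Type dropdown
  const matching = movements.filter(
    m => m.qty > 0 && (!filters.type || m.condition === filters.type),
  );

  // "each item's Part : unique elements"
  const uniquePartIds = Array.from(new Set(matching.map(m => m.partId)));

  // ":filtered" - the advanced filters that intersect the list fields
  return uniquePartIds.filter(id => {
    const part = parts.get(id);
    if (!part) return false;
    return (
      intersects(part.applicableManufacturers, filters.manufacturers) &&
      intersects(part.applicableModels, filters.models)
    );
  });
}
```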

All of this first step is used to display the result in a 1px by 1px repeating group (rp_worker_reportParts), which seems to load significantly faster than applying those unique IDs to the Report thing directly. :man_shrugging:
Once that’s done, we create the Report thing (step 2) and give the report a name.
In Step 3, we “make changes” to the report (created in step 2), and set the Report’s Parts field (text) to rp_worker_reportParts’s list of texts. Note here that the Parts field is a simple text field, not a list of texts. So we now have a Report thing with its Parts field set to a whole bunch of unique IDs as text, separated by commas.

Step 4 (which I won’t get into in too much detail here unless you need it) adds the generation of the report to a queue, so no two report generations run simultaneously. This is mostly a safeguard in case multiple users try to run really large reports, to ensure we don’t run too many workflows at once and risk timing out.
The queue is a Thing called ReportsQueue with the ID of the report, a scheduled time, a start time and an end time. Maybe it would work better on a higher tier plan, but for now I’d rather apply this restriction. With some backend triggers, the reports launch one after the other, and we indicate to the user which position in line their report is across all users.
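
For what it’s worth, the logic behind the queue is roughly the following (simplified, and the names are just for illustration):

```typescript
// Sketch of the ReportsQueue logic: one report runs at a time, oldest first.

interface ReportsQueueEntry {
  reportId: string;
  scheduledAt: Date;
  startedAt?: Date;
  endedAt?: Date;
}

// Position in line shown to the user: entries scheduled before mine
// that haven't finished yet.
function positionInQueue(queue: ReportsQueueEntry[], reportId: string): number {
  const mine = queue.find(e => e.reportId === reportId);
  if (!mine) return -1;
  return queue.filter(
    e => !e.endedAt && e.scheduledAt.getTime() < mine.scheduledAt.getTime(),
  ).length;
}

// Stand-in for the backend trigger: when nothing is running, pick the
// oldest entry that hasn't started yet.
function nextToStart(queue: ReportsQueueEntry[]): ReportsQueueEntry | undefined {
  const anyRunning = queue.some(e => e.startedAt && !e.endedAt);
  if (anyRunning) return undefined; // only one report generation at a time
  return queue
    .filter(e => !e.startedAt)
    .sort((a, b) => a.scheduledAt.getTime() - b.scheduledAt.getTime())[0];
}
```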

Back-end Workflows
Workflow 1, setup
Once the report is launched from the queue, the first of three backend workflows is triggered.

Steps 1 and 2 are simple data changes: step 1 sets the start time of the generation (mostly for debugging purposes), and step 2 sets the number of workflows we’ll run so we can show the user a % completed.

In step 3, we trigger the second workflow, which is the recursive workflow.


We pass as parameters: the report’s unique ID, the first part from the Report’s Parts field (split by the comma), and the type of report (as a helper to filter the Part Movements easily later on). Remember that the part we’re passing is only the unique ID, to keep the speed up! Text is faster to pass through than a whole Part thing.

Workflow 2, creating the report’s line items recursively


The second, recursive workflow works on a single part at a time.
In Step 1, we create a new “Report Line Item” Thing. We link the Line Item to the Report and the Part simply by using “Search for” with the unique IDs we passed as parameters.

In Step 2, we do a “search for” all the Part Movements relevant to the part we’re working with. We only want Part Movements that have some quantity, so we filter on the condition (option set filtered to match the type parameter we passed) and constrain to the part we’re working with. (For readability, we don’t do another search, we just use the result of Step 1 - the Line Item Thing’s part. I didn’t see much speed difference between doing another search and getting it from the Thing, although both were tested.)

In Step 3, we set all of the Line Item’s fields to the sums and calculations we need from the Part Movements we gathered in Step 2. Again, you’ll see some “:filtered” here depending on the field we’re setting, because we’re getting the Part Movements from what we set on the Line Item Thing rather than searching for them. The speed difference between the two approaches was negligible here as well, probably because we’re dealing with a fairly short list of Part Movements for any one part. If you had more than 20-30, and into the hundreds, a search would probably be better.

In Step 4, we add this Line Item to the Report’s Line Items field (a List of Line Items), and we remove the part we just worked on from the Report’s Parts field.

In Step 5, if we still have Parts in the Report’s Parts field, we repeat the second workflow with, again, the first Part in that list (which is now the second overall, since we already removed the first in Step 4!).

If we don’t have any more parts, Step 6 triggers instead of 5, and we trigger the third workflow.
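
Putting Workflow 2 together, the loop looks roughly like this in code. Everything below is a hypothetical stand-in: the helper functions represent Bubble’s “Search for”, “Create a new thing”, “Make changes” and “Schedule API workflow” steps, not real APIs.

```typescript
// Sketch of the recursive Workflow 2 loop over the Report's Parts text field.

type Condition = "New" | "Used";

interface Report {
  uniqueId: string;
  partIds: string;       // the Parts field: comma-separated unique IDs in a plain text field
  lineItemIds: string[]; // the Line Items field
}

// Hypothetical data-access / scheduling helpers (placeholders only).
declare function findReport(reportId: string): Promise<Report>;
declare function movementsForPart(partId: string, type?: Condition): Promise<unknown[]>;
declare function createLineItem(reportId: string, partId: string, movements: unknown[]): Promise<string>;
declare function saveReport(report: Report): Promise<void>;
declare function scheduleWorkflow2(reportId: string, partId: string, type?: Condition): Promise<void>;
declare function scheduleWorkflow3(reportId: string): Promise<void>;

async function workflow2(reportId: string, partId: string, type?: Condition): Promise<void> {
  const report = await findReport(reportId); // "Search for" by the unique ID parameter

  // Steps 1-3: create the Line Item and set its totals from this part's movements
  const movements = await movementsForPart(partId, type);
  const lineItemId = await createLineItem(reportId, partId, movements);

  // Step 4: add the Line Item, and pop this part off the Report's Parts text field
  report.lineItemIds.push(lineItemId);
  report.partIds = report.partIds
    .split(",")
    .filter(id => id && id !== partId)
    .join(",");
  await saveReport(report); // "Make changes to Report"

  // Steps 5/6: reschedule with the next part, or hand off to Workflow 3
  const remaining = report.partIds.split(",").filter(Boolean);
  if (remaining.length > 0) {
    await scheduleWorkflow2(reportId, remaining[0], type);
  } else {
    await scheduleWorkflow3(reportId);
  }
}
```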

Third Workflow
Workflow 3 finalizes the report.


In step 1, we set the grand totals of the Report by doing a whole bunch of sums on the relevant fields across all of the Report’s Line Items.

In steps 2 and 3, we set the finish time of the report so the backend triggers can take over and start the next record in the queue, if any.

Step 4 (barely pictured) launches another backend workflow to generate a PDF of the report, but that’s for another topic! If you’re curious though, I’m using PDFMonkey with some JavaScript to generate the payload that’s sent over the API to PDFMonkey. The payload is created using the Report and its Line Items. If anyone wants more details on this, let me know and I can make another detailed post, but I’ll keep it out of here since it’s not relevant to this one.

I’m putting all of that here in case it helps anyone else, or in case some of the brilliant minds around here have suggestions to make this better or faster. I wouldn’t consider myself an expert by any means, just stubborn enough to spend far too long making things work! I’m sure there are better approaches that I haven’t thought of or don’t know enough to be aware of.


@aiancu very interesting. Thanks for sharing!

I was wondering, why do you not use custom states? Perhaps it would be quicker.