Efficiently Bulk relate Two Data Types Uniquely

DanielH · November 15, 2024, 9:04pm

Hi,

You have thousands of Users (or any other Thing) in one table
You have tens of thousands of random Words in another table
You want to randomly assign each User one Word and you can only use each Word one time

How would you do this efficiently?

Assume the Word thing has a simple yes/no field to track whether it has been used or not.

I realize one option is to simply export the Users into a CSV file, assign the codes in Excel or similar, then reimport them. Just curious if there is a way to do it inside Bubble without using tens of thousands of WUs, as I can imagine the solution to this might be useful in other situations.

DanielH · November 20, 2024, 3:01pm

I saw this thread from adamhholmes that leads me to think using ‘Make Changes To A List Of Things’ is the way to go, but I’m not sure how to iterate through a long list of Users and have it to modify two things at the same time in each iteration (since both the User and the Word Thing need to be updated, else the Word may be accidentally used a second time later on).

Before seeing Adam’s post, I ran some tests using a few variations on a recursive ‘schedule API’ that all worked, but used something like 83 WUs per iteration! Obviously, this will not scale to thousands of individual Users/Words.

Made me learn the hard way how fiddly it is to both find and stop these large processes on the backend, across live and test. There should be more documentation on this (panic sets in pretty quick when your watching your WUs continue to count down after you think you’ve stopped the offending workflow!).

For context, here is the workflow that racked up huge WU usage very quickly–so definitely don’t do this.

Translating from the original post: “User” from the original post = “Company” in the screenshots and “Word” = “RandomValue”

The on-page button to start the workflow:

The backend workflow:

Hoping someone is willing to advise. Else, maybe this post helps someone else know what not to do.

DanielH · November 20, 2024, 10:46pm

Hi, thanks for your reply, and welcome to the forums!

They only way I’ve managed to get this task to assign the Words in a unique manner (meaning one Word is assigned to exactly one User) is by using a recursive workflow. This is the example I shared images of above–very WU inefficient. But I’m not sure exactly which part of that workflow is the inefficient part (and thus what an efficient alternative may be).

The order of the Words in the database is already random and each Word is unique–so returning any Word for any one User is fine. But when I make the API workflow a “Schedule API workflow on a list”, it is very efficient…but results in duplicates: the same Word being assigned to multiple Users.

DanielH · November 21, 2024, 3:35pm

Because it might be interesting/relevant for someone else, I did some small scale testing assigning 4 Users a Word each in two slightly different recursive workflows. It is a very small test, but its a real pain following the logs in Bubble (then resetting the data to try another way), so this is what we are going with.

In the first, I used the screenshots above, where the last step of the workflow, which schedules the next ‘loop,’ uses “minus item” to remove reduce the size of the two lists (Users and Words).

Reducing List Size each Loop
Initial WF: 1.43 (Button press and Search)
Then WUs per iteration decrease as list decreases in size:
3.03 → 3.00 → 2.97 → 2.93 + 1.2 to end overall (action condition failed).
= 14.56 total WUs

In another test, I instead used an “index” method, where the list stays the same size, but I increment an index value on each ‘loop’ (this is the method AirDev recommends in their documentation/guides).

Incrementing an Index each Loop
Initial WF: 1.43 (Button press and Search)
Then WUs per iteration are lower, but do not decrease:
2.93 → 2.93 → 2.93 + 2.23 (action condition failed)
= 12.45 total WUs

Interesting the index method is fewer overall WUs for the same result and that reducing the list size each loop doesn’t decrease linearly.

The decreasing loop numbers are close enough though, so perhaps its a random server variation and .03 is the additional cost of each item on the list. If that is the case, then a list of 1,000 items would add around 30 WUs to whatever the first loop costs to process on its own.

I may try passing the unique IDs of the Things to the backend WF instead of the Things themselves and see if that reduces the WU usage.

For completeness, the backend workflow being triggered:

Modifies 3 fields on the Word Thing
Modifies 1 field on the User Thing
Increments the index (index + 1)
Conditionally schedules the next loop while index < count (count = # of items, given by the frontend button that calls the workflow)

Topic		Replies	Views
API workflow to work on multiple lists APIs	1	104	May 13, 2024
Run mode import of data APIs	3	465	April 1, 2018
Randomly Match Users Together Weekly APIs	7	1065	March 27, 2019
[Feature Enhancement] Improved Scalability of Schedule API Workflow on a List New features	42	9269	May 1, 2024
How to create a Thing, multiple times? Need help	26	9407	August 4, 2020

Efficiently Bulk relate Two Data Types Uniquely

Related topics