Looping Is Creating Duplicate Records

I’ve been seeing a new concern in a workflow that I thought I had resolved in the past; hoping this audience can help me identify where I need to change my implementation…

In short, I am importing data from a spreadsheet and my import process is attempting to normalize the data stored in the database. For example, if there is a “john smith” listed multiple times in the spreadsheet, I only want one “john smith” created and then related records associated to this one “john smith” recipient record.

I built the process sometime back and thought I had a solid running process (i.e. my logic works) in the past, however, my recent tests are not consistent; feels like something changed with execution handling of workflows in bubble or the number of records in our table is causing slower queries for conditional checks.

The first screenshot is the workflow that triggers the process; in workflow the concerning bit is occurring in “Trigger GetRecipient” custom event. When called, the Get Recipient Event (screenshot further down) should return a matching record if it exist or create a new recipient and return the record to the workflow.

When running through a loop of records in an import process, I will see that sometimes (50’ish %)this process works perfectly and other times the process is creating duplicate recipient records.

Is there a better way to implement a “create a new record, only when a match isn’t found” in a looping process? I thought by using Custom Events" for the record creation, Bubble would not have sequencing issues (i.e. parallel request processing vs sequential)

This is the custom event… The search logic in the conditions are correct and work as expected (when they work)

Hi @nilsobrien

Can you trigger the flow and document your logs. I’ve had issues with searches originating in the backend in recent days but I managed to resolve it.

My issue looked something like the attached. As you can see there seems to be a random ‘_LOOKUP_’ text embedded in the searched values unique_id which creates a mismatch. Still not sure what causes that because its not privacy rules, so although I’ve resolved the issue I’m still waiting on Bubble specialised team to explain this because the product specialist couldn’t.

Thank you @QinisoG … I see that in my logs, but those don’t seem to be related to the areas of operations that my process is having an issue with.

I just reviewed logs again and I agree. But as I said I had a fix, but I can’t explain why the fixed worked which is why I’m still waiting on Bubble.

Can you show me the fields you use to constrain the search for the records that get duplicated, or are there no constraints inside of the search?
Do you have multiple instances of ‘create a new thing’ for the duplicated records and if so where are those actions?

Its not entirely clear to me whether its a slow/mismatched search or whether you are unknowingly triggering multiple instances of the same action.

These are the constraints used for my ‘recipient’ record. I’ve adjusted the trigger to be called one time, vs triggered by changes to the table which I suspected was contributing to the problem… However, I’m still seeing the inconsistent results.

What’s in the custom event GetTract, because I see step 4 is essentially the same action (with the same conditional) as step 1 for custom event GetRecipient but its not inside of the GetTract custom event as one would expect.

I can’t really provide meaningful input unless I have enough context.

Triggering a custom event in a workflow won’t make the workflow wait for an independent workflow that is already asynchronous. Database trigger events are run asynchronously (in parallel), as implied by the docs in Database trigger event | Bubble Docs

Note: If you have more than 20 database triggers that kick off at once, the remaining triggers will be scheduled to protect your app’s infrastructure from consuming too much memory.

From here I assume the issue is caused by two or more workflows at nearly the same time triggering GetRecipient custom event, and both have no results when searching for an existing recipient.

I can think of two approaches to a solution:

A. Have the “create recipient” step be done by a singleton , i.e. only one process running at a time. This is kind of hard to do in Bubble backend, usually with the help of an external database. I won’t go into more details as I think option B is better.

B. Have the “create recipient” step be done by asynchronous workflows as now, followed by scheduling another backend workflow to fix the duplicates.

The tricks with this fixup workflow are:

  • Delay the workflow long enough to have the search definitely find at least any recipient created before the one in this workflow. Any created later will be fixed by the later fixup.
  • Structure the fixup so that it doesn’t matter if it is run multiple times.

An example fixup workflow could be:

  1. Search for recipients matching the criteria, order by created date. We want this to match all duplicates.
  2. Search for data linked to recipients in list from step 1’s recipients, skipping the first recipient from 1.
  3. Alter the data to link to the first recipient from 1.
  4. Delete all but the first recipient from 1.

Edit - steps 2 to 4 not needed if step 1 list is only one recipient.

Another approach is to sort the data by recipient, then split into two spreadsheets, the first containing the first records for each recipient, the second spreadsheet containing subsequent records.

Then delay the import of the second spreadsheet until after the first has processed and all recipients have been created.

Have you tried slowing the speed of the loop. So instead of having an empty interval, I would set it to 1/2 seconds.

Bubble has historically not been very reliable when handling changes to lists, which is why one generally has to have an action that checks the integrity of the changes. Slowing down the speed of the looping process in my experience has made the changes made to lists more reliable.

Another thing to consider is the fact that you have 7 constraints on your search, Bubble often misses a couple of fields when making changes to a list. Having 7 constraints on your search means that if for example you already created a particular muOfferRecipient before, but Bubble failed to accurately modify even one field that you use as a constraint, the search will come back empty(despite ignore empty constraint being check because this is ultimately a check on the parameter values). Therefore a duplicate record will get created not because that record does not exist but because one field in your constraint did not get modified correctly the first time around.

So the inconsistent results could be as a result of occasional inaccurate data capture.

I would limit the search constraints to just one or two fields that are expected to hold unique values if possible, as a bad example using the recipient street address. If you can modify your spreadsheet and add 1 column/field, that will hold a unique field and use that as your search constraint in isolation while running a slower loop (if possible), you should manage to fix your issue.

But again, if I have a little more information I can probably narrow down the issue more :slight_smile: . If you uncomfortable sharing too much publicly, you can send me a dm.