So this is a four-step recursive workflow.

Step 1 creates the retailer-listing from the data in another data type. At the end of this workflow, it triggers the next step after a 2-second wait.

Then comes step 2, which waits one second before triggering the next step. I figured the wait is not strictly necessary, since the trigger shouldn’t fire until the first event in the workflow is complete, and the next step doesn’t rely on any finished data entries from this step anyway.

3rd Step

And finally the fourth step, where the final event in the workflow is the recursive portion: it triggers the first API in the series.
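To make the chain easier to reason about, here is a rough Python model of it. Bubble workflows are not written in code, so this is only an illustration and nothing here is a Bubble API: `run_chain`, `DELAYS`, and `fail_at` are made-up names, and the wait times for steps 3 and 4 are assumptions (only the 2-second and 1-second waits for steps 1 and 2 are from the setup above). It runs on a virtual clock, so nothing actually sleeps.

```python
import heapq

# Seconds to wait before triggering the next step.
# Steps 1 and 2 match the setup described above; steps 3 and 4 are assumed.
DELAYS = {1: 2.0, 2: 1.0, 3: 1.0, 4: 1.0}


def run_chain(source_items, fail_at=None):
    """Simulate the four-step recursive chain on a virtual clock.

    fail_at is an optional (item_index, step) pair at which the scheduled
    follow-up is silently dropped, mimicking a trigger that never fires.
    Returns (items_fully_processed, log), where log holds
    (run_at, item_index, step) tuples in execution order.
    """
    if not source_items:
        return 0, []
    queue = [(0.0, 0, 1)]  # (run_at, item_index, step)
    log = []
    done = 0
    while queue:
        run_at, index, step = heapq.heappop(queue)
        log.append((run_at, index, step))
        if (index, step) == fail_at:
            break  # the next trigger is lost, so the chain stops here
        if step < 4:
            # Steps 1 to 3 each schedule the following step.
            heapq.heappush(queue, (run_at + DELAYS[step], index, step + 1))
        else:
            # Step 4 is the recursive part: it schedules step 1 for the next item.
            done += 1
            if index + 1 < len(source_items):
                heapq.heappush(queue, (run_at + DELAYS[4], index + 1, 1))
    return done, log
```

Calling `run_chain(items, fail_at=(1, 4))` leaves step 4 of the second item as the last log entry with nothing scheduled after it, which is the same shape as a run whose final recorded workflow is the fourth step.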

This fourth step in the series is actually the last workflow that ran, starting at 1:04:55 AM, which leads me to believe that its final event, the one that triggers the first API in the series, did not occur for some reason.

What would a manual error-checking workflow setup entail? I can only imagine that the manual error-checking workflow would fail at the same time, but I really don’t have a clue what the setup of such a thing would look like.
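For what it's worth, one common shape for this kind of check is a separate, independently scheduled "watchdog" that only looks at when the chain last made progress and restarts it if it has been quiet for too long. The Python below is just a sketch of that idea under my own assumptions: `is_stalled`, `watchdog_tick`, the `state` dictionary, and the 5-minute threshold are all hypothetical, and it says nothing about how Bubble would actually schedule or run it.

```python
from datetime import datetime, timedelta


def is_stalled(last_step_at, now, max_gap=timedelta(minutes=5)):
    """True if no step has run for longer than max_gap (threshold is assumed)."""
    return now - last_step_at > max_gap


def watchdog_tick(state, now, restart):
    """One run of the checker.

    state is a dict with:
      'last_step_at': datetime of the most recent step that ran
      'remaining':    number of source items not yet processed
    restart is a callable that re-triggers step 1 of the chain.
    Returns True if a restart was issued.
    """
    if state["remaining"] > 0 and is_stalled(state["last_step_at"], now):
        restart(state)
        state["last_step_at"] = now  # avoid restarting again on the next tick
        return True
    return False
```

Because the checker runs on its own schedule rather than being triggered by the chain, a dropped trigger inside the chain would not by itself stop the checker, although it could of course still be hit by the same kind of intermittent failure.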

But what @vini_brito mentioned about an auxiliary workflow seems like a dangerous ‘backup’ plan, as I imagine it would cause data redundancy and would still be susceptible to these same types of failures.
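The redundancy worry is usually handled by making the create step safe to run twice: before creating an entry, check whether one already exists for the same source record. A minimal Python sketch of that idea follows. The names `create_listing_once`, `listings`, and `source_id` are hypothetical, and in Bubble this would presumably be a search plus an "only when" condition rather than code.

```python
def create_listing_once(listings, source_id, data):
    """Create a listing keyed by source_id unless one already exists.

    listings is a dict standing in for the retailer-listing data type.
    Returns True if a new entry was created, False if it was skipped.
    """
    if source_id in listings:
        return False  # a previous run already handled this source record
    listings[source_id] = dict(data)
    return True
```

With a guard like this, a backup workflow that re-processes an item the main chain already finished does nothing instead of creating a duplicate.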

It just really makes me nervous about trying to build a business that relies on data integrity, where things like invoices, inventory updates, etc. could fail and I would have no real understanding of why the failures take place, especially when they cannot be reproduced and are therefore not supported by Bubble support.

As @eve mentioned

Before posting this, as I already stated, I started the recursive workflow again. At this point the total number of entries is 4,493, so it is pretty clear I can’t even reproduce the failure, and with no indication from the server logs of a possible cause, it seems impossible to figure out.

The same situation was true of my first post on this subject a couple of months ago. I had the failure during the night…came back the next day, ran the recursive workflow again and it worked.

So these seem to be the intermittent, non-reproducible, and unexplainable kind of failures that are dreadful, as I would never really be able to mitigate them unless I resort to a dangerous backup plan of auxiliary workflows. But I really shouldn’t need a ‘duplicate’ workflow in case of failure.

Even so, I guess I will file a bug report, as that seems to be the only way to hope for an explanation and a resolution that avoids this in the future.