Last night I set up a recursive workflow to run on over 5,000 entries in development mode. The workflow was set up correctly: before I logged off, it had run successfully around 600 times.
I fully expected to log in today and see that, if not all of the entries, at least a large chunk had completed. Instead, I see that only 617 were completed.
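For anyone unfamiliar with the term, here is a rough Python model of what I mean by a recursive workflow: each run handles one entry and then schedules the run for the next entry. This is only a sketch under my own assumptions, not Bubble's actual implementation; `run_chain`, `process` and `flaky` are hypothetical names, and the failure is made up. What it shows is that in a chain like this, a single run that fails to schedule its successor ends the whole chain.

```python
def run_chain(entries, process):
    """Model of a recursive workflow: handle one entry, then schedule the next.

    Hypothetical sketch, not Bubble's implementation. Returns how many
    entries were completed before the chain ended.
    """
    completed = 0
    for entry in entries:
        try:
            process(entry)
        except Exception:
            # In this model a failed run never schedules its successor,
            # so the whole chain stops here without retrying.
            break
        completed += 1
    return completed


def flaky(entry):
    # Made-up failure on one entry, purely for illustration.
    if entry == 4:
        raise RuntimeError("run did not finish")


print(run_chain(range(1, 11), flaky))  # prints 3
```

In this toy run, entries 1 through 3 complete and everything after entry 4 is never attempted, even though nothing was overloaded.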
So I went to the server logs to see what could have caused this failure.
This is the last registered workflow, which started running at 1:04:55AM.
First I looked at capacity usage, because if I were maxing out capacity, that would be an obvious reason for the recursive workflow to fail.
But the chart clearly shows that around the time the last workflow was triggered (roughly 1:05AM), there were no max_out_times (0) between 1AM and 1:15AM.
Checking the other graphs in the server logs only makes it more confusing why this recursive workflow failed.
Could it be that for some reason the number of page views was so ridiculously high that something occurred?
No, because there were 0 page views between 1AM and 1:15AM
Looking at the workflow runs chart, you can see things were going well, and then there was a drop-off.
Why such a dramatic decrease? Shouldn’t the recursive workflow run at the same interval, so that there is a similar number of workflow runs in each period while it is running?
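To make that expectation concrete, here is a small Python sketch that counts runs per 15-minute window, the same granularity the log charts use. It assumes the run start times can be copied out of the logs somehow, which I have not verified. The function name `runs_per_window` is hypothetical, and the timestamps are made-up sample data, not my real log.

```python
from collections import Counter
from datetime import datetime, timedelta


def runs_per_window(timestamps, window_minutes=15):
    """Count runs per fixed window; window_minutes should divide 60 evenly."""
    counts = Counter()
    for ts in timestamps:
        # Floor each timestamp to the start of its window.
        floored = (ts.minute // window_minutes) * window_minutes
        counts[ts.replace(minute=floored, second=0, microsecond=0)] += 1
    return dict(sorted(counts.items()))


# Made-up sample: one run per minute from 00:30 through 01:04, then nothing.
start = datetime(2020, 1, 1, 0, 30)
sample = [start + timedelta(minutes=i) for i in range(35)]

for window_start, count in runs_per_window(sample).items():
    print(window_start.strftime("%H:%M"), count)
```

With a steady interval, every full window should show roughly the same count. A window that comes up short, followed by no windows at all, marks where the chain stopped.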
What other goodies do we get in the server logs to help explain issues like this? The server capacity chart would be a good place to see whether my usage, although not maxing out capacity, might help explain it.
Okay, so usage against available capacity was 11.44% at 1AM. Could that really cause the issue? I hope not; using only 11.44% of server capacity shouldn’t cause the app to fail. Then at 1:15AM it was 0.04%, because by that time the failure had already taken place.
So I figured there must have been something wrong with my data, because something off in the data could have been the reason for the failure.
When I logged in today, the number of entries stood at 617. Right before I began writing this post, I started the recursive workflow again, and now the number of entries is 748. The workflow got past the point where it stopped, so there is no problem with my data.
So, what caused this failure?
Could the main cluster have had downtime? That must be it.
Oh, no downtime recorded. So what would be the reason that this recursive workflow failed to continue?
@eve any ideas of who could explain this? I asked in the past on a similar thread and got no response.
@DavidS is there any particular department at Bubble with members who could help explain these types of issues, or look into them? I am sure it is not really a bug that could be ‘reproduced’ so I feel like the normal bug report would not be a worthwhile channel to report to.
@allenyang I’ve seen that you have posted in the forum about performance issues. Is this type of issue something you may be able to shed some light on?
I’ve experienced this enough times to be very concerned about it.