I realise there’s a topic with a near-identical problem but it was a couple of years ago and is still unresolved so I’m trying again.
I’ve built a recursive workflow that loops through a list of 400,000 items and updates each one. Each iteration performs one search to identify the item to be updated, and one search to find the data required to update it.
However, the workflow stops running at a seemingly random iteration. I’ve ruled out any faults in my logic. Checking the capacity graph, there’s nothing to suggest I’m maxing out. The logs don’t show an error of any sort.
The first time it stopped on iteration 3000+. Most recently, it stopped on iteration 23.
To me, it looks like it’s “timing out”. Maybe the workflow is too heavy for it to finish without a problem… it doesn’t necessarily need to reach max capacity for that to happen.
“The general rule is, change list of things is fast + good for relatively small (say, 100 items or less) lists. API workflows are a little slower to get through, but they are more scalable and reliable for large lists… if you have a change list of things operating on too big a list, it might time out / fail to finish and break the workflow.”
Email support and they can look into your specific situation with respect to capacity. They will ask when you were running the workflow and for specifics of the scenario, to try to narrow down the issue. The capacity graph should show when you hit the limit, but we have found that it doesn’t always.
Note that you are not the first person to run into this particular issue with workflows dying without any message or indication and with no hint that capacity is the problem. We have had to rework workflows a number of times due to this problem.
I would suggest emailing support first to see if they can assist, and if not, reworking the workflow.
Probably not relevant, but I had a similar issue which was due to my conditionals. It kept calling the same two records as they failed for the particular condition. Once I fixed the actual records, which I found by refreshing the schedule log, things worked normally.
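To illustrate the kind of loop described above, here is a rough Python sketch (not Bubble code). It assumes each run searches for the first record that has not been updated yet, so a record that never passes the condition gets picked again on every run. The names simulate_runs, stuck_records and passes_condition are made up for the example, and the log list stands in for the schedule log.

```python
from collections import Counter


def simulate_runs(records, passes_condition, max_runs):
    """Simulate repeated runs that each pick the first not-yet-updated record.

    Returns the list of records picked, in order (a stand-in for the log).
    """
    done = set()
    log = []
    for _ in range(max_runs):
        pending = [r for r in records if r not in done]
        if not pending:
            break  # everything was updated, the loop ends normally
        current = pending[0]
        log.append(current)
        # Only records that pass the condition are marked as updated,
        # so a failing record stays first in line forever.
        if passes_condition(current):
            done.add(current)
    return log


def stuck_records(log, threshold=2):
    """Return records that appear in the log more than `threshold` times."""
    counts = Counter(log)
    return sorted(r for r, n in counts.items() if n > threshold)


log = simulate_runs(["a", "b", "c"], lambda r: r != "b", max_runs=6)
print(log)
print(stuck_records(log))
```

In this sketch, record "b" never passes, so "c" is never reached. Counting repeats in the log is the same idea as refreshing the schedule log and spotting the records that keep coming back.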
Oh, I forgot to tell you, I also had this problem (my scheduled API workflow was timing out), so I changed the order of the steps: I put “schedule API workflow” as step 1, followed by the others. That way, even if a run timed out, the next one was already scheduled and the loop kept going.
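Here is a small Python simulation of why the step order matters (a sketch only, not Bubble code). It assumes a run that times out simply stops, and that anything scheduled before the timeout still runs. That matches what I saw, but I can't promise Bubble always behaves that way. The names run_chain, StepTimeout and fails_on are made up for the example.

```python
from collections import deque


class StepTimeout(Exception):
    """Stands in for a workflow step that times out (hypothetical)."""


def run_chain(total_items, fails_on, schedule_first):
    """Simulate a self-rescheduling workflow.

    Returns the list of iteration indexes that actually started.
    """
    queue = deque([0])  # scheduled runs waiting to execute
    started = []
    while queue:
        i = queue.popleft()
        started.append(i)
        has_next = i + 1 < total_items
        try:
            if schedule_first and has_next:
                queue.append(i + 1)  # step 1: schedule the next run
            if i in fails_on:
                raise StepTimeout(i)  # the heavy search/update step dies
            if not schedule_first and has_next:
                queue.append(i + 1)  # last step: skipped after a timeout
        except StepTimeout:
            pass  # the run ends silently, with no error in the logs
    return started


print(run_chain(5, fails_on={2}, schedule_first=False))  # [0, 1, 2]
print(run_chain(5, fails_on={2}, schedule_first=True))   # [0, 1, 2, 3, 4]
```

Note that with scheduling first, the item in the run that timed out may still be left un-updated, so it is worth checking for skipped items afterwards.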