TL;DR: Recursively scheduling API Workflows - More reliable, slower. Schedule API Workflow on List - Less reliable, quicker on short lists.
Yeah - As a long-time Bubbler, I’ve never learned to trust API workflows on a list. I feel the pain here!!
What helped me stop these incomplete lists: Once we gained the ability to loop/recursively schedule an API Workflow (inside the API Workflow, you add a Schedule API Workflow action that schedules the same Workflow and feeds it the same parameters), I switched to that - And it has never failed me. It seems slower (not scientific)… But in my experience it has always run to completion!
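For anyone who finds the pattern easier to read as code, here is a rough sketch of the idea in plain Python. This is only an analogy, not Bubble's API: the names `process_item`, `recursive_workflow`, and `run_scheduler` are made up for illustration, and the `index` parameter stands in for whatever you pass along to track progress. Each run handles exactly one item and then schedules the next run, so nothing is fired off in bulk.

```python
from collections import deque


def process_item(item, results):
    # Stand-in for the real work done on one list item (hypothetical).
    results.append(item.upper())


def recursive_workflow(items, index, results, queue):
    """One run: handle a single item, then schedule the same workflow again."""
    if index >= len(items):
        return  # termination condition: nothing left, so do not reschedule
    process_item(items[index], results)
    # "Schedule" the same workflow with the same list and the next index.
    queue.append((items, index + 1))


def run_scheduler(items):
    """Toy scheduler that runs queued workflow calls one at a time."""
    results = []
    queue = deque([(items, 0)])
    while queue:
        pending_items, index = queue.popleft()
        recursive_workflow(pending_items, index, results, queue)
    return results


processed = run_scheduler(["a", "b", "c"])
```

The important part is the termination check: without it the workflow would keep rescheduling itself forever. Because each step only starts after the previous one finishes, the runs are sequential, which fits with it feeling slower.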
[New Feature] Scheduling API workflows can now be done recursively ← Great resource that goes into detail.
MY TAKE (opinion based on my experience) - May or may not be correct: I’ve put arrows on what I thought are the most important pieces - Schedule API Workflow on List appears very reliant on capacity and is sensitive to capacity spikes (plenty of forum posts on this) - So if your app hits a capacity spike, it can cause data inconsistencies: the workflow may simply stop or restart itself, potentially leaving gaps in data. From the forum posts I’ve seen over the years, I’d wager these capacity spikes are what cause the data problems. (Expected response: “Well, if I look at my capacity chart, it shows plenty of capacity!” - My response: “It’s difficult to spot a spike since the capacity chart isn’t detailed enough… And I don’t trust the message saying how often my app exceeded capacity. Case in point: view the chart over the past 6 hours, then over the past hour, and you’ll see different %'s…”)
There’s more on this topic, but it gets boring fast so I’ll leave it there.
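To show why a wider chart window can hide a spike, here is a small Python illustration with made-up numbers. It assumes the chart shows an average over the selected window, which is my assumption and not something I know about how Bubble builds its capacity chart. The usage figures are hypothetical: 20% for six hours, except for a 5-minute burst at 100% at the very end.

```python
def average(values):
    # Simple arithmetic mean of a list of numbers.
    return sum(values) / len(values)


# Hypothetical per-minute capacity usage (%) over 6 hours (360 minutes).
usage = [20] * 360
for minute in range(355, 360):
    usage[minute] = 100  # 5-minute spike in the final minutes

six_hour_avg = average(usage)        # whole 6-hour window
one_hour_avg = average(usage[-60:])  # last hour only
five_min_avg = average(usage[-5:])   # just the spike

print(round(six_hour_avg, 2), round(one_hour_avg, 2), five_min_avg)
```

With these numbers the 6-hour view averages about 21%, the 1-hour view about 27%, and the spike itself is 100%. Same data, very different %'s depending on the window, and neither of the wider views looks anywhere near maxed out.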
Overall, try recursively scheduling API Workflows instead of Schedule API Workflow on List… It may work as well for you as it did for me - After all, data integrity is the most important thing, even at the potential cost of speed!
Hopefully this helps.