New workload management tool: Infinite recursion protection

Hi,

I’m Steve, a lead product manager on the Scale team here at Bubble.

Today, we rounded up the tools and resources that can help you plan, track, and optimize your workload more effectively. Read the full summary on the blog here.

As a part of this update, we’re excited to share that infinite recursion protection is now available in all apps.

What is infinite recursion protection?

A powerful new feature that gives you more control over your app’s workload usage. This feature allows you to set an app-level limit on workflow “depth,” which is the number of times workflows schedule themselves (direct recursion) or another workflow (indirect recursion) consecutively. Any workflow that is scheduled beyond the maximum depth you set will be terminated automatically. While workflow-level constraints on scheduling remain the primary method for managing recursion, this feature creates a backstop for the entire app.
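To make the idea of "depth" concrete, here is a minimal Python sketch of a scheduler that enforces an app-level depth cap. It is an illustration only, not Bubble's implementation: the function name `run_with_depth_limit`, the `schedules` mapping, and the choice to count the initial run as depth 0 are all assumptions made for the example.

```python
from collections import deque


def run_with_depth_limit(start, schedules, max_depth):
    """Simulate workflows that schedule other workflows under a depth cap.

    `schedules` is a hypothetical mapping from a workflow name to the
    workflows it schedules when it runs. The initial run is treated as
    depth 0 (an assumption), and each scheduling step adds 1.
    Returns (runs, terminated) as lists of (workflow name, depth).
    """
    runs, terminated = [], []
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth > max_depth:
            # Beyond the cap: the workflow is cut off and schedules nothing.
            terminated.append((name, depth))
            continue
        runs.append((name, depth))
        for scheduled in schedules.get(name, []):
            queue.append((scheduled, depth + 1))
    return runs, terminated


# Direct recursion: workflow A keeps scheduling itself.
runs, terminated = run_with_depth_limit("A", {"A": ["A"]}, max_depth=10)
print(len(runs), terminated)
```

In this model a runaway workflow stops after a bounded number of runs instead of looping indefinitely, which is the backstop behavior described above.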

When a workflow is terminated as a result of this limit, it’s logged in the server logs under Workflow errors. Coming later this summer: automated email notifications for terminated workflows.

Getting started:

  1. Find the feature in the API subtab under Settings.
  2. If you’re not using recursive workflows, we recommend starting with the default limit of 10.
  3. If you are using recursive workflows, set a limit that accommodates your longest intentional recursive workflow.

Important dates:

  • Today, June 27: You can opt to enable this feature in all existing apps.
  • July 1:
    • All new Bubble apps will have this feature enabled with a default limit of 10.
    • Existing apps (created before July 1) won’t have any limit applied automatically, but you can enable it at any time.

Check out the article in the Manual for a deep dive and let us know if you have any questions or feedback!

– Steve

18 Likes

Today is THE workload unit update :rofl:

I can see the default limit of 10 causing issues with people being confused, but the benefits probably outweigh the downsides there. There’s lots of now-outdated guidance here, written before SAWOL got upgraded, about how recursive workflows are the best way to approach large data manipulation, so I’m sure we’ll see some people confused about why their workflows stopped, but better that than the ‘why did I cost 3 million WUs’ posts…

6 Likes

Shouldn’t the settings be on the backend workflows themselves? Since one recursive workflow in an app can be vastly different from another, it seems very likely that a user would want the flexibility to put a limit on recursive workflow A of, say, 10,000, while on recursive workflow B they might want 100. It is also worth considering how vastly different one recursive backend workflow can be from another in WU consumption, as the actions they contain can differ dramatically.

Any plans on putting this onto the backend workflow itself rather than app wide? Seems like the logical next step to get this feature to be as robust as a developer would want it to be.

17 Likes

I totally agree with @boston85719. It MUST be on the backend WF itself. I think it should also be possible to set it dynamically.

4 Likes

+1 on that…many use cases for that need

2 Likes

Fantastic!

Coming later this summer: automated email notifications for terminated workflows.

Even better!

Hi @boston85719 ,

This same question has come up internally. So thank you for raising this, as I think this is a good opportunity to share the thinking behind building this protection feature the way we did:

  1. Workflow-level limits are already supported with constraints on the scheduling action. Many cases of accidental spikes involve cases where there are no constraints put in place at all, so it’s not clear that adding another optional setting on the scheduling action or an individual workflow would be effective in preventing these spikes.

  2. Workflow-level limits would not be able to cover indirect recursion cases, where multiple workflows are involved in a recursive loop (often unintentionally).

  3. Providing a limit to control recursion on the workflow itself essentially becomes a language feature to better support iteration with recursion. But as we think about the best ways for enabling customers to iterate a set of actions over a list, recursion is definitely more of a workaround. We want to focus our efforts on providing simpler, more intuitive, safer, and more performant ways to implement this kind of logic in Bubble, which effectively means moving away from recursion completely. I understand this may be an unsatisfying answer because recursion is so widely used today (for good reasons in many cases), but I think it’s the right approach for our product and customers in the medium to long term.

3 Likes

I’m happy to read the first part… but not the last part. This needs to be improved in the short term (before the new WU plans become obligatory).

Very true, and as a user I feel it is the right decision to spend dev time elsewhere :slight_smile:

1 Like

this sounds good, and a nice starting point would be improving the schedule workflow on a list: providing the index of the current item in the scheduled list would be great. there are workarounds now but they are very ugly.

9 Likes

@steven.harrington

“Which is the number of times workflows schedule themselves” during what time interval?

I say this because I have a workflow that reschedules itself several times during the day… this is not a problem for me. The problem would be if this particular workflow rescheduled itself 50 times in one minute…

What time interval are we talking here?

2 Likes

Yes :heart_eyes:

Awesome new feature! - but as @rpetribu has pointed out these workflows also might be used to run events every hour, day or week that provide business value.

It would be more useful to provide the protection over a timescale (1h or 24h)

+1 on this, and I’d also say with a bit more protections around race conditions as well. Schedule Workflow on a List is so fast, but it’s a shame to have to use workarounds to avoid some of the issues that pop up due to its speed.

There is no limit on the time interval. If you’re using recursion as a workaround for recurring operations (which is fairly common) then there are a few options to consider when implementing your depth limit:

  1. Switch to using recurring events instead of scheduling workflows recursively, as these will not accumulate depth each time they run. If you want to run it more frequently than once per day, you can create multiple daily recurring events.

  2. Plan to manually reset your recursive workflow periodically, and set a depth limit that allows this to occur without interruption. For example, if your workflow runs 4x per day, you may consider setting a depth limit of 5000, and setting a calendar reminder to reset your workflow manually once per year (or even every 3 years).

  3. Set a depth limit that’s so high you won’t have to worry about it. This may or may not be feasible, but if you take the situation in number 2 to the extreme, perhaps setting a depth limit of 50,000 would provide a sufficient level of protection while avoiding any issue with this recurring recursive workflow for the next 34 years or so.

None of these are necessarily perfect solutions, but having a limit of any kind in place protects you against the kinds of spikes we often see, when workflows run 10s or even 100s of millions of times unintentionally before being caught.
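The figures in options 2 and 3 follow from simple arithmetic, sketched below in Python. The helper name `years_until_limit` is made up for this example, and it assumes the workflow reschedules itself at a steady rate with no manual reset.

```python
def years_until_limit(depth_limit, runs_per_day, days_per_year=365):
    """Years a steadily self-rescheduling workflow runs before reaching the limit."""
    return depth_limit / (runs_per_day * days_per_year)


# A workflow that reschedules itself 4x per day:
print(round(years_until_limit(5_000, 4), 1))   # roughly 3.4 years
print(round(years_until_limit(50_000, 4), 1))  # roughly 34.2 years
```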

Thanks, @steven.harrington. One perhaps common recursion use case is bulk create via data API due to its limit of 1000 records. Is there new guidance or perhaps even forthcoming upgrades regarding how to approach bulk creation? Maybe the current default recommendation would be to split the entire batch of records to be created into subbatches of 1000 each, and send the list of subbatches via SAWOL to a workflow that calls the data API?

In explaining that “The maximum number of items that can be created via a single bulk request is currently 1,000” and that “There is also a limit of 4 minutes for the request to complete”, the manual article still references capacity multiple times: “The creation speed and risk of timeout depends on the available capacity that your application has”, “Caution: Bulk creation can consume a lot of application capacity”, and “if your app is being used in production and capacity is limited…”. So, it’s hard to know how to interpret the implications for an app on a workload unit plan and not a legacy capacity-based plan.
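As a rough illustration of the sub-batching idea, here is a minimal Python sketch that splits a record list into chunks of at most 1,000. The function name `split_into_batches` is hypothetical, and the sketch deliberately leaves out the actual Data API request; it only shows the splitting step.

```python
def split_into_batches(records, batch_size=1000):
    """Split `records` into consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


# 2,500 placeholder records become three sub-batches.
batches = split_into_batches(list(range(2500)))
print([len(batch) for batch in batches])
```

Each sub-batch could then be handed to a separate scheduled workflow, which avoids one workflow rescheduling itself for every chunk.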

I hadn’t even thought of that, great catch. That’s such a shame - it means I can’t really use this feature as I never really use recurring events :frowning:

With the years of experience I have developing full time, and having used recursive workflows since 2018, I can say that having the ability to add constraints that make a recursive workflow stop when it should does not mean it always does. Even today I sometimes have issues where a conditional placed on the recursive step of the workflow fails because of issues that were unforeseen at the time of development. Such issues can arise in any project, no matter the experience level of the developer, because, well, things happen.

One example: if the condition that controls the recursion depends on a value from a data field on a data type used in the backend workflow, and that value is for whatever reason not available, an infinite loop can occur.

Some reasons why this may happen:

  1. Data integrity is lost due to factors outside of the developer’s control - something experienced recently by a lot of Bubble apps
  2. Failed webhook receipt - can happen if another developer on the project inadvertently changes the name of a backend workflow that is used by a webhook that is meant to set the value on the data field used in the conditional of the recursive backend workflow
  3. The data field used in the conditional is empty and evaluates to a false positive - in Bubble, if the data field is empty and the developer doesn’t specify in the conditional that the field must not be empty, this leads to false positives in the evaluation
  4. Lots of other real-world cases that can only really be gleaned from experience building in Bubble and seeing how simple mistakes by a developer and/or issues within Bubble can cause the workflow-level limits imposed by the developer via constraints to fail to stop the recursive workflow from running.
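The empty-field case in point 3 can be sketched with a Python analogy. Bubble expressions are not Python, so this is only an illustration of the logic; the field name `status` and both function names are invented for the example.

```python
def should_reschedule_unguarded(record):
    """Continue while status is not "done"; an empty status also passes."""
    return record.get("status") != "done"


def should_reschedule_guarded(record):
    """Continue only when status is present and not "done"."""
    status = record.get("status")
    return status is not None and status != "done"


empty_record = {}  # the field was never set, for example after a failed webhook
print(should_reschedule_unguarded(empty_record))  # True: the loop keeps going
print(should_reschedule_guarded(empty_record))    # False: the loop stops
```

The guarded version is the "field must not be empty" check mentioned in point 3. An app-level depth limit would catch the unguarded version only after the fact.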

Definitely, especially since they plan to set up functions that make recursive workflows unnecessary. But until that day comes, which, from being around Bubble long enough, we know is likely many sunsets away, we should have a feature that enables limits at the workflow level itself, rather than on the app as a whole.

How not? If this is to imply that I have a series of backend workflows like WF A that triggers WF B that triggers WF C which triggers WF A, then yes, a situation like that would warrant having an app-based setting, but I don’t anticipate that the current app-based setting would detect that situation as recursion, since it is not WF A triggering WF A… Plus, when building products that users love, we need to anticipate all users’ needs and potential use cases, which in this situation means the feature should have been at the app level AND on the backend workflow itself.
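For reference, the announcement defines depth as counting workflows that schedule themselves or another workflow, so an A to B to C to A chain is the indirect case it describes. The Python sketch below shows one way a developer might audit a scheduling map for such loops. It is not how Bubble detects recursion; the `schedules` mapping and the function name `find_scheduling_loops` are assumptions for illustration.

```python
def find_scheduling_loops(schedules):
    """Return the workflows that can eventually schedule themselves.

    `schedules` is a hypothetical mapping from a workflow name to the
    workflows it schedules directly.
    """
    def reachable_from(start):
        # Depth-first walk over everything `start` can end up scheduling.
        seen, stack = set(), list(schedules.get(start, []))
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(schedules.get(node, []))
        return seen

    return {wf for wf in schedules if wf in reachable_from(wf)}


# WF A -> WF B -> WF C -> WF A forms a loop; WF D only feeds into it.
loops = find_scheduling_loops({"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]})
print(sorted(loops))
```

None of A, B, or C schedules itself directly, which is why a check limited to a single workflow's own scheduling action would not see the loop.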

Would have loved an announcement of those features rather than this one… Was it not considered a time sink to work on this feature if the team’s priority is providing a simpler, more intuitive, safer, and more performant way to implement the type of logic recursive workflows are used for? Plus, doesn’t it seem safer to put the recursion limit on the workflow itself, so that the developer can choose the appropriate limit for each recursive workflow rather than rely on a broad app-level limit?

Don’t get me wrong, I appreciate the team looking toward ways to alleviate issues that users have with WUs, but seems like the time would have been better spent to make it a workflow based setting rather than app wide.

Yes, and even a simple operator of list is finished or this item is the last item in the list

Yes, and be placed on the workflow level itself instead of app wide, since we likely have some recursive workflows that schedule themselves based on current date and time while others may schedule themselves 2 days later.

yes, even if this leads to a slight decrease in performance, it would be better than data issues. Just like I don’t speed while driving because I ultimately just want to get to my final destination alive and unharmed, the use of schedule backend workflow on a list should be focused on executing the functions and ensuring data integrity and that everything worked as expected, rather than on speed of execution.

If manual interventions are required to make the feature as robust as users need it to be, it sounds like a feature that needs to be updated so that users can actually make use of it… I know I wouldn’t want to receive emails or calls from unhappy clients 12 months after finishing a project because the reminder was not delivered to the client, or the client didn’t know how to change it, or the client lost the video demonstration of how to change it, etc.

I feel like we are grasping at straws to defend the feature. Why should we have to incur the WU costs of 49,998 unnecessary recursions just to get some level of protection from a feature that could be much, much better and more useful to the developer?

I think this feature is more about recursive backend workflows rather than recurring events. So a recursive backend workflow being when we have a backend workflow that will schedule to run itself at the end of the action series to create a sort of loop. The recurring workflows are more about setting up a backend workflow that will run ‘once every x’ where it could be every day, month etc. These two types of backend workflows are materially different from each other and have very varied use cases between the two.

I think this one is really grasping at straws.

2 Likes

@steven.harrington

Nice feature but again please, please, please consult experienced Bubble developers when developing features. Timescale and limit should be at workflow level.

2 Likes