Anyone got any tips for creating a queue of API workflows that run in sequence and not simultaneously? My current use case is using AI to generate sections of a document. The generation uses the content of the previous sections to prompt the generation of the current section.
Users can generate each section one by one, and they should be able to queue up multiple at a time (so if they click generate on sections A, B, and C, section A will be generated, then section B, then section C, rather than each API workflow running as soon as it's scheduled).
A recursive workflow only really works when the user knows how many sections they want to generate when they kick it off - they can't come back two minutes later and add more section generations to the queue.
Interested to hear anyone's workarounds or solutions. I guess I'm looking for a way to put an API workflow queue in a funnel so that only one can run at a time and the next one in the queue starts when the previous one finishes. I'm thinking of something like a 'Queue' data type that is searched for and, if found, scheduled to run when the current API workflow completes?
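To make the behaviour I'm after concrete, here's a rough TypeScript sketch of the funnel idea. Bubble is no-code, so this is purely illustrative; `generateSection` is a made-up stand-in for the API workflow call:

```typescript
// A queue that runs one job at a time; new jobs can be enqueued while it is
// draining and will run after everything already in the queue.
type Job = () => Promise<void>;

class SequentialQueue {
  private queue: Job[] = [];
  private running = false;

  // Add a job; start draining if nothing is currently running.
  enqueue(job: Job): void {
    this.queue.push(job);
    if (!this.running) void this.drain();
  }

  private async drain(): Promise<void> {
    this.running = true;
    while (this.queue.length > 0) {
      const job = this.queue.shift()!;
      try {
        await job(); // the next job starts only after this one resolves
      } catch (err) {
        console.error("Job failed; continuing with the rest of the queue", err);
      }
    }
    this.running = false;
  }
}

// Hypothetical stand-in for the AI section-generation workflow.
async function generateSection(section: string): Promise<void> {
  console.log(`Generating section ${section}...`);
  await new Promise((resolve) => setTimeout(resolve, 1000));
}

// Clicking generate on A, B, and C runs them strictly in order.
const queue = new SequentialQueue();
["A", "B", "C"].forEach((s) => queue.enqueue(() => generateSection(s)));
```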
Does Bubble provide both the workflow ID and status of the workflow as a value, or did you need to just capture the workflow ID and create your own status? I'm wondering if Bubble is providing some kind of status value similar to a progress percentage.
@georgecollier you can do something similar to what you are thinking and what @redvivi said. I often use the term Processor for the data type I use to track backend workflows. For this use case you can have three fields on your Queue: a list field of the backend workflows that are 'to run', another that is 'completed', plus a third for 'total workflows'. 'Total workflows' counts all workflows triggered, 'to run' holds only those remaining, and 'completed' holds all that are finished, so you can compare against 'total workflows' to ensure all were completed, while 'to run' allows a user to continuously add more as they wish.
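Put as (illustrative, non-Bubble) TypeScript, the Processor record and its three fields might look like this - the field names are just my labels, not anything Bubble provides:

```typescript
// Sketch of the Processor record described above.
interface Processor {
  toRun: string[];     // workflow IDs still remaining to run
  completed: string[]; // workflow IDs that have finished
  total: number;       // every workflow ever triggered on this queue
}

// A user can keep adding generations to the queue at any time.
function enqueueWorkflow(p: Processor, workflowId: string): void {
  p.toRun.push(workflowId);
  p.total += 1;
}

// The backend marks an iteration done; returns true when everything
// triggered so far has completed (completed count matches total).
function markDone(p: Processor, workflowId: string): boolean {
  p.toRun = p.toRun.filter((id) => id !== workflowId);
  p.completed.push(workflowId);
  return p.completed.length === p.total;
}
```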
That's not entirely true. While the recursive workflows you're thinking of are more WU-efficient because the iteration count is statically defined, you can have recursive workflows dynamically determine whether a next iteration is needed or not. For small lists, the WU difference is low and can be outweighed by the benefit of smarter recursion. This is exactly what you are talking about here:
The specific setup for dynamic recursion depends on your schema but mainly on how much you are allowing users to go berserk on queue management.
If you only want to let users add to a simple queue a few minutes later, then you don't need anything too fancy - you don't even need a new data type; you could just iterate through a list on the datathing.
This would allow users to add to the queue (add to the list) while the recursive workflow is iterating, as long as you also implement error handling that reschedules the same iteration (or deletes the queue) if something goes wrong. You also need a way to determine whether a recursion is already active on a particular datathing, so you know whether to initiate the recursive workflow or just add to the queue.
Lists are ordered and simple, but if you use them, the API workflow itself should not update the list, or you risk running into race conditions with the backend and frontend modifying the list simultaneously. I would personally save the total iteration number on the datathing and avoid removing completed iterations from the queue. You can then use iteration = queue:count to determine whether the recursion has completed. You could always reset both of these at a later date when the chances of the user being online and active are very low.
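In pseudo-code terms (TypeScript purely for illustration, with `scheduleWorkflow` standing in for Bubble's Schedule API Workflow action), the dynamic recursion looks something like this:

```typescript
// The worker processes one queue entry per run and reschedules itself only
// while the (possibly grown) queue still has unprocessed entries.
interface DataThing {
  queue: string[];   // the frontend appends here; the worker never edits it
  iteration: number; // only the worker increments this
}

async function recursiveWorker(thing: DataThing): Promise<void> {
  const item = thing.queue[thing.iteration];
  try {
    await processItem(item);
    thing.iteration += 1; // advance only on success
  } catch (err) {
    console.error("Iteration failed; rescheduling the same one", err);
  }
  // iteration === queue:count means the recursion has fully drained,
  // even if the user appended items after the recursion started.
  if (thing.iteration < thing.queue.length) {
    scheduleWorkflow(() => recursiveWorker(thing));
  }
}

// The frontend only initiates the recursion if it isn't already active.
function addToQueue(thing: DataThing, item: string): void {
  const wasIdle = thing.iteration === thing.queue.length;
  thing.queue.push(item);
  if (wasIdle) scheduleWorkflow(() => recursiveWorker(thing));
}

// Hypothetical stand-ins for the real API workflow and scheduler.
async function processItem(item: string): Promise<void> {
  console.log(`Processing ${item}...`);
}
function scheduleWorkflow(fn: () => Promise<void>): void {
  setTimeout(fn, 0);
}
```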
If you want a more complex queue, with status reporting, multiple dependencies, etc., then using a new data type with individual entries for each queue item would be the best solution.
Edit: I was assuming you're saving the response of the APIs to the database. But I now realise you might not be. In this case you could use Local storage to save which sections have been sent to be generated, and determine from there whether a section can be sent for generation or whether it has to wait for its dependencies' responses.
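A rough browser-side sketch of that idea - the storage key and dependency check are made up for illustration:

```typescript
// Track which sections have already been sent for generation in Local
// storage, and gate new generations on their dependencies.
const KEY = "sectionsSentForGeneration"; // hypothetical key name

function sentSections(): string[] {
  return JSON.parse(localStorage.getItem(KEY) ?? "[]");
}

function markSent(section: string): void {
  localStorage.setItem(KEY, JSON.stringify([...sentSections(), section]));
}

// A section can be sent once everything it depends on has been sent
// (and it hasn't already been sent itself).
function canGenerate(section: string, dependsOn: string[]): boolean {
  const sent = new Set(sentSections());
  return !sent.has(section) && dependsOn.every((dep) => sent.has(dep));
}
```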
I am! Thanks for the ideas guys, good to know I'm on the right track, and there are lots of good points I can consider when implementing. Will report back with any pros and cons once done.
I have a trigger set up so that when a File's Parent File is changed, various other lists are updated and hierarchies are recalculated.
I have been adding a bulk move feature. That means that when I make changes to a list of Files to change their Parent Files (i.e. the folders they're in), all the triggers run virtually simultaneously, and some lists break due to race conditions.
So, I had to solve: how can I slow down the rate at which my DB trigger runs so that the actions don't encounter race conditions?
High-level overview: have a Queue data type and a Queue Job option set. The Queue Job option set contains the amount of time that should be left between workflows of that type. In the trigger, schedule an API workflow to run at Current date/time + (number of pending jobs * time to allow per job).
A Queue Job is created when a trigger runs, before a workflow is scheduled. Creating a Queue Job means 'I want to do something ASAP, but for whatever reason, it needs to be slightly spaced out'.
Schedule an API workflow to run at a certain point in the future based on the number of pending (complete = no) items in the queue.
That schedule date expression gets a count of the number of pending queue items, then multiplies that count by an option set attribute that says how many seconds we should allow per workflow of this type (I set 2 seconds for this option).
If I set the queueTimeSeconds in my option set to 5 seconds and there are 10 pending queue items, it'll schedule the workflow to run in 50 seconds.
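The whole expression is just arithmetic, so as a sanity check, here it is written out (TypeScript for illustration only; `queueTimeSeconds` would come from the option set):

```typescript
// Schedule date = now + (pending jobs * seconds allowed per job type).
function scheduleDate(pendingJobs: number, queueTimeSeconds: number): Date {
  return new Date(Date.now() + pendingJobs * queueTimeSeconds * 1000);
}

// 10 pending jobs at 5 seconds each -> runs 50 seconds from now.
console.log(scheduleDate(10, 5));
```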
This isn't foolproof - if the workflows don't run on time for some reason, you could run into some issues, but with the WU plans, even scheduling loads of WFs still means everything runs pretty much on time.
A modified approach is closer to what @boston85719 suggested, where you schedule a workflow if and only if there are no pending jobs. Using this method, the relevant backend workflow needs to schedule the next job once it's complete. So, going from nothing in the queue to adding 5 things, it would run like this (see the sketch after this list):
Schedule first API workflow (4 items remaining in queue)
First API workflow schedules the second one once it's done (3 items remaining in queue), and so on recursively until there's nothing left in the queue.
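For clarity, the modified approach in sketch form (again illustrative TypeScript, not Bubble actions; `runJob` is a made-up stand-in for the backend workflow):

```typescript
// A workflow is kicked off only when nothing was pending; each run marks
// its job complete and schedules the next pending one when it finishes.
interface QueueJob {
  id: string;
  complete: boolean;
}

const jobs: QueueJob[] = [];

function addJob(id: string): void {
  const hadPending = jobs.some((j) => !j.complete);
  jobs.push({ id, complete: false });
  if (!hadPending) void runNext(); // only schedule if the queue was empty
}

async function runNext(): Promise<void> {
  const next = jobs.find((j) => !j.complete);
  if (!next) return; // queue drained
  await runJob(next.id);
  next.complete = true;
  void runNext(); // the completed workflow schedules the next one
}

// Hypothetical stand-in for the real backend workflow.
async function runJob(id: string): Promise<void> {
  console.log(`Running job ${id}...`);
}
```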
Hope this helps.
Also, nobody pick on me for having both completed (yes/no) and completedDate (date) instead of just having completedDate and knowing that if completedDate is not empty it must also be completed. It's 1am
@hi_bubble here is a link to a forum post that indicates an approach when using the schedule backend workflow on a list function.
You can find other posts in that thread discussing the approach
The approach outlined in the posts allows you to run an action only after all items in the list have completed their processing, so you wouldn't need a recursive backend workflow costing lots of WUs to schedule itself, nor the cost of carrying over the parameter (a list of things) on each run of the recursive backend workflow. When used with schedule backend workflow on a list, it also doesn't require additional data types or scheduling backend workflows for each item in the list; instead, a single backend workflow is scheduled once all items have completed.
The posts linked are a discussion on how to avoid race conditions when using the schedule backend workflow on a list, while avoiding the WU costs of recursive backend workflows.
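I haven't reproduced the linked posts here, but one common shape for "run an action only after every item has finished" is a counter check, where the last item to finish is the one that schedules the single follow-up workflow - a very rough sketch, not the exact setup from the thread:

```typescript
// Each item's workflow increments a completed counter; only the run that
// brings it up to the list total schedules the single follow-up workflow.
interface ListRun {
  total: number;     // number of items the list was scheduled with
  completed: number; // incremented as each item finishes
}

async function onItemFinished(
  run: ListRun,
  followUp: () => Promise<void>
): Promise<void> {
  run.completed += 1;
  if (run.completed === run.total) {
    await followUp(); // fires exactly once, after the final item
  }
}
```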
@georgecollier similar to both your approach and @boston85719's, I have a data type queueWorker with 1) Queue Job Option and 2) 'is On?' (yes/no), and that's it.
All the DB trigger has to do is check whether the relevant queueWorker is on or not. There is only one QW per Queue Job Option (maybe also per user's Company), so there are no records to create, etc.
If it's on → do nothing (the QW will find this record to process).
If it's off → run a custom event (CE) to turn it on, passing the record that triggered the DB trigger (helpful because there is often only one record to modify, so it cuts down on searching).
A big advantage of this approach is the ability to stop all QWs that the current QW is a prereq for (the first step of the QW checks whether any of the QWs that use the current QW as a prereq are on).
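In sketch form (illustrative TypeScript; the names are mine, not Bubble's):

```typescript
// One queueWorker per Queue Job Option, with a single on/off flag.
interface QueueWorker {
  jobType: string; // the Queue Job Option this worker handles
  isOn: boolean;
}

// All the DB trigger does: either nothing (the running worker will find the
// new record itself) or turn the worker on with the triggering record.
function onDbTrigger(worker: QueueWorker, startWorker: () => void): void {
  if (worker.isOn) return;
  worker.isOn = true;
  startWorker(); // custom event kicking off the QW
}
```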