A while back I posted about how to build a multi-step AI agent in Bubble. That’s useful, but what’s more useful, I think, is how you can land on that solution.
It’s about how to think when you hit a complex problem in Bubble – especially when Bubble doesn’t quite give you what you want. This is a skill we look for in our team, and also something that’s hard to teach and learn.
I’ll use a multi-step AI agent as the running example, but the goal of this post is to show a repeatable way of reasoning about complex Bubble problems:
- Start from non‑negotiable requirements.
- Translate them into hard constraints in Bubble.
- Let those constraints define the “ground rules” of your solution.
- Make early decisions that keep things modular and easy to extend later.
1. Start from non‑negotiable requirements
Ok, we’re going to build a multi-step AI agent! For our agent, let’s say the functional requirements are:
- The logic is reliable (doesn’t silently break).
- It can be run in the background (no dependence on a user’s browser).
- It’s modular (each part can be changed without touching everything else).
- It’s maintainable (future-you or teammates can reason about it).
Most people jump straight to “which plugin / which API call / how do I loop?”
Instead: treat those four bullets as hard constraints and ask:
“Given these constraints, what can’t I do in Bubble?”
Answering that question narrows the solution space very quickly.
2. Turn requirements into Bubble constraints
Let’s translate the above into concrete Bubble “rules” that shape the architecture.
a. “Can be run in the background” → must be a backend solution
If the agent should work when:
- the tab is closed
- the workflow might take minutes
…then front-end workflows are disqualified.
That immediately gives us our first ground rule:
Rule 1: The agent loop must live in backend workflows, not the page.
This is also important for maintainability. Most Bubble AI agents are crippled by relying on front-end plugins. How are they going to implement background AI agents that can be scheduled, or run in parallel? They can’t.
b. “Multi-step” and agent-controlled conversation length → must be a loop
A multi-step agent:
- decides when it’s done
- at each step, can either:
  - return a final answer, or
  - call a tool and then continue

That’s logically a loop:
1. Call AI with conversation context + tools.
2. If it’s done → stop.
3. If it wants a tool → run tool → go back to step 1.
In Bubble, a “loop” in the backend usually means:
Rule 2: Use a recursive backend workflow (a workflow that re-schedules itself) for the agent.
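Outside Bubble, the same loop can be sketched in a few lines of Python. This is only a minimal sketch of the control flow that the recursive backend workflow implements: `call_model` and `run_tool` are hypothetical stand-ins (stubbed here so the loop itself is runnable), and the recursive call plays the role of “Schedule API Workflow” on itself.

```python
# Minimal sketch of the agent loop that Rule 2 maps onto a recursive
# backend workflow. call_model and run_tool are stand-ins for your AI
# provider and your tool logic; both are stubbed for illustration.

def call_model(messages, tools):
    # Stub: if the last message is a tool result, return a final answer;
    # otherwise request the (hypothetical) "lookup" tool.
    if messages and messages[-1]["role"] == "tool":
        return {"type": "final", "content": "done"}
    return {"type": "tool_call", "tool": "lookup", "args": "{}"}

def run_tool(name, args):
    return f"result of {name}"

def agent_step(messages, tools):
    """One iteration — in Bubble, one run of the recursive workflow."""
    reply = call_model(messages, tools)
    if reply["type"] == "final":
        messages.append({"role": "assistant", "content": reply["content"]})
        return messages  # stop: don't re-schedule
    result = run_tool(reply["tool"], reply["args"])
    messages.append({"role": "tool", "content": result})
    # Recursing here is the analogue of re-scheduling the workflow on itself.
    return agent_step(messages, tools)

messages = agent_step([{"role": "user", "content": "hi"}], tools=["lookup"])
```

The point isn’t the Python; it’s that the Bubble workflow has exactly this shape: one model call, one stop/continue decision, one optional tool call, then re-schedule.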
Now we know we’re building:
- a backend workflow that:
  - calls the model
  - decides whether to stop or re-schedule itself
- and some way to store the Scheduled workflow ID so we can cancel it (more on that later).
c. “All messages belong to a conversation” → data model
You’ll often feel stuck until you pin down the data model implied by the behavior.
Here, the behavior is:
- There is a conversation (thread).
- That conversation has messages (user, assistant, tools, etc.).
- The agent’s logic always runs within the context of a conversation.
So:
Rule 3: We need at least two core types:
- Conversation
- Message (linked to a Conversation)
Everything else (tool calls, models, etc.) can be layered on later, but these two are non‑negotiable.
This thought pattern generalizes:
“What are the persistent nouns my behavior keeps referring to? What kinds of things (ha) am I dealing with?”
→ That’s your data model.
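To make the two core types concrete, here is a hypothetical Python mirror of the Bubble data model from Rule 3. The field names are illustrative (Bubble types don’t look like this), but the shape — a Conversation, and Messages that always link back to one — is the non‑negotiable part:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mirror of the two core Bubble data types (Rule 3).
# Field names are illustrative, not Bubble's.

@dataclass
class Conversation:
    title: str = ""
    status: str = "idle"                         # e.g. running / stopped
    scheduled_workflow_id: Optional[str] = None  # saved so we can cancel later

@dataclass
class Message:
    conversation: Conversation  # every message belongs to a thread
    role: str                   # "user", "assistant", "tool"
    content: str = ""
    is_generating: bool = False # drives the "thinking" bubble in the UI

conv = Conversation(title="Support chat")
msg = Message(conversation=conv, role="user", content="hi")
```

Everything later in this post (loading bubbles, cancellation, async tools) hangs extra fields off these two nouns rather than inventing new structures.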
3. Make tools extensible and modular, not rigid
We don’t just want tools that work once; we want tools that:
- can be filtered by context,
- are easy to add without rewriting a giant workflow,
- and are optional based on conversation state.
If we just hard-code tools in a single massive workflow with tons of conditions, we’re locking ourselves into pain later.
Instead:
Rule 4: Represent tools as an option set (or similar configuration), not a pile of “Only when this tool name” conditionals scattered everywhere.
For example, an AI Tool option set:
- name (what the AI calls it)
- schema (JSON tool definition)
- isEnabledInContextX flags, or:
- category, requires_auth, etc.
This lets you:
- filter tools passed to the model based on conversation properties, user roles, etc., by passing the tool schema in the API call to your AI provider
- add a new tool by:
  - adding one option
  - adding one corresponding custom event

No existing tool logic needs to be touched.
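As a sketch of what this configuration-driven filtering looks like, here is a hypothetical Python mirror of the AI Tool option set. The tool names, flags, and OpenAI-style schemas are made up for illustration; the point is that the filter is one expression over configuration, not a pile of conditionals:

```python
# Hypothetical mirror of the AI Tool option set: each entry carries the
# name the model calls, a JSON schema (OpenAI function-calling style
# shown here as an example), and flags used to filter per request.
TOOLS = [
    {
        "name": "query_knowledgebase",
        "requires_auth": False,
        "schema": {"type": "function",
                   "function": {"name": "query_knowledgebase",
                                "parameters": {"type": "object",
                                               "properties": {"query": {"type": "string"}}}}},
    },
    {
        "name": "send_email",
        "requires_auth": True,
        "schema": {"type": "function",
                   "function": {"name": "send_email",
                                "parameters": {"type": "object",
                                               "properties": {"to": {"type": "string"}}}}},
    },
]

def tools_for_request(user_is_authenticated: bool) -> list:
    """Return only the schemas this request may pass to the AI provider."""
    return [t["schema"] for t in TOOLS
            if user_is_authenticated or not t["requires_auth"]]
```

In Bubble, the equivalent is a filtered search over the option set whose result is joined into the API call body — adding a tool means adding an option, not editing the filter.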
4. Isolate tool execution logic into modular “functions”
When the AI chooses a tool, what actually happens?
Conceptually, each tool execution is:
- an isolated function:
  - input: arguments from the AI (+ maybe conversation/message context)
  - logic: does something, or reads something
  - output: a result (string / object) to feed back into the AI loop

This pushes us to:
Rule 5: Each tool’s logic should live in its own custom event / backend workflow.
So we get something like:
- A router workflow: “Use Tool”
- For each tool, a dedicated custom event:
  - Run Query Knowledgebase
  - Send Email
  - Create Ticket
  - etc.
Why this matters:
- You can reason about each tool in isolation.
- Bugs in one tool don’t contaminate others.
- Adding or editing a tool doesn’t require editing some fragile 200-step workflow.
Again, this is a general design principle:
Push variability into small, composable units instead of large, long workflows.
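The router-plus-small-functions pattern can be sketched as a dispatch table. This is only an analogy for the Bubble setup — the handler names are illustrative, and in Bubble each handler is a custom event rather than a function:

```python
# The "Use Tool" router as a dispatch table: one small function per tool,
# mirroring one custom event per tool in Bubble. Handlers are illustrative.
def query_knowledgebase(args: dict) -> str:
    return f"kb hits for {args['query']}"

def send_email(args: dict) -> str:
    return f"sent to {args['to']}"

TOOL_HANDLERS = {
    "query_knowledgebase": query_knowledgebase,
    "send_email": send_email,
}

def use_tool(name: str, args: dict) -> str:
    # One lookup instead of sprawling "Only when tool name is ..." conditions.
    handler = TOOL_HANDLERS[name]
    return handler(args)
```

Adding a tool is one new entry in the table and one new function — the router itself never changes, which is exactly the property we want from the Bubble version.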
We end up with this:
5. Work with Bubble’s limitations (example: JSON)
Bubble doesn’t have a native “string → JSON object” feature in workflows.
For tools, that matters because:
- The AI returns tool arguments as a JSON string.
- Each tool has a different schema.
Therefore:
Rule 6: Accept that Bubble can’t parse tool-argument JSON natively, and design for it.
One clean approach:
- For each tool, define a dedicated API Connector call to your own backend:
  - input: raw JSON string of arguments
  - output: parsed JSON in a stable shape Bubble can work with (the result of step X’s {api call result})
The general pattern here is to find the hard limitation (here: JSON parsing) and accept it early.
Wrap it in a consistent abstraction (here: per-tool API call) instead of hacking around it all the time.
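The backend side of that per-tool call can be tiny. Here is a minimal sketch, assuming a hypothetical `send_email` tool: take the raw argument string the AI returned, parse it, and hand back a flat, stable shape that Bubble’s API Connector can read as typed fields.

```python
import json

# Sketch of the tiny backend endpoint a per-tool API Connector call hits:
# parse the AI's raw JSON argument string into a stable, flat shape.
# The field names are illustrative, matching a hypothetical send_email tool.
def parse_send_email_args(raw: str) -> dict:
    args = json.loads(raw)
    return {
        "to": args.get("to", ""),        # defaults keep the shape stable
        "subject": args.get("subject", ""),
        "body": args.get("body", ""),
    }
```

Because every tool gets its own call with a fixed response shape, the API Connector can be initialized once per tool and the workflow never has to guess at the structure.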
![image5]
![image6]
6. Design the UX behavior into the architecture
Another requirement:
“We need to show a loading/‘thinking’ message before the reply is generated.”
If you bolt that on later, you’ll end up with confusing conditions everywhere.
It also needs to show quickly: we don’t want the user’s message to sit for a few seconds before appearing in the UI. Therefore, message creation must happen in the front-end, and then we schedule the agentic loop. (Note that we can also make our kick-off logic support message creation both in the front-end and inside the agentic loop, for background agents.)
Therefore:
Rule 7: Create the user and assistant messages together upfront in the workflow, in the front-end.
So the sequence becomes:
1. User sends a message.
2. Create the user message and the assistant message (isGenerating = yes).
3. The front-end immediately has a “loading” assistant bubble to display in the last repeating group cell for this conversation’s messages.
4. Once the AI responds or the tool completes:
   - fill in the assistant message content
   - set isGenerating = no
Now the UX requirement is a first-class part of the architecture, not an afterthought.
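The sequence above can be sketched as two tiny steps, using the same hypothetical `is_generating` flag. This is a simulation of the Bubble workflows, not real Bubble actions:

```python
# Sketch of the front-end "send" workflow (Rule 7): both messages exist
# before anything slow happens, so the UI renders instantly.
def send_user_message(messages: list, text: str) -> dict:
    messages.append({"role": "user", "content": text, "is_generating": False})
    assistant = {"role": "assistant", "content": "", "is_generating": True}
    messages.append(assistant)  # empty shell → shows the "thinking" bubble
    # next action in Bubble: Schedule API Workflow for the agent loop
    return assistant

def on_ai_response(assistant: dict, content: str) -> None:
    """Backend fills in the shell once the AI (or a tool) finishes."""
    assistant["content"] = content
    assistant["is_generating"] = False  # bubble flips from loading to text

messages = []
shell = send_user_message(messages, "hi")
on_ai_response(shell, "Hello!")
```

Because the assistant message exists from the first instant, the repeating group never has to special-case “waiting for a reply” — it just renders whatever messages exist.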
7. Control and cancellation: don’t forget lifecycle
We also want:
“We need to be able to stop the agent.”
That forces another design decision most people forget at first:
Rule 8: Store the Scheduled workflow ID for the recursive agent in the database (e.g. on Conversation).
Why?
- Each time the agent re-schedules itself, you save that scheduled ID.
- A “Stop agent” action can then:
  - cancel the scheduled workflow using that ID,
  - clear or flag the conversation as “stopped.”
Again, the thinking pattern is that every long-running process should have:
- a way to start
- a way to observe state
- a way to stop
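Here is a minimal simulation of that lifecycle, with a toy scheduler standing in for Bubble’s “Schedule API Workflow” (which returns an ID) and “Cancel a scheduled API Workflow”:

```python
# Lifecycle sketch: the loop stores its latest scheduled-workflow ID on
# the conversation, so "stop" is one cancel plus one status flag.
class Scheduler:
    """Toy stand-in for Bubble's scheduler; holds pending workflows by ID."""
    def __init__(self):
        self.pending = {}
        self.next_id = 0

    def schedule(self, fn):
        self.next_id += 1
        self.pending[self.next_id] = fn
        return self.next_id  # Bubble's Schedule API Workflow also returns an ID

    def cancel(self, wf_id):
        self.pending.pop(wf_id, None)

scheduler = Scheduler()
conversation = {"scheduled_id": None, "status": "running"}

def reschedule_agent(conversation):
    # Each loop iteration saves its new scheduled ID onto the Conversation.
    conversation["scheduled_id"] = scheduler.schedule(lambda: None)

def stop_agent(conversation):
    if conversation["scheduled_id"] is not None:
        scheduler.cancel(conversation["scheduled_id"])
    conversation["status"] = "stopped"

reschedule_agent(conversation)
stop_agent(conversation)
```

The field name `scheduled_id` is illustrative; what matters is that the ID is persisted somewhere queryable (e.g. on Conversation) rather than lost inside the workflow run.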
8. Future change: async tools with user input – was our design good?
Now, suppose a client later asks:
“Sometimes the agent should ask the user follow-up questions and wait for the answers before continuing.”
This is a classic test of your architecture.
Do we need to shove more conditions into our existing agent loop? Rewrite half the logic?
If we’ve followed the rules above, the change is surprisingly straightforward.
We can:
- Add a field/attribute to the AI Tool option set: isAsynchronous (yes/no)
- For a tool like “ask_structured_questions”, mark isAsynchronous = yes
- In the tool router, when a tool is async:
  - create the questions as messages / UI elements
  - do not immediately reschedule the agent loop
- When the user finishes answering and clicks “Complete”:
  - we treat that as the “tool result” being ready
  - then re‑kick the agent loop with updated context
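The async branch amounts to two small changes to the router sketch, assuming a hypothetical `is_asynchronous` flag on each tool option:

```python
# Async-tool branch in the router: when a tool is marked asynchronous,
# park the loop instead of re-scheduling; the "Complete" click resumes it.
# Flag and status names are illustrative.
def route_tool(tool: dict, conversation: dict):
    if tool.get("is_asynchronous"):
        conversation["status"] = "waiting_for_user"  # questions shown as messages
        return None                                  # crucially: no reschedule here
    return "reschedule"

def on_user_complete(conversation: dict, answers: str):
    """The 'Complete' click: answers become the tool result, loop re-kicks."""
    conversation["status"] = "running"
    return ("reschedule", answers)

conv = {"status": "running"}
decision = route_tool({"name": "ask_structured_questions",
                       "is_asynchronous": True}, conv)
resumed = on_user_complete(conv, "user answers")
```

The loop body, the tool functions, and the data model are untouched; only the router grew one branch and the tool configuration grew one flag.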
Note what we didn’t have to change:
- the core recursive loop structure
- the notion that each tool is a discrete “function”
- the message/conversation data model
All because we made earlier decisions that:
- centralized tool configuration (option set),
- isolated tool logic (per-tool custom events),
- and respected Bubble’s backend + scheduling model.
This is the payoff of designing with constraints + modularity in mind.
9. A mental checklist you can reuse for other problems
You can apply this thinking style to almost any non-trivial Bubble feature.
When you’re stuck, walk through:
- Requirements. Are there any hard requirements that will constrain how you do something in Bubble?
- Modularity and variation. Is there any similar logic that should be consolidated? What might grow in number over time (tools, actions, roles)?
- Platform limits. Does Bubble make something hard? If it does, how can I simplify it or make it a consistent pattern so that it’s not a pain in the ass every time?
The multi-step AI agent is just one example, but the same thought process works for any complex Bubble problem.
And, courtesy of Nano Banana: