LLM streaming in repeating group

Hi,

Has anyone figured out how to implement LLM streaming in a repeating group (RG) for a chatbot use case?

I’d like to display the partial response in the last message as it streams in, and then replace it with the final version saved in the DB once it’s completed. But I haven’t found the right way to do it…

Thanks for any advice!

Michal

When streaming was first released, my approach was to stream the response directly to the page, then once the stream finished → hide that and display the same message from the RG or database. It works, but as you mentioned it’s not ideal…

The core issue seems to be that the data is streamed over a live connection from an external API (like OpenAI, which uses server-sent events), whereas you’re trying to stream it from an RG. RGs don’t poll data continuously in the same way, which complicates things.

Also, I imagine the streamed data is saved only after the full response is received (otherwise it would be constantly writing to the database, and WU +++🚀). If that’s the case, then there’s technically nothing to stream from the RG during the process; the data only becomes available once the stream is complete.
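
For anyone curious what that pattern looks like outside of Bubble, here’s a minimal sketch in plain TypeScript against OpenAI’s streaming chat endpoint: render every chunk as it arrives, but write to the database exactly once, after the stream completes. `render` and `saveMessage` are hypothetical stand-ins for the on-page update and the single DB save, and the model name is just a placeholder.

```typescript
// Hypothetical stand-ins for the on-page update and the single DB write.
declare function render(partialText: string): void;
declare function saveMessage(fullText: string): Promise<void>;

async function streamCompletion(prompt: string): Promise<void> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // placeholder; any chat model works
      stream: true,
      messages: [{ role: "user", content: prompt }],
    }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let fullText = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each SSE line looks like: data: {"choices":[{"delta":{"content":"…"}}]}
    // (A robust parser would buffer lines split across chunks; skipped here.)
    for (const line of decoder.decode(value, { stream: true }).split("\n")) {
      if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
      const delta = JSON.parse(line.slice(6)).choices[0]?.delta?.content ?? "";
      fullText += delta;
      render(fullText); // update the visible element with the partial text
    }
  }

  await saveMessage(fullText); // exactly one DB write, after the stream ends
}
```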

Hopefully someone else has figured out a cleaner approach to this.


This is how I approached it in my workflow after the user hits submit on the chat. The first thing you need is a group on the page to act as the “Stream Handler”, with a data type of “Text Stream”.

  1. "Create a new message" or however you’re populating the RG (You might need additional workflows here to populate the RG with this message, depends on how your DB is set up)
  2. I call the OpenAI API
  3. “Display data in a group/popup” with the “Stream Handler” as the group populating it with Result of Step 2's current Stream (this makes it so that the stream handler always contains the current stream of text)
  4. Make a change to a Message (Step 1) with Result of Step 2's completed Stream (What this does is basically write to the DB with the completed stream)

Now all the data streamed back will populate that element, which we can read from in our RG.

Finally, within the conditions for the last cell in the RG, we can conditionally populate its contents based on whether the “Stream Handler’s Stream is streaming” or the “Stream Handler’s Stream is not streaming”.

If it is streaming, we choose the “Stream Handler” as the data source; otherwise we choose the “Current cell’s message”, or however you set up your message objects.
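
If it helps to see that condition as plain code, here’s a rough TypeScript sketch of the last-cell logic; `StreamHandler` and `Message` just mirror the Bubble element and thing, and the field names are assumptions.

```typescript
// Mirrors of the Bubble pieces; names and fields are assumptions.
interface StreamHandler {
  isStreaming: boolean;  // Bubble's "Stream is streaming" state
  currentStream: string; // the text received so far
}

interface Message {
  text: string; // the final text saved to the DB in step 4
}

// The last cell's data source: live stream while streaming, saved text after.
function lastCellText(handler: StreamHandler, message: Message): string {
  return handler.isStreaming ? handler.currentStream : message.text;
}
```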


Thank you so much!

For those who aren’t great at English (like me), here is the guide rewritten by ChatGPT:

  • User clicks “Send”
    As soon as the user submits their question or prompt, you kick off two things at once:
  1. You create a new message record in your database (so you have a spot to save the final answer later).
  2. You start calling the OpenAI API with streaming turned on.
  • Set up your “Stream Handler”
    On your page, add a hidden or off-screen group (let’s call it the Stream Handler). Give it a custom data type called Text Stream. Its job is simply to hold whatever bits of text the API spits out as they arrive.
  • Populate the Repeating Group (RG)
    You probably already have a Repeating Group showing your chat history. Make sure that when you “create a new message” in step 1, that blank message appears immediately in the RG (even though its text is still empty).
  • Feed partial text into the Stream Handler
    As each little chunk of text comes back from the API:
  • Use a workflow action like “Display data in a group”, targeting your Stream Handler.
  • Set its data to Result of Step 2’s Current Stream.
    This means the Stream Handler always holds exactly what’s arrived so far—and nothing more.
  • Show the partial response on-screen
    Inside your last RG cell (the one for the new message), put a text element whose data source is conditional:
  • If the Stream Handler’s “is streaming” flag is TRUE, show the Stream Handler’s text.
  • Otherwise, show that message’s saved text (from the database).
  • Once streaming completes, save the full answer
    When the API finishes:
  • In your workflow, do “Make a change to a thing”, targeting the message you created in step 1.
  • Set its text field to Result of Step 2’s Completed Stream (i.e. the entire answer).
    Now the full answer is safely stored in your database.
  • Switch from streaming to saved text
    After you save the full answer, the Stream Handler’s “is streaming” flag turns OFF. That automatically flips your RG cell’s condition: it stops showing the temporary stream text and instead shows the final, saved message text.
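
To tie the whole sequence together, here’s the same flow as one hedged TypeScript sketch; `createMessage`, `openAiStream`, and `updateMessage` are hypothetical glue functions, not Bubble actions.

```typescript
// Hypothetical helpers standing in for the Bubble actions described above.
declare function createMessage(): Promise<{ id: string }>;               // step 1
declare function openAiStream(prompt: string): AsyncIterable<string>;    // step 2 deltas
declare function updateMessage(id: string, text: string): Promise<void>; // final save

// The hidden "Stream Handler" group.
const streamHandler = { isStreaming: false, text: "" };

async function onSend(prompt: string): Promise<void> {
  const message = await createMessage(); // blank message appears in the RG
  streamHandler.isStreaming = true;
  streamHandler.text = "";

  for await (const delta of openAiStream(prompt)) {
    streamHandler.text += delta; // "Current Stream": everything so far
  }

  await updateMessage(message.id, streamHandler.text); // "Completed Stream" to DB
  streamHandler.isStreaming = false; // flips the RG cell back to the saved text
}
```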

In case it’s helpful, here is my approach:

  1. API Configuration

    • Set up your API connector and initialize it to integrate with OpenAI or a similar AI service. Ensure you format the input text as JSON-safe to avoid syntax errors (see the escaping sketch after this list).
  2. Workflow Setup

    • Create a workflow for user interaction that will handle sending messages and receiving AI responses:
      • Initiate Message: On a user action (e.g., clicking send), create a user message in the database.
      • Create Assistant Response: Create an empty assistant response in the database with a yes/no field for tracking if streaming is in progress.
  3. Send Request

    • Send the formatted request to OpenAI with the user messages included. Use the API connector to handle this.
  4. Temporary Text Storage Group

    • Set up a temporary group (you can name it text_stream) on the page to hold the incoming text from the AI. This temporary group will display the stream before saving it to the database.
  5. Display Streamed Data

    • In the workflow:
      • Assign incoming stream data to the temporary group.
      • Continuously update this group as data streams in.
  6. Save Streaming Data

    • At the end of the streaming:
      • Save the full text to the database.
      • Update the streaming field to “no”.
  7. Conditional Display Logic

    • Set up a repeating group to show messages:
      • Default state: Show text saved in the database.
      • Streaming state: When streaming is “yes”, show text from the temporary group.
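
On the JSON-safe point from step 1: raw user text pasted into a JSON body breaks on quotes and newlines. Here’s a small TypeScript illustration of the escaping, which is roughly what Bubble’s `:formatted as JSON-safe` operator handles for you (the model name is just a placeholder):

```typescript
// JSON.stringify escapes quotes, backslashes, and newlines in the user text,
// keeping the request body valid JSON.
function buildRequestBody(userText: string): string {
  return JSON.stringify({
    model: "gpt-4o-mini", // placeholder model
    stream: true,
    messages: [{ role: "user", content: userText }],
  });
}

// A prompt containing a quote and a newline still produces valid JSON:
console.log(buildRequestBody('He said "hi"\nthen left'));
```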

Thank you everyone, this sheds some light on handling the streaming response.

I’m curious if anyone has tried to implement the streaming using backend workflows. Is this even possible?

I believe this is covered in Bubble’s docs on streaming. I’m not sure how useful it would be, though: streaming doesn’t write each character to the database in turn, so based on the docs I think a stream handled in a backend workflow would appear to the user as non-streamed content.
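
As a rough illustration of why (under the assumption that the backend only writes once): the server can consume the stream chunk by chunk, but if nothing reaches the page until the final database write, the user just sees the finished text appear at once. `openAiStream` and `updateMessage` are the same kind of hypothetical helpers as in the sketches above.

```typescript
// Hypothetical helpers; the point is where the chunks go, not the API shape.
declare function openAiStream(prompt: string): AsyncIterable<string>;
declare function updateMessage(id: string, text: string): Promise<void>;

async function backendWorkflow(messageId: string, prompt: string): Promise<void> {
  let fullText = "";
  for await (const delta of openAiStream(prompt)) {
    fullText += delta; // chunks arrive server-side; the browser never sees them
  }
  await updateMessage(messageId, fullText); // one write; the RG updates once
}
```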

I have a further question regarding this. I got the streaming and everything working, but when the “Display data” action runs in the workflow, the whole repeating group scrolls to the top. For a chat this is really annoying. My workflow is a bit messy, but it works in this order:

  1. Make a new message
  2. Scroll to the last message (otherwise the user always has to scroll; if you have a better solution for this, I’d appreciate it)
  3. Create embeddings of the message
  4. Search Pinecone for matching data
  5. Set streaming to “yes”, so my message doesn’t start streaming too early and show old stream data
  6. Request ChatGPT, including old messages for history
  7. Display data (here the RG scrolls to the top for some reason)
  8. Convert Markdown to BB-Code so it looks pretty (I can’t get ChatGPT to write BB-Code itself; it just ignores me for some reason; see the conversion sketch after this list)
  9. Make changes to the message from step 1: set streaming to “no” and use the BB-Code answer
  10. Create another message containing ChatGPT’s response, which is hidden but important for the history
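
On the step-8 conversion, here’s a rough TypeScript sketch of Markdown → BB-Code using a few regex passes. Real Markdown needs a proper parser; this only covers bold, italics, and inline code as an illustration.

```typescript
// A few regex passes turning common Markdown into BB-Code.
// Bold runs first so its double asterisks aren't eaten by the italic rule.
function markdownToBBCode(md: string): string {
  return md
    .replace(/\*\*(.+?)\*\*/g, "[b]$1[/b]")    // **bold**  -> [b]bold[/b]
    .replace(/\*(.+?)\*/g, "[i]$1[/i]")        // *italic*  -> [i]italic[/i]
    .replace(/`([^`]+)`/g, "[code]$1[/code]"); // `code`    -> [code]code[/code]
}

console.log(markdownToBBCode("**Hello** *world*, `x = 1`"));
// -> [b]Hello[/b] [i]world[/i], [code]x = 1[/code]
```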

Got it to work a minute after I sent this in despair: I just used a plugin to reverse the repeating group. Not sure if this is optimal; it still seems like a lot of wasted WU… hopefully Bubble makes this easier somehow.
