New LLM streaming plugin!

Hi everyone,
I have just launched a new plugin ($55 one-time or $10/month) that makes it super easy to connect to the major LLMs: GPT, Claude and Gemini. Grok is coming in 2 weeks, with others to follow.

It supports the following:

  • Streaming with all LLMs
  • Function calling for GPT (now called tools)
  • GPT assistants with streaming
  • Webhook functionality
  • Protected API keys (if you are using a plugin that doesn’t protect them, anyone with a little know-how can very easily steal your API key)
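
To illustrate the point about protected API keys, here is a minimal sketch of the general relay pattern. It is not the plugin's actual code: the URLs, the key value, and the function names are all made up for illustration. The idea is simply that the browser's request never contains the provider key, because the key is attached on the server.

```python
# Hypothetical sketch of a key-protecting relay. Every name, URL and key
# below is an illustrative assumption, not part of the plugin.

SERVER_SIDE_API_KEY = "sk-example-not-a-real-key"  # stored only on the relay server


def browser_request(prompt: str) -> dict:
    """What the browser sends to the relay: no provider key anywhere."""
    return {
        "url": "https://relay.example.com/chat",
        "headers": {"Content-Type": "application/json"},
        "body": {"prompt": prompt},
    }


def relay_to_provider(incoming: dict) -> dict:
    """What the relay sends upstream: the key is attached server-side."""
    return {
        "url": "https://llm-provider.example.com/v1/chat",
        "headers": {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {SERVER_SIDE_API_KEY}",
        },
        "body": incoming["body"],
    }


def contains_key(request: dict) -> bool:
    """True if the provider key appears anywhere in the request."""
    return SERVER_SIDE_API_KEY in repr(request)
```

Anyone inspecting network traffic in the browser only sees the output of `browser_request`, so there is no key to copy. A plugin that calls the provider directly from the page has to ship the `Authorization` header to the browser, which is where the theft risk comes from.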

There are a ton of other features that are explained in great detail on the demo page (link below).

The plugin demo page can be found here:
LLM connector demo (bubbleapps.io)

The editor can be found here:
LLM connector demo | Bubble Editor

The plugin page can be found here:
LLM with Streaming Plugin | Bubble

This came out of my own need: all the apps I was building required LLM integrations, and I wanted a plugin with a bunch of features that would make things a lot easier. I continue to use the plugin in all my apps, so even if there are no downloads, I will still be updating it for my own benefit.

If any of the instructions are unclear on the demo page, I will be very responsive on the forums and will keep updating the demo page to make it crystal clear.

Hope you guys enjoy.

Who owns and maintains the intermediate server in the middle?

Sorry for the slow reply.

It’s my own EC2 instance on AWS.

Please let me know if you have any other questions.

I will also be releasing a feature in the next couple of weeks that allows you to make the call directly to the LLM without the middle server, for cases where you aren’t concerned about API key theft, such as an admin-only page.

Hi Paul. What type of text display container would you recommend to take advantage of your streaming scrolling functionality?

On a different topic, I’ve had a couple of instances where the OpenAI Assistant said it was going to prepare a report (file retrieval or code interpretation) and then failed to come back. It’s probably the time-out that you mentioned yesterday. I will try to reproduce it and send you the console log.

Another consideration in my use case is a user coming back and reengaging with a previous thread sometime later. I assume this is going to cause problems with the token expiring?
In my current setup, I take the OpenAI response and use a plugin to parse Markdown to BBCode so it displays properly formatted in Bubble. I assume that such a workflow step would negate the streaming?

Hi @ruimluis7
Just put a text element inside a group element and make sure you check the option “allow scrolling when content overflows” on the group element. The id of the group element should match the id field set in the “Call LLM” action.

In terms of your other question, I would have to see some error logs to determine the issue. It’s one of three things:

  1. OpenAI assistants have some issues with file retrieval, as per some forum threads on this topic, where the assistant responds saying “I don’t have the file” when in fact it does. Not sure if this is related to your issue but it’s possible.
  2. The action is timing out, but if it were, you would get a popup saying the Bubble action timed out after 30 seconds.
  3. Some other issue in the plugin that I could only diagnose by seeing your error logs.

On the token expiring: not really. If a user comes back a while later, you will just have to rerun the “Generate tokens” action. This will create new tokens that can be used in the “Call LLM” action.

In general, it’s good practice to call the “Generate tokens” action every time a user asks a question, regardless of whether they are asking their questions 5 seconds apart or 5 days apart.
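
The thread doesn’t describe how the plugin’s tokens work internally, so the following is only a rough sketch of how short-lived tokens are commonly built, to show why regenerating one per question is cheap and why an old one simply stops working. The secret, the 60-second lifetime, and the function names are all assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical values: neither the secret nor the lifetime comes from the plugin.
SECRET = b"server-side-secret"
TOKEN_TTL_SECONDS = 60


def _sign(message: str) -> str:
    """HMAC-SHA256 signature of the message, hex encoded."""
    return hmac.new(SECRET, message.encode(), hashlib.sha256).hexdigest()


def generate_token(user_id: str, now: Optional[float] = None) -> str:
    """Issue a token of the form 'user_id:expiry:signature'."""
    issued = int(now if now is not None else time.time())
    message = f"{user_id}:{issued + TOKEN_TTL_SECONDS}"
    return f"{message}:{_sign(message)}"


def is_valid(token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    try:
        user_id, expires_str, signature = token.rsplit(":", 2)
        expires = int(expires_str)
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{user_id}:{expires}")
    current = now if now is not None else time.time()
    return hmac.compare_digest(signature.encode(), expected.encode()) and current < expires
```

Under this kind of scheme, a user returning days later just holds an expired token, and generating a fresh one before each call is a single cheap operation, which matches the advice above.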

Let me know if you still have further questions.

Hi @paul29 This is great. I’ve installed the plugin and it works really well. I was testing out the call cost functionality for streaming and it doesn’t seem to work. Am I doing something wrong?

Thanks. Glad it’s working out for you. The pricing on the streaming functionality isn’t supported natively by the LLM providers, so I had to defer it as a feature. It’s getting completed this weekend along with a new feature for multi-modal capabilities with Gemini. It takes about a week for Bubble to approve updates, so they will be available for upgrade by about the end of next week or early the following week. I’ll post here when it’s ready for an update.