🤖 ᴺᴱᵂ ᴾᴸᵁᴳᴵᴺ Claude AI - ChatGPT-Like Streaming with Your App Data [Native RAG] - ( Prompt+Model Hiding, Web Search, Vision, Tools (Functions), Token Usage & Markdown + LaTeX, Prompt Caching) [Keeps your keys secure]

Hey Bubblers!

Claude is a family of foundational AI models that can be used in a variety of applications.

You can talk directly with Claude to brainstorm ideas, analyze images, create and process long documents.

Claude can help with use cases including summarization, search, creative and collaborative writing, Q&A, coding, and more.

Early customers report that Claude is much less likely to produce harmful outputs, easier to converse with, and more steerable - so you can get your desired output with less effort. Claude can also take direction on personality, tone, and behavior.

This plugin uses an external service to provide streaming capabilities.

You can test out our Claude AI - ChatGPT-Like Streaming Plugin with the live demo.


Enjoy !
Made with :black_heart: by wise:able
Discover our other Artificial Intelligence-based Plugins .


Hey guys!

The plugin has been updated to support function calling through streaming.
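For context, function ("tool") definitions in the Anthropic Messages API take a JSON-Schema shape like the sketch below. `get_weather` is purely illustrative and not part of the plugin:

```javascript
// Sketch of a tool ("function") definition in the Anthropic Messages API
// shape. The model can then emit a tool_use block naming this tool with
// arguments matching the input_schema.
const tools = [
  {
    name: "get_weather", // hypothetical example tool
    description: "Return the current weather for a city.",
    input_schema: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name, e.g. Paris" },
      },
      required: ["city"],
    },
  },
];
```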

Enjoy!

Hey Bubblers!

Informing you that the plugin has been updated with multi-image upload capability in the user prompt.
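For reference, multiple images travel in a single user message as a list of content blocks in the Anthropic Messages API. A minimal sketch (the base64 payloads are truncated placeholders, not real images):

```javascript
// Helper building an image content block in the Anthropic Messages API shape.
const imageBlock = (b64) => ({
  type: "image",
  source: { type: "base64", media_type: "image/png", data: b64 },
});

// A single user turn can mix several images with a text instruction.
const userMessage = {
  role: "user",
  content: [
    imageBlock("iVBORw0KG..."), // first uploaded image (placeholder data)
    imageBlock("iVBORw0KG..."), // second uploaded image (placeholder data)
    { type: "text", text: "Compare these two screenshots." },
  ],
};
```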

Enjoy!

Hey Bubblers!

Informing you that, of course, Claude 3.5 Sonnet is supported out of the box :wink:

Enjoy!

Hey Bubblers!

Informing you that Token Usage statistics are now available.
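For reference, the Anthropic API reports token usage on every response as `input_tokens` and `output_tokens` (plus cache-related fields when prompt caching is active). The numbers below are made up for illustration:

```javascript
// Illustrative fragment of a Messages API response; only the usage field
// matters here, and the counts are invented for the example.
const response = {
  usage: { input_tokens: 812, output_tokens: 154 },
};

// Total tokens billed for this turn.
const totalTokens = response.usage.input_tokens + response.usage.output_tokens;
```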

Enjoy!

Hey Bubblers!

This plugin has been updated with LaTeX support

Enjoy!


Would be interested in this as a stand-alone plugin (i.e. rendering LaTeX from text data).

Hey Bubblers!

Informing you that Prompt Caching has been added to this plugin.
Prompt caching can reduce costs by up to 90% and latency by up to 85% for long prompts:

| Use case | Latency w/o caching (time to first token) | Latency w/ caching (time to first token) | Cost reduction |
| --- | --- | --- | --- |
| Chat with a book (100,000-token cached prompt) [1] | 11.5s | 2.4s (-79%) | -90% |
| Many-shot prompting (10,000-token prompt) [1] | 1.6s | 1.1s (-31%) | -86% |
| Multi-turn conversation (10-turn convo with a long system prompt) [2] | ~10s | ~2.5s (-75%) | -53% |
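Under the hood, Anthropic's prompt caching works by marking a stable prompt prefix as cacheable with a `cache_control` field. A minimal sketch of such a request body, with placeholder text standing in for a long document:

```javascript
// Sketch of an Anthropic Messages API request body that opts a long system
// prompt into prompt caching. The cacheable block carries a cache_control
// marker; later requests with an identical prefix read it from the cache
// instead of reprocessing it.
const longBookText = "..."; // placeholder for a very long (~100k-token) document

const requestBody = {
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1024,
  system: [
    { type: "text", text: "You answer questions about the attached book." },
    {
      type: "text",
      text: longBookText,
      cache_control: { type: "ephemeral" }, // marks this block as cacheable
    },
  ],
  messages: [{ role: "user", content: "Summarize chapter 1." }],
};
```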

Enjoy!

Hey Bubblers!

Informing you that this plugin now supports hiding the model and prompt from your users, so you can keep the secret sauce of your solution hidden :wink:


Enjoy!

Hey guys,

The latest update supercharges your app by letting Claude automatically pull relevant data from your app database, while respecting your privacy rules, to enhance responses (RAG, a.k.a. Retrieval-Augmented Generation). No more manual searching, just seamless AI-powered insights!

:white_check_mark: Step 1: Set App Data Retrieval on the element

:white_check_mark: Step 2: Enable your App Data API for the desired fields

:tada: That’s it! Claude will now pull real-time data from your app whenever needed—making responses smarter, faster, and more accurate.

:warning: Remember: Set up your privacy rules to keep everything secure!
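If you are curious what such a retrieval call can look like, the Bubble Data API accepts search constraints as a JSON-encoded URL parameter, and privacy rules are enforced server-side for whatever credentials the request carries. A minimal sketch with placeholder app and data-type names (`myapp`, `article`):

```javascript
// Sketch of a Bubble Data API search request URL. "myapp" and "article"
// are placeholders; the constraint filters records whose body_text field
// contains the query string.
const constraints = [
  { key: "body_text", constraint_type: "text contains", value: "refund policy" },
];

const url =
  "https://myapp.bubbleapps.io/api/1.1/obj/article" +
  "?constraints=" + encodeURIComponent(JSON.stringify(constraints));
```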

Let me know if you have any issues.

Enjoy!


Hey Bubblers!

PDF reading/vision support has been added for models that support this file type, such as Haiku as of today :wink:
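For reference, PDFs are passed as `document` content blocks in the Anthropic Messages API. A minimal sketch (the base64 payload is a truncated placeholder):

```javascript
// Sketch of a user message attaching a PDF as a document content block in
// the Anthropic Messages API shape, followed by a text instruction.
const pdfMessage = {
  role: "user",
  content: [
    {
      type: "document",
      source: {
        type: "base64",
        media_type: "application/pdf",
        data: "JVBERi0xLjQ...", // truncated placeholder, not a real PDF
      },
    },
    { type: "text", text: "Summarize this PDF." },
  ],
};
```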

Enjoy!

Hey Bubblers!

Please note that this plugin has been updated with Web :globe_with_meridians: Search

Enjoy!

Hi @redvivi

I’m planning to use your Claude AI - Chat Streaming with RAG plugin in the application I’m working on. Btw, the plugin works very well. However, during testing I noticed that this plugin uses the bundle.js library, which is loaded during page load.

Because this library is 1.2 MB, page load takes approximately 300 ms longer, which significantly worsens Bubble's already poor page-load performance. At first, I assumed the library was only downloaded on pages that use your plugin's functionality (pages where the ClaudeAI-ChatStreamingwithRAG element is embedded and visible, not hidden). However, this is not the case: all pages suffer from longer load times, even those that don't use the plugin.

Additionally, on pages where the ClaudeAI-ChatStreamingwithRAG element is embedded, pdf.min.js is downloaded in addition to bundle.js, which adds roughly another 200 ms to page load.

To summarize:

  • All pages: page load increased by 300ms

  • Pages with embedded ClaudeAI-ChatStreamingwithRAG element: 500ms (bundle.js - 300ms, pdf.min.js - 200ms)

Is there any chance you could modify the plugin so that it loads external JS libraries only when the ClaudeAI-ChatStreamingwithRAG element is visible (Element visible - true) on the page?

Hey @marpas

Thanks for the detailed performance analysis; that's useful feedback on an issue I was already aware of.

The plugin does load bundle.js on all pages and pdf.min.js on pages where the ClaudeAI element is present. These resources are loaded asynchronously, which means they do not block HTML parsing or the initial render in the same way synchronous scripts do. The UI can paint while these files download, so the user can already see and interact with the page.

This significantly mitigates the impact on perceived user experience. In addition, once these files are cached on the user’s device, subsequent page loads incur minimal overhead because the browser retrieves them directly from cache rather than over the network.

That said, asynchronous loading does not eliminate their cost entirely. The scripts still add network transfer time on first load, and once downloaded, their parsing and execution occur on the main thread, which is why they appear in load-time measurements.

Regarding the loading strategy, I did explore an on-demand approach during development with the intention of loading these libraries only when the element was visible or actively used. In practice, this led to several issues.
These libraries expose global symbols that the plugin depends on, and when they were loaded dynamically, there was no reliable guarantee—within Bubble’s plugin environment—that these globals would be available by the time the plugin code needed them. This resulted in race conditions and unpredictable behavior, especially on pages with multiple element instances or quick visibility changes. Furthermore, Bubble does not provide the level of script-loading control needed to coordinate deferred loading in a fully reliable way.
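To illustrate the pattern under discussion, a promise-based on-demand loader might look like the sketch below. `loadScriptOnce` is hypothetical, not the plugin's actual code, and the `win`/`doc` parameters exist only so the function can be exercised outside a browser. Even with the onload guard, any plugin code that runs before the promise resolves cannot safely touch the library's global:

```javascript
// Hypothetical sketch of on-demand script loading: inject a <script> tag and
// resolve only once the library's global symbol exists. Code that runs before
// the promise settles still has no guarantee the global is available, which
// is the race condition described above.
function loadScriptOnce(src, globalName, win = window, doc = document) {
  if (win[globalName]) return Promise.resolve(win[globalName]); // already loaded
  return new Promise((resolve, reject) => {
    const s = doc.createElement("script");
    s.src = src;
    s.async = true;
    s.onload = () => {
      // onload fires after the script executes, so the global should exist
      // now, but only if the library assigns it synchronously at top level.
      win[globalName]
        ? resolve(win[globalName])
        : reject(new Error(globalName + " not defined after load"));
    };
    s.onerror = () => reject(new Error("failed to load " + src));
    doc.head.appendChild(s);
  });
}
```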

Because these issues produced inconsistent behavior across devices, browsers, and loading scenarios, I chose the current approach. Loading the libraries up front ensures stable, predictable availability for all element instances, avoids timing issues, and benefits from caching after the first load.

This change isn’t planned in the short term, but I may revisit the loading strategy in a future refactoring.

For now, the most practical mitigation is to include the element(s) only on pages where the functionality is actually needed.