Any bubble compatible options for AI querying your data?

My team is looking to perform AI queries on our databases. Would this look like storing our data with something like Xano, and periodically exporting to an AI tool to perform queries on it?

The silence is staggering! We are working with @cerum on trying to find ways to ingest Bubble data directly to an LLM. No progress thus far as we seem to be forced to create a file with the Bubble data and then ingest the file. Feel free to reach out to either of us to get an update.

To be clear, this doesn’t create an LLM with your data, but it can help you ingest and manage which parts of your data go into an AI model which is the next best thing at the moment, especially when using a RAG approach.

I’m looking to do the same. If there is a solution (or if I find one), I’ll post it here as well.

1 Like

What other method would theoretically solve for what you are trying to do?!?

1 Like

@sanastasi @bidlist @saviorabrams
I might be missing some context but there is definitely a way to query dynamic data from Bubble to LLM upon user query.

This is called “Functions”. Some functions are called when the LLM detects that the requested data from the conversation requires to get data from such functions.

At this point, nothing prevent you from feeding back the function results to the LLM which will then answer with the queried data.

I have implemented this kind of use-cases using some of the following plugins supporting functions calling along with demo.:

@redvivi @bidlist @saviorabrams the term “querying data” can mean many things. In my instance (and the context of this thread I think) we mean the use of unstructured Bubble data to answer AI queries. For this to happen we need two things. The first is a way to vectorize the Bubble data and store it and the second is a way to do a Vector Similarity Search on that vectorized data so we can pass it to the AI model. Functions are great only if the initial query gives you the objective parameters you need to return an empirical datapoint.

I’ve built a workable RAG model with one Bubble partner (@cerum) who helped us create a database off platform for the vectorized data. It works, but it is in Supabase and at scale, it is too cumbersome. We have to track if a Bubble record being used as relevant data has changed and then update its vectorized cousin in Supabase every time it changes. This has to be done by converting the Bubble data into a file and then sending it to a chunking API. It works, but its a LOT of overhead to manage and my customers ask, “Where are you sending our data?”

I’m working with another Bubble partner (@launchable) who believes the vectorized data can be stored in a Bubble text field and then passed to the AI model by running a Vector Similarity Search through a plugin. While the actual processing is done off platform, the vectorized data is within the Bubble database and the VSS action is just a workflow action. It’s an elegant solution that might prove to be more scalable than the first. The problem is that it simply doesn’t work at this point in time. If it does, I’ll update this thread.

So, in summary, unless there is another partner or plugin out there that has this cracked, we can still not use Bubble resident data in a true RAG workflow unless you want to create a separate database off platform to hold the vectorized version of it. I would have thought that this is something Bubble would want to solve as any other approach makes Bubble a non-starter for anyone wanting to use the Bubble data they have collected as the basis for answering AI queries.

Neo4j. Insane stuff. Starting to learn and implement it myself for future products. Will handle and query your data much faster. Built for AI.

I’m really adamant on learning the Bloom system.

As far as Bubble implementation, I just feel it wouldn’t be as performant (as of right now) for a vectorized DB. I can say I’m sure they’re working on that possibility in the future likely.

And yes, you could make it compatible with Bubble, however, you would need to create an API to query.

I know this is an option but a list of 1536 numbers stored on each Thing, and having to send all of that data out of Bubble’s servers seems bad for performance and very inefficient.

Scalable? I don’t know… You literally would have to send ALL of the vectors you want to query to this other server, which could easily be very large (and by extension, costly).

It is really trivial to set up.

‘Memory’ data type which corresponds to a vector in vector DB. Memory has content (text) and pineconeID (if using pinecone) which is the Pinecone vector ID (used for editing/deletion). Upsert with metadata for advanced querying.

Setup backend trigger to run when thing is created/updated/deleted as necessary. Or, if you only want certain stuff to be added to the index, just set up the logic another way (for example, I have plenty of apps where users upload files - we extract the text, chunk, then upsert, all in the backend, and have a DB trigger that runs when the file or its metadata is modified).

Re. file conversion, even if you can’t code, Claude 3.5 will code an effective file conversion serverless function for you. My client’s apps use a function that extracts virtually any file, includes OCR, can be hosted in any country, and has negligible costs.

As long as you effectively modularise your logic, keeping it in the backend, and using DB triggers if your use case warrants it, using Bubble + vector DB is perfectly stable, reliable, and efficient.

2 Likes

Besides technicalities, what’s your use-case querying all possible data in a Bubble app?

Our use case is to allow users to be able to query previous answers to a problem so they can be informed of the best way to solve their problem. This requires all previous answers to be vectorized. It’s relatively simple in theory, but as this thread highlights, it is not easy in Bubble and to do it well at the moment requires storage and workflow coordination with another database. Add the AI model in there you have a minimum of 3 systems. Having a good VSS or series of functions to run a proper ingest process and you now have 4. It’s all doable. It is relatively easy to set up, but it’s unnecessarily complicated, not competitive in the current landscape and probably not something that will scale well.

I promise you, it can… I’m working on an app with millions of chunks of texts which work great with no problem…

  • question data type
  • set up CRUD calls to Pinecone
  • use DB trigger to upsert/update/delete embeddings when Question is created/edited/deleted
  • query knowledge base using an LLM when you want an answer

Bubble x Pinecone x LLM, that’s it.