You can do it all with Bubble plus API calls to OpenAI (embedding creation/LLM) and Pinecone (vector store, semantic search).
I have started doing it; the only tricky part is the chunking side, which is probably part of the reason you want LangChain.
I'm not a coder, but from my research, chunking has been part of a developer's toolset for some time, which means we can build a plugin in JS to chunk for us.
So I asked ChatGPT4 for some help building a plugin to chunk, and it did it perfectly, with overlapping and everything. I'm not confident enough to publish the plugin since I just copy/pasted the code and only tested it a little, but I'm happy to send it over.
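To give you an idea of how small this can be, here's a minimal sketch of fixed-size chunking with overlap in plain JS. The `chunkSize` and `overlap` values are just illustrative defaults, not recommendations; tune them for your documents:

```javascript
// Split text into chunks of `chunkSize` characters, where each chunk
// repeats the last `overlap` characters of the previous one so context
// isn't lost at the boundaries.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap; // advance, keeping `overlap` chars of shared context
  }
  return chunks;
}
```

So `chunkText(longDocument)` gives you chunks covering characters 0–999, 800–1799, and so on.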
Once chunking is done, you'll have a list of a document's chunks. Then you need to:
- Embed each chunk using OpenAI's embedding API
- POST the resulting embedding in a Pinecone Upsert call, adding the chunk's text to the metadata under a key like "Content"
- Now the vector store is ready to query and plug into your LLM conversation system (a rough sketch of the whole flow is below)
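Here's a minimal sketch of that embed → upsert → query flow using plain `fetch` calls (Node 18+). The index URL, the `"Content"` metadata key, the vector ids, and the embedding model are all placeholders/assumptions; check the current OpenAI and Pinecone API docs for the exact request shapes before relying on this:

```javascript
const OPENAI_KEY = process.env.OPENAI_API_KEY;
const PINECONE_KEY = process.env.PINECONE_API_KEY;
// Placeholder: your index's URL is shown in the Pinecone dashboard.
const PINECONE_INDEX_URL = "https://your-index-xxxxxxx.svc.your-env.pinecone.io";

// 1. Embed one piece of text with OpenAI's embeddings endpoint.
async function embed(text) {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${OPENAI_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: text }),
  });
  const json = await res.json();
  return json.data[0].embedding;
}

// 2. Upsert each chunk's vector into Pinecone, keeping the raw text in metadata.
async function upsertChunks(chunks) {
  const vectors = await Promise.all(
    chunks.map(async (chunk, i) => ({
      id: `doc-chunk-${i}`, // hypothetical id scheme
      values: await embed(chunk),
      metadata: { Content: chunk },
    }))
  );
  await fetch(`${PINECONE_INDEX_URL}/vectors/upsert`, {
    method: "POST",
    headers: { "Api-Key": PINECONE_KEY, "Content-Type": "application/json" },
    body: JSON.stringify({ vectors }),
  });
}

// 3. Query: embed the user's question, ask Pinecone for the nearest chunks,
//    and return their stored text to feed into your LLM prompt.
async function query(question, topK = 3) {
  const res = await fetch(`${PINECONE_INDEX_URL}/query`, {
    method: "POST",
    headers: { "Api-Key": PINECONE_KEY, "Content-Type": "application/json" },
    body: JSON.stringify({
      vector: await embed(question),
      topK,
      includeMetadata: true,
    }),
  });
  const json = await res.json();
  return json.matches.map((m) => m.metadata.Content);
}
```

In Bubble you'd wire these same three calls up via the API Connector instead of code, but the request/response shapes are the same.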
This is super high level, but you don't need LangChain or another DB to get this process up and running.
The way I learnt was by watching heaps of videos on Pinecone's YouTube channel of developers showcasing their work and then doing a Q&A; that helped me understand all the concepts fast.
BTW, there are many different methodologies for using a vector store. I showed you one where you put the content in the metadata, BUT that has its own pros and cons: the added content keeps the process "simple" but also slows down the query, since there's more content to fetch. Other people only add the UID of the object (thing) and keep the content in their own DB (for us that could be Bubble).
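To make the difference concrete, here are the two payload shapes side by side. The field names (`Content`, `bubbleUid`) and values are hypothetical; the point is only what goes into metadata:

```javascript
const embedding = [/* 1536 floats from the embeddings API */];

// Option A: content in metadata — simpler, one round trip, but every
// query response carries the full chunk text.
const withContent = {
  id: "doc-chunk-0",
  values: embedding,
  metadata: { Content: "the full chunk text lives here..." },
};

// Option B: UID only — lean queries; after a match you use the stored
// UID to look the text back up in your own DB (e.g. a Bubble thing).
const uidOnly = {
  id: "doc-chunk-0",
  values: embedding,
  metadata: { bubbleUid: "1681234567890x123" },
};
```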
Like you said, the tech is new right now and next to nobody KNOWS the "right" way to go about this. There are so many places where people can test their creativity, but just know that you most likely won't find the ONE way to do this, because there are many roads to a similar process.