Integrating ChatPGT Model and Vision

lisustar · August 24, 2024, 10:36pm

Hi everyone, I deeply hope someone can help me.
Within my app, DALL-E will randomly generate a picture, showing 3 different objects. The user is asked to write a story based on the given pictures. Finally there is a workflow which checks, if the presented pictures are part of the story. Here comes the problem:
The story will be told in German and therefore the examination, if the given pictures are part of the story, is based on the German words for the generated objects. In order to extract the names of the objects, I want to have the image analysed by ChatGPT Vision and save the names of the objects in a Custom State. When the user has written the story, this story should be compared with the Custom State to check whether all objects are named in the story. How is this done with the Vision API?

jonathan.hernandez · August 26, 2024, 2:10am

Hey @lisustar Do what you want to be doing is:
Set up the Vision API in the connector (make sure it’s set up as an action)
Trigger an action to use the vision API
Save the data into a custom state (all is vision response = text) use Step 2 response to save into the custom state
Another action - I would then make another API call to chat GPT to compare your custom state text with the text that is in the story. ( do this in backend workflow I reckno based on when the input text changes)
If there’s a difference you can then tell the user or report it in the backend whatever you need to do. Also, we have an online community where there are tons of people building awesome Bubble and AI products - it’s free to join here -Lightning Accelerator

Cheers,

lisustar · August 26, 2024, 7:22am

Hi @jonathan.hernandez,
thanks so much for you reply. I’ve saved the url of the created image in the custom state. But when I want to use this url in the next call to chat GPT, I receive a failure message saying, that Chat GPT can’t use the url since it’s not able to search in the internet. Do you have an idea how to solve it?

system · November 2, 2024, 10:37pm

This topic was automatically closed after 70 days. New replies are no longer allowed.

Topic		Replies	Views
:robot: ᴺᴱᵂ ᴾᴸᵁᴳᴵᴺ Google AI Gemini On Your Data [Native RAG] (Prompt/Model Hiding, Vision, Tools (Functions), Token Usage & Markdown + LaTeX, Prompt Caching) [Keeps your keys secure] Plugins	9	1054	February 13, 2025
Sending multiple dynamic image URL via API APIs	8	252	May 30, 2024
Issue with Retrieving Text Response from ChatGPT Vision API APIs	3	771	November 8, 2023
[NEW PLUGIN] ChatGPT Async Plugin by did.lu Plugins	0	342	September 4, 2023
Sending an Image to GPT/Dalle Api APIs	0	411	August 20, 2023

Integrating ChatPGT Model and Vision

Related topics