Integrating ChatPGT Model and Vision

Hi everyone, I deeply hope someone can help me.
Within my app, DALL-E will randomly generate a picture, showing 3 different objects. The user is asked to write a story based on the given pictures. Finally there is a workflow which checks, if the presented pictures are part of the story. Here comes the problem:
The story will be told in German and therefore the examination, if the given pictures are part of the story, is based on the German words for the generated objects. In order to extract the names of the objects, I want to have the image analysed by ChatGPT Vision and save the names of the objects in a Custom State. When the user has written the story, this story should be compared with the Custom State to check whether all objects are named in the story. How is this done with the Vision API?

Hey @lisustar Do what you want to be doing is:
Set up the Vision API in the connector (make sure it’s set up as an action)
Trigger an action to use the vision API
Save the data into a custom state (all is vision response = text) use Step 2 response to save into the custom state
Another action - I would then make another API call to chat GPT to compare your custom state text with the text that is in the story. ( do this in backend workflow I reckno based on when the input text changes)
If there’s a difference you can then tell the user or report it in the backend whatever you need to do. Also, we have an online community where there are tons of people building awesome Bubble and AI products - it’s free to join here -Lightning Accelerator

Cheers,

Hi @jonathan.hernandez,
thanks so much for you reply. I’ve saved the url of the created image in the custom state. But when I want to use this url in the next call to chat GPT, I receive a failure message saying, that Chat GPT can’t use the url since it’s not able to search in the internet. Do you have an idea how to solve it?

This topic was automatically closed after 70 days. New replies are no longer allowed.