Video Analysis with OpenAI (extract frames)

Hi Bubble community - I am building a video library app, and I need to run a video through OpenAI to get tags for it. I know that OpenAI doesn't accept video files as input, but there's a workaround: extract frames from the video and upload those to be analyzed. OpenAI themselves refer to the article below for this use case; however, it requires Python, which goes beyond my abilities.
Processing and narrating a video with GPT's visual capabilities and the TTS API | OpenAI Cookbook
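For reference, the Cookbook approach boils down to sampling frames with OpenCV and base64-encoding them for the API. A minimal sketch of that step (the file name and sampling interval are just example values):

```python
import base64
import cv2  # pip install opencv-python

def extract_frames(video_path, every_n_seconds=5):
    """Grab one frame every N seconds and return them as base64-encoded JPEGs."""
    video = cv2.VideoCapture(video_path)
    fps = video.get(cv2.CAP_PROP_FPS) or 30  # fall back if FPS is unreadable
    frames, frame_index = [], 0
    while True:
        ok, frame = video.read()
        if not ok:
            break
        if frame_index % int(fps * every_n_seconds) == 0:
            ok, buffer = cv2.imencode(".jpg", frame)
            if ok:
                frames.append(base64.b64encode(buffer).decode("utf-8"))
        frame_index += 1
    video.release()
    return frames

frames = extract_frames("clip.mp4", every_n_seconds=5)
print(f"Extracted {len(frames)} frames")
```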

Alternatively, if someone has experience implementing such a feature with Google Cloud (Vertex AI, Gemini, Video Intelligence API, etc), any help is welcome.

Thank you!


Is your input a live stream or a video file?
What analysis do you want to run?

This can be done easily with the Azure OpenAI API using gpt-4-vision with enhancements, which supports video as an input: it picks out the most important frames from the video and passes them to the model. So you don't need to write any code yourself, just use the Azure OpenAI API.

The input is an mp4 file, around 15-30 seconds. Basically, I want to use OpenAI to generate tags (e.g. about what's in the video) that I can then store in the Bubble database to make these videos easier to search.
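For what it's worth, once a handful of base64-encoded frames exist (e.g. from the OpenCV sketch above), the tag-generation call itself is short. A sketch, where the model name and prompt are just example choices:

```python
import os
from openai import OpenAI  # pip install openai

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

frames = []  # fill with base64-encoded JPEGs, e.g. from the sketch above

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder -- use whichever vision-capable model you have
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "These are frames from one short video. Return 5-10 "
                     "comma-separated tags describing what is in the video."},
            *[{"type": "image_url",
               "image_url": {"url": f"data:image/jpeg;base64,{frame}"}}
              for frame in frames],
        ],
    }],
)
tags = response.choices[0].message.content
print(tags)
```

The returned string could then be stored on the video's thing in the Bubble database, e.g. via the API Connector and a backend workflow.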

That’s a good hint. Thank you. Let me try that.

For indexing purposes, backend processes are better suited.

I would rather suggest using a service that already meets your requirements and is built for this purpose, running at a cheaper price.

See

Is this something you have done, and could you share the setup in Azure? I haven't been able to upload video files there either.

After getting access to Azure OpenAI:

1. Deploy gpt-4-vision (gpt-4v) in the Deployments section.
2. Open the chat playground.
3. Enable Azure AI Vision under the enhancements settings.
4. Click "View code" and select the cURL snippet.
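For reference, the generated snippet is roughly equivalent to the Python below. The endpoint shape and especially the enhancements/dataSources fields come from the 2023-12 preview API and may differ in your version, so treat them as placeholders and copy the exact payload from the "View code" button:

```python
import os
import requests

endpoint = "https://YOUR-RESOURCE.openai.azure.com"  # placeholder resource
deployment = "gpt-4-vision"                          # your deployment name
url = (f"{endpoint}/openai/deployments/{deployment}/extensions/chat/completions"
       "?api-version=2023-12-01-preview")

payload = {
    "messages": [{
        "role": "user",
        "content": [{"type": "text", "text": "Describe this video with tags."}],
    }],
    # Video enhancement fields as of the 2023-12 preview -- verify against
    # the snippet the playground generates for your API version.
    "enhancements": {"video": {"enabled": True}},
    "dataSources": [{
        "type": "AzureComputerVisionVideoIndex",
        "parameters": {
            "computerVisionBaseUrl": "https://YOUR-VISION.cognitiveservices.azure.com/computervision",
            "computerVisionApiKey": os.environ["AZURE_VISION_KEY"],
            "videoIndexName": "my-video-index",  # hypothetical index name
            "videoUrls": ["https://example.com/clip.mp4"],
        },
    }],
    "max_tokens": 300,
}

response = requests.post(url, json=payload,
                         headers={"api-key": os.environ["AZURE_OPENAI_KEY"]})
print(response.json()["choices"][0]["message"]["content"])
```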

Thank you everyone for your help. Both approaches would work, but they were too complex for my use case. I needed to deploy Storage and AI Search to enable Vision, which was expensive at >$70/month. The AWS method would also work, but it extracted too many frames and was too slow.

I went with a much simpler approach: I use the Shotstack API to extract a few (3-5) frames from a video and then feed these frames into the OpenAI API. As I am not expecting high volumes for my prototype, this approach is cheap, quick, and flexible (I can now use all OpenAI features, like describing a scene, etc.).
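For anyone wanting to replicate this, the Shotstack side looks roughly like the sketch below: render the clip to one JPEG per timestamp via their Edit API. The payload shape and the jpg output format are my reading of their docs, so double-check before relying on it; the resulting frame URLs then go into an OpenAI call like the one sketched earlier in the thread.

```python
import os
import requests

# Stage environment of Shotstack's Edit API -- check the docs for the
# current base URL and for the production endpoint.
SHOTSTACK_URL = "https://api.shotstack.io/edit/stage/render"

def render_frame(video_url, at_seconds):
    """Queue a render of the frame at `at_seconds` as a JPEG (sketch)."""
    payload = {
        "timeline": {"tracks": [{"clips": [{
            # `trim` sets the clip's in-point, so the rendered image should
            # be the frame at `at_seconds` (assumption -- verify in the docs).
            "asset": {"type": "video", "src": video_url, "trim": at_seconds},
            "start": 0,
            "length": 1,
        }]}]},
        "output": {"format": "jpg"},  # image output = a single frame
    }
    resp = requests.post(SHOTSTACK_URL, json=payload,
                         headers={"x-api-key": os.environ["SHOTSTACK_KEY"]})
    return resp.json()["response"]["id"]  # poll GET .../render/{id} for the URL

# e.g. three evenly spaced frames of a 30-second clip
render_ids = [render_frame("https://example.com/clip.mp4", t) for t in (5, 15, 25)]
```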
