✍ ᴺᴱᵂ ᴾᴸᵁᴳᴵᴺ Google Video AI - Transcribe Video (incl. Automated Google Environment Setup!)

Hi Bubblers !

With this plugin, you can transcribe the spoken audio in a video into text. It returns blocks of text for each portion of the transcribed audio, along with the speaker, for a .MOV, .MPEG4, .MP4, .AVI, or any other ffmpeg-decodable video file provided as input.

Use cases range from automated captioning, simple archiving, categorising, and enhanced search across your video portfolio to SEO improvement.

The supported languages are listed here: Speech-to-Text supported languages  |  Cloud Speech-to-Text Documentation  |  Google Cloud

The plugin provides :

  • a Visual Element to detect the video duration,
  • a first Workflow Action to trigger the analysis,
  • a second Workflow Action to return the analysis progress rate, completion status, and when completed, a list of transcriptions. For each, it returns a list of words with related timestamps, confidence rate, and the speaker(s).
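To make the shape of that result easier to picture, here is a minimal Python sketch. The class and field names (`Word`, `Transcription`, `start_s`, `speaker`, and so on) are assumptions made up for illustration, not the plugin's actual field names. It shows how a list of transcriptions could be collapsed into a plain transcript, or grouped into speaker turns.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    # Hypothetical field names; the plugin's real output may be labelled differently.
    text: str
    start_s: float      # timestamp where the word starts, in seconds
    end_s: float        # timestamp where the word ends, in seconds
    confidence: float   # confidence rate between 0 and 1
    speaker: int        # speaker tag assigned by diarization


@dataclass
class Transcription:
    words: List[Word]


def plain_transcript(transcriptions: List[Transcription]) -> str:
    """Join every word of every transcription into one text string."""
    return " ".join(w.text for t in transcriptions for w in t.words)


def speaker_turns(transcriptions: List[Transcription]) -> List[Tuple[int, str]]:
    """Group consecutive words from the same speaker into (speaker, text) turns."""
    turns: List[Tuple[int, List[str]]] = []
    for t in transcriptions:
        for w in t.words:
            if turns and turns[-1][0] == w.speaker:
                turns[-1][1].append(w.text)
            else:
                turns.append((w.speaker, [w.text]))
    return [(speaker, " ".join(words)) for speaker, words in turns]
```

If only the text is needed, something like `plain_transcript` is all that has to be stored; the timestamps and confidence values can be ignored.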

You can test out our Google Video AI - Transcribe Video Plugin with the live demo here.


Enjoy !
Made with :black_heart: by wise:able
Discover our other Artificial Intelligence-based Plugins


Is there an option for a simplified version in which we just send a request with an HTTP address and get back just the transcript (without all the extra data), without the "do every 5 seconds" step and all of that?
It’s almost impossible to add a "do every 5 seconds" workflow to a big system; it would make things heavy.


Thanks for your message.

As mentioned in the instructions, the implementation of this plugin is asynchronous: a request is sent first, processed in Google Cloud Platform, and once completed, the result is returned when the requester asks for it.

Asynchronicity is required for large media such as long audio and especially video, because the Bubble.io platform only allows an action to run for a limited amount of time, typically around 30 seconds, before the Bubble engine kills it (also known as timing out).
This duration is simply too short for GCP to process the video file and get the transcript back, hence the asynchronous operation.

Even if it were synchronous (i.e. what you are requesting) and there were no timeout, the action would run and hang the application for dozens of seconds, if not minutes, pending completion on the GCP side, which is not sustainable and goes against architectural best practices.

You can find more information about these operations here: Long-running operations  |  Cloud Video Intelligence API Documentation

Alternatively, feel free to change the polling interval: we used 5 seconds in our demo, but it can be any other value.
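For readers who think better in code, the request-then-poll pattern described above looks roughly like the Python sketch below. It is only an illustration of the general idea, not the plugin's implementation: `check_status` is a hypothetical callable standing in for the second Workflow Action, and the `done` key is an assumed name.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    check_status: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call check_status repeatedly until it reports completion.

    check_status is a hypothetical function returning a dict with an
    assumed "done" flag. Each call is short, so no single call has to
    stay open for the whole duration of the analysis.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = check_status()
        if status.get("done"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("analysis did not complete within the timeout")
        # Wait before asking again; 5 seconds is just the demo value.
        sleep(interval_s)
```

A longer `interval_s` means fewer status checks, at the cost of noticing completion a little later.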

Should you require any further info, feel free to DM us so we can investigate your specific case.

Thanks for the elaborate reply!
Is there an option to make a call and then schedule a workflow in, say, 2 minutes' time (by then the API response is most likely ready) and retrieve the transcription into the desired field?

Sure, simply enter a different value in the Action start interval, as mentioned before:

If you do not want some actions to run when no analysis is expected, use the “Only when” field in the workflow. The action will then execute only when the condition you define is met, as done in our demo.

I’ll just elaborate on my situation and maybe that will clear things up.
I have a screen where a user can upload several videos. For each of these videos, I would like to get a transcript saved into a data type that holds the video URL and a transcript (text) field. The number of videos is dynamic and changing.
How can that be achieved?

Hey @lankri.erez,

Please send us your editor link and access to your app by DM; we would like to have a look.


And yes, only for you Bubblers, this plugin now supports speaker diarization :love_you_gesture:

This is super exciting and just what I am looking for!

I was wondering if there is a cost (presumably from Google) to make calls out to their API? I see the cost for the plugin but am unfamiliar with Google’s side of the house. :slight_smile:

Hey @rick.mooberry !

You will find the associated pricing here: Pricing  |  Cloud Speech-to-Text  |  Google Cloud