I am building an AI subtitles tool using Bubble.
We are using Whisper to transcribe videos and return “lines” to Bubble. Occasionally the transcription contains inaccurate words, so I need to let the user amend the lines before confirming the subtitle creation process.
The user flow is:
- User uploads video
- Video is sent to our backend server for transcription (via the API Connector)
- The backend returns the transcription in linelevel format (see below)
- An API workflow on a list runs through the linelevel items and creates a new entry in a Bubble table, linelevel_info, for each one
- I display the linelevel_info entries from the Bubble table in a repeating group, with a text input bound to each line of text. This is so that the entry updates whenever the user makes changes.
- User confirms, and the linelevel_info from the Bubble DB is sent to our backend for processing
Here is the linelevel_info structure that comes back from Whisper after transcription:
"linelevel_info": [
  {
    "end": 1.54,
    "line": "Stop wasting your",
    "start": 0.0
  },
  {
    "end": 2.62,
    "line": "time on ads when",
    "start": 1.54
  }
]
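To make the shape concrete, here is a minimal Python sketch of how the payload could be read and sanity-checked on the backend side. The `Line` class, the `parse_linelevel` helper, and the assumption that the fragment above sits inside a top-level JSON object are mine for illustration; they are not part of Whisper or Bubble.

```python
import json
from dataclasses import dataclass


@dataclass
class Line:
    """One subtitle line with its start and end time in seconds."""
    start: float
    end: float
    line: str


def parse_linelevel(payload: str) -> list[Line]:
    """Parse a JSON string holding a "linelevel_info" array into Line objects."""
    data = json.loads(payload)
    lines = [
        Line(start=float(item["start"]), end=float(item["end"]), line=item["line"])
        for item in data["linelevel_info"]
    ]
    # Basic sanity check: a line should not end before it starts.
    for entry in lines:
        if entry.end < entry.start:
            raise ValueError(f"line ends before it starts: {entry.line!r}")
    return lines


# Sample payload built from the two lines shown above.
# Assumption: the array is wrapped in a top-level JSON object.
SAMPLE = (
    '{"linelevel_info": ['
    '{"end": 1.54, "line": "Stop wasting your", "start": 0.0},'
    '{"end": 2.62, "line": "time on ads when", "start": 1.54}'
    "]}"
)
```

Calling `parse_linelevel(SAMPLE)` gives two `Line` objects in the same order as the array, which is the order the repeating group needs to preserve.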
The issue I’m having is speed. The transcribe API returns the linelevel data in around 10s.
It then takes 30-60s (sometimes longer if there are more lines) to run the API workflow on a list and create each of the lines in the Bubble DB. That seems very slow just to create items in a database.
I may be doing this the wrong way, so I am looking for ideas to make this much faster (if there is a way). Ideally I want the lines to show in the repeating group, and be editable by the user, within 15-20s of the transcription being kicked off.
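One idea I have not tried yet: instead of one workflow run per line, the backend could push all the lines to Bubble in a single request. As far as I understand it, Bubble's Data API has a bulk create endpoint that accepts a plain-text body with one JSON object per line. Below is a rough Python sketch of building that body. The field names `order` and `video_id`, and the helper `build_bulk_body`, are hypothetical and would need to match whatever fields the linelevel_info data type actually has.

```python
import json


def build_bulk_body(linelevel_info, video_id):
    """Build a newline-separated JSON body, one object per subtitle line.

    Assumption: the Bubble data type has fields named line, start, end,
    plus hypothetical "order" and "video_id" fields for sorting and lookup.
    """
    rows = []
    for index, item in enumerate(linelevel_info):
        rows.append(json.dumps({
            "line": item["line"],
            "start": item["start"],
            "end": item["end"],
            "order": index,        # keeps the lines sortable in the repeating group
            "video_id": video_id,  # ties each line back to the uploaded video
        }))
    return "\n".join(rows)


# The two sample lines from the transcription response above.
sample_lines = [
    {"end": 1.54, "line": "Stop wasting your", "start": 0.0},
    {"end": 2.62, "line": "time on ads when", "start": 1.54},
]
body = build_bulk_body(sample_lines, video_id="example-video")
```

If I have read the docs correctly, this body would be sent as a POST with `Content-Type: text/plain` and an API token to the data type's `/bulk` endpoint under the app's Data API. That would replace the per-line workflow runs with a single call, though I have not measured whether it actually meets the 15-20s target.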