
How can I store this data correctly from a JSON API response?

I currently have a transcription service set up to transcribe video files, and a GET request set up on a repeating group (RG) to display the transcript when it’s complete/ready.

However, I’m realizing that instead of doing it this way with an API call, I need to store the transcript in the Bubble database. Why? Because users will need to be able to edit the text if the transcription is incorrect, which happens.

Question: How do I structure my database to store this information? I’ve included a sample JSON response below which gives you an idea of the data needed and hierarchy (with the “words” array being one of the most vital parts). It HAS to be done in a way where I can display words one-by-one as I’m currently doing. I’d GREATLY appreciate any ideas :slight_smile:

One db entry per word sounds like a lot of db entries. Maybe not depending on number of vids, length, words spoken, etc ?

Depending on what you want to do, you may run into compute problems simply creating all the db entries (one per word) for your videos, let alone dealing with them (searching, loading and stringing them together in your interface). Just food for thought.

IMO, I’d first investigate if creating that many entries is feasible and what it does for your performance.

To save a bit on DB compute, I’d look at ways to store the returned JSON (one sentence + the array of ‘words’ inside it) from your API directly in the db as text, and then read and write the JSON data using a plugin (jsonator, or custom JS using Toolbox).

This approach is considerably more complex though and somewhat advanced since you are not ‘doing it the bubble way’.
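As a rough illustration of the "store the JSON as text, then read and write it" idea, here is a minimal sketch in Python (inside Bubble this logic would live in a plugin or Toolbox JavaScript instead). The field names `text`, `words`, `start` and `end`, the millisecond timings, and the helper `correct_word` are all assumptions for illustration, not the actual API response.

```python
import json

# Hypothetical stored value: one sentence plus its "words" array, saved as text.
# Timings are assumed to be milliseconds.
stored_text = json.dumps({
    "text": "The lazy brwn dog",
    "words": [
        {"text": "The", "start": 0, "end": 400},
        {"text": "lazy", "start": 400, "end": 900},
        {"text": "brwn", "start": 900, "end": 1300},
        {"text": "dog", "start": 1300, "end": 1700},
    ],
})


def correct_word(raw: str, index: int, new_text: str) -> str:
    """Return the JSON text with one word corrected; timings are left untouched."""
    sentence = json.loads(raw)
    sentence["words"][index]["text"] = new_text
    # Rebuild the sentence text so it stays in sync with the words array.
    sentence["text"] = " ".join(w["text"] for w in sentence["words"])
    return json.dumps(sentence)


# The returned string is what would be written back to the text field.
updated_text = correct_word(stored_text, 2, "brown")
```

The point of the design is that a user edit touches a single text field per sentence rather than one database entry per word.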

I think you’re likely right. Someone else told me after posting this that storing each word separately is not likely to be viable.

However, as a non-developer, I’m struggling to figure out this approach. I’m willing to learn, but would you mind elaborating on this a little?

How would I structure my data type(s) and fields to make this work?

Thank you.

Hey @alex.pethick

Most transcription APIs return JSON like this, and converting it into sentences that are more nicely formatted for Bubble storage/display usually involves some intermediate processing. We have a tutorial of doing this with AWS Transcribe that uses a bit of code (link below, if you’re interested), but that may not be what you’re looking for. I haven’t had much success extracting deeply nested JSON like transcripts into flat objects (like sentences) without using some intermediary code, but I’d love to hear if other folks have managed this within Bubble and without code.

Anyway, to make the text editable, can you just load the sentence (paragraphs[“text”]) into a multi-line input, and maybe store the edited text in a separate field (e.g., text-edited), either triggered when the input changes or the user clicks save?
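A minimal sketch of the "keep the original, store the edit separately" idea, written in Python purely for illustration. The `Sentence` type and the `display_text` helper are hypothetical; `text_edited` stands in for the text-edited field mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sentence:
    text: str                          # original text from the transcription API
    text_edited: Optional[str] = None  # the user's correction, if any

    def display_text(self) -> str:
        """Prefer the user's edit, falling back to the original transcription."""
        return self.text_edited if self.text_edited is not None else self.text


sentence = Sentence(text="The lazy brwn dog")
sentence.text_edited = "The lazy brown dog"  # e.g. when the user clicks save
```

Keeping both fields means the original transcription is never lost, so an edit can be undone.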

Thank you @launchable I’ll take a look at the video. However, I need each word of the sentence or even just the whole transcript to display in a separate cell word-by-word so I’m not sure this approach would work, would it? Is there another way to approach this?

I currently have a working prototype using just the API to display the transcript word-by-word, however, it doesn’t allow me the edit function - so I need to store it.

The reason I’m displaying word-by-word in separate cells (instead of just the whole sentence) is:

  • With the API setup, I can access that particular word’s start/end time for precise timing info. Which leads me into this next thing…
  • So a user can click on that exact word and it’ll take them to the specific moment within a video
  • A user can highlight words and the app knows the exact timing of the start of that highlight and the end of the highlight
  • change color of cell / word upon hover
  • And there’s prob other reasons
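To make the click and highlight requirements concrete, here is a small Python sketch. It assumes each word carries `start` and `end` times in milliseconds; the sample data and the helpers `seek_time` and `highlight_span` are hypothetical.

```python
from typing import List, Tuple, TypedDict


class Word(TypedDict):
    text: str
    start: int  # assumed milliseconds from the start of the video
    end: int


words: List[Word] = [
    {"text": "The", "start": 0, "end": 400},
    {"text": "lazy", "start": 400, "end": 900},
    {"text": "brown", "start": 900, "end": 1300},
    {"text": "dog", "start": 1300, "end": 1700},
]


def seek_time(words: List[Word], index: int) -> int:
    """Time to jump to when the user clicks the word at this position."""
    return words[index]["start"]


def highlight_span(words: List[Word], first: int, last: int) -> Tuple[int, int]:
    """Start of the first highlighted word and end of the last one."""
    if first > last:  # the user may drag the selection backwards
        first, last = last, first
    return words[first]["start"], words[last]["end"]
```

Both operations only need the word's position and its two timestamps, which is why each word has to exist as its own object in some form.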

Ah. That’s trickier. So you will probably need some representation of each word as an object.

What’s your concern with storing the individual words? Performance degradation? If you wanted to simply store the words in the same format as they come back from the API, but you’re worried about ending up with loads of transcript data, you could offload storage to another service (AWS/Firebase/GCP, etc.), and pull in a user’s data as needed.

Alternatively, you could try using a set of “List of items” for each transcript. Something like this

  • words_list = [“The”, “lazy”, “brown”, “dog”, …]
  • start_times = [0:00, 0:01, 0:03, …]
  • end_times = [0:01, 0:03, 0:04, …]

Then you’d load and link the data as needed on the UI side. This may be a bit of an improvement on the DB side, but probably not on the UI side.
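A sketch of how the API's word objects could be split into those three lists before saving, in Python for illustration only. The input shape (`text`, `start`, `end`, with times in seconds) and the helper `to_parallel_lists` are assumptions.

```python
from typing import Dict, List


def to_parallel_lists(words: List[dict]) -> Dict[str, list]:
    """Split word objects into three lists that share the same ordering."""
    return {
        "words_list": [w["text"] for w in words],
        "start_times": [w["start"] for w in words],
        "end_times": [w["end"] for w in words],
    }


# Hypothetical API words, with times in seconds.
api_words = [
    {"text": "The", "start": 0, "end": 1},
    {"text": "lazy", "start": 1, "end": 3},
    {"text": "brown", "start": 3, "end": 4},
]

lists = to_parallel_lists(api_words)
```

This stores three fields per transcript instead of one database entry per word, at the cost of keeping the lists in the same order whenever a word is inserted or removed.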

Interesting problem…


Like @launchable suggested, I would just start with a Thing per word. I don’t think the database will get slow unless you try querying all the words at once (which you will never have to do).

Here’s how I would structure it

  • Transcript → has a list of Word things
  • Change your API call to be an Action instead of Data
  • On your transcript page or whatever, add a button that calls that API and downloads the words to the bubble database - display a loader in the meantime (maybe the api returns the total word count? can you use that?)
  • Display your words as you have them and simply autobind them

Maybe you could add a ‘finalize’ or ‘finish’ transcript feature, or automatically remove unused transcripts & their words after 30 days.
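The cleanup rule could look roughly like the Python sketch below. It assumes that "unused" means "not finalized", which is only one possible reading; the `Transcript` fields and the helper `transcripts_to_remove` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Transcript:
    name: str
    created: datetime
    finalized: bool = False  # assumption: set when the user finishes editing


def transcripts_to_remove(
    transcripts: List[Transcript], now: datetime, max_age_days: int = 30
) -> List[Transcript]:
    """Pick transcripts that were never finalized and are older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [t for t in transcripts if not t.finalized and t.created < cutoff]


now = datetime(2024, 3, 1)
transcripts = [
    Transcript("old draft", datetime(2024, 1, 1)),
    Transcript("old but finalized", datetime(2024, 1, 1), finalized=True),
    Transcript("recent draft", datetime(2024, 2, 20)),
]
stale = transcripts_to_remove(transcripts, now)
```

In Bubble this would more likely be a scheduled backend workflow with the same two conditions; deleting a transcript would also delete its words.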

Remember, as long as you keep the original audio you can always re-run the transcript.

If that already works, don’t bother with dealing with JSON as that will always be different from handling stuff directly from bubble.


+1 for @Kayami’s suggestions to use Action instead of Data, and Transcript Words


Thank you very much for your ideas @launchable and @Kayami. Some follow up questions…

@launchable, by doing it this way (with separate lists of items), is there inherently a relationship built between, to use your example, the word “lazy” (words_list) and 0:01 (start_times) and 0:03 (end_times)? I haven’t used this setup before, and I guess I’m wondering: if I go in this direction, will Bubble know that the start time for a given word / cell is x?

@Kayami, why change the API call to be an Action instead of Data? Just so I can use a workflow instead?

Would either of you be able to walk me through how to create and achieve this?

Thanks again :pray:

I might look at storing it as JSON in a bubble text field and then having an API Connector call on your own database … Then Bubble can parse it for you when you need it.


@alex.pethick Yeah, so that you can use it in a workflow, since you will now have to ‘download’ all the data. But I think you can also use Get data from an external API and just keep it as Data; that way you can also access its individual types in pieces (needed later).

Here are the rough steps

  • In your own UI, you have a way to create a Transcript, right? Either use that workflow, or add an empty state + a button (usually nicer imo because it makes it clear what the app is doing).
  • Then in the workflow, run the Get Transcript from Assembly.ai using the Get data from external api
  • In the next action step, do Schedule a workflow on a list and pass in the Assembly.ai's words, you should be able to set the type of things to Assembly ai word or something similar because you kept the call as Data.
  • Now just create a workflow that runs Create a new thing with your own Word type

Create that data type in your db like this

Word

Text → text
Confidence → number
Start → number
End → number

This will create 1 Word in your db per word in the Assembly.ai's words array.
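Outside Bubble, the "one Word per entry in the words array" step amounts to the mapping sketched below in Python. The `Word` fields mirror the data type above; the input keys (`text`, `confidence`, `start`, `end`), the sample values and the helper `create_words` are assumptions about the API response, not confirmed by the thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    confidence: float
    start: int  # assumed milliseconds
    end: int


def create_words(api_words: List[dict]) -> List[Word]:
    """Create one Word per entry, like the scheduled workflow run on the list."""
    return [
        Word(
            text=w["text"],
            confidence=w["confidence"],
            start=w["start"],
            end=w["end"],
        )
        for w in api_words
    ]


# Hypothetical slice of the API's words array.
api_words = [
    {"text": "The", "confidence": 0.99, "start": 0, "end": 400},
    {"text": "lazy", "confidence": 0.87, "start": 400, "end": 900},
]
db_words = create_words(api_words)
```

The resulting list is what the Transcript's list of Word things would hold, in the same order as the API returned them.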

Hope this makes sense, kind of tricky to explain concisely.


Not exactly, but you can use the index (i.e., the position of the item in the list, 1st, 2nd, 3rd etc.) to “group” them. This would be helpful if you wanted to use JavaScript (via Toolbox) or ListShifter (from @keith).
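Grouping by index could look like this Python sketch, which reuses the example values from earlier in the thread (with times written as seconds). The helper `link_by_index` is hypothetical. Note that Python positions start at 0, whereas Bubble's list items start at 1.

```python
from typing import Dict, List, Union


def link_by_index(
    words_list: List[str], start_times: List[int], end_times: List[int]
) -> List[Dict[str, Union[str, int]]]:
    """Recombine three parallel lists into one record per word, by position."""
    if not (len(words_list) == len(start_times) == len(end_times)):
        raise ValueError("all three lists must have the same length")
    return [
        {"text": text, "start": start, "end": end}
        for text, start, end in zip(words_list, start_times, end_times)
    ]


linked = link_by_index(["The", "lazy", "brown"], [0, 1, 3], [1, 3, 4])
```

The length check matters because nothing else ties the lists together: if one list loses an item, every later word would silently pick up the wrong timing.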


Thanks @Kayami & @launchable - I’m going to try both approaches this weekend and see how I get on :slight_smile:

@NigelG also suggested the following. Has anyone had experience doing this? It also seems like a solid solution from what I understand, but I don’t know how to approach this last bit with manipulating the data and storing it back as a text field.

@alex.pethick Greetings to you and to everyone else here,

I have a partial approach for transcribing speech to text word by word, and maybe together with you and everyone here we can solve the issue we are both running into.

- Subscribe to AssemblyAI to get the token key.
- Bubble API (I have an issue initializing the call: “undefined header name”).
- Install the Speech to Text plugin and enter the authorization key.
- Create a database listing with the file URL and the transcribed text (to show/edit), and save.
- In the AssemblyAI workflow, add these actions:
1) Convert speech to text (audio URL from the DB, Bubble webhook)
2) Get transcribed speech, using the path Result of step 1’s id
3) Make changes to the listing’s transcribed text, using the path Result of step 2’s text

NOTE: You will run into an error (HTTP 400).

If you run the process with a manually entered URL and ID in the workflow, it works perfectly, which is why I am still trying to solve it. Now I need to initialize the API call and try again to stop the error.

Good luck, and I am open to ideas for a possible, simple way; otherwise I am going to move to an AWS service.
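The three workflow steps above can be sketched generically in Python as below. This is not the plugin's or AssemblyAI's actual interface: `submit` and `fetch` are hypothetical stand-ins for the two API calls, the `status` values are assumed, and the sketch polls for the result instead of using the webhook mentioned in step 1. Only the `id` and `text` fields come from the steps above.

```python
import time
from typing import Callable, Dict


def transcribe(
    audio_url: str,
    submit: Callable[[str], Dict],
    fetch: Callable[[str], Dict],
    poll_seconds: float = 0.0,
    max_polls: int = 10,
) -> str:
    """Submit audio, then look up the result using the id from the first call."""
    job = submit(audio_url)                # step 1: convert speech to text
    for _ in range(max_polls):
        result = fetch(job["id"])          # step 2: id taken from step 1's result
        if result.get("status") == "completed":
            return result["text"]          # step 3: text to save on the listing
        if result.get("status") == "error":
            raise RuntimeError("transcription failed")
        time.sleep(poll_seconds)
    raise TimeoutError("transcript was not ready in time")


# Fake calls so the sketch runs without a network or an API key.
def fake_submit(audio_url: str) -> Dict:
    return {"id": "abc123"}


_responses = iter([
    {"status": "processing"},
    {"status": "completed", "text": "hello world"},
])


def fake_fetch(transcript_id: str) -> Dict:
    return next(_responses)


text = transcribe("https://example.com/audio.mp3", fake_submit, fake_fetch)
```

The detail worth checking is the hand-off between steps 1 and 2: the second call depends on the `id` returned by the first, which is the value that reportedly works when entered manually.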