Most scalable, and performant way to structure my database

Hello bubblers,

First time poster here with limited experience in web development and databases, seeking guidance on how to best structure my bubble database for this use-case.

I’m building an application that has:

  1. A video of people speaking
  2. Text transcription of the people speaking in that video

I want the app to do things like highlighting the part of the transcript that is currently being played. So I’ve deployed an API that splits the transcript into sentences after it is transcribed to text.

So for each sentence in the transcription, here is the data I have:

  1. Text for that sentence
  2. Timestamp of when the sentence begins
  3. Timestamp of when the sentence ends

Here are a few ideas I have on how I can store this data:

  1. Store each sentence as an independent row, with the columns being [text, begin, end, nth-sentence], with the ‘nth-sentence’ column being the column that keeps track of the order of sentences.
  2. Store the whole transcript in one row, with the texts, begin timestamp, and end timestamp stored as a list, each with their own column. This is slightly inconvenient, as I am currently using the data APIs to POST the sentences into bubble, and there is a limit on the length of the list (I have too many sentences in the video for bubble to let me POST). But perhaps I can try figure out a workaround or maybe I just try returning it VIA the API and parse the response.
  3. Store the whole transcript with the sentence data in a single row and column, in text format (not list of texts). Then, use the toolbox plugin or other methods (please let me know if there are more efficient methods), to separate the data on run time. e.g. “[{text: …, begin: …, end: …}, so on…]”

I am quite inexperienced in this domain, so I’m sure there could be a much better solution for this. - Any feedback, advice, or ideas would be greatly appreciated, even if it is not directly solving the problem!

Best,
Claude

A row with this data would be good. Then also a Video field linking back to the video it’s related to.

You can always load a video on page, create some custom state with a list for all the transcriptions related to that video. Then point to the custom state list:filtered by timestamp so you’re not doing a search for new text every 2 seconds.

1 Like