OpenAI Cascading (FrugalGPT) in Bubble

Hey everyone,
I am currently developing a tool that uses the OpenAI API to make requests. I am trying to reduce costs by using cheaper models while maintaining accuracy, and have come across the cascading (FrugalGPT) approach, which can reportedly reduce costs by up to 60% or so.
As I understand it, this basically works by sending each query to a cheap model first and only escalating to higher-cost models when the cheap one can't handle it.
I am not sure how best to implement this in Bubble - does anyone have ideas on how to do it efficiently? I thought about just adding a confidence score metric to my GPT request and then applying a rule like: if confidence is lower than 3, go to the higher model.
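For what it's worth, here is a minimal sketch of that threshold logic in Python (outside Bubble, but the same branching maps onto a Bubble workflow with two API Connector calls). The model names, the 1-5 confidence scale, and the `ask` function are all assumptions for illustration - a real `ask` would make the OpenAI API call and parse the self-reported confidence out of the response:

```python
# Cascade sketch: try a cheap model first, escalate only when its
# self-reported confidence falls below a threshold.

MODEL_TIERS = ["gpt-4o-mini", "gpt-4o"]  # cheapest first (example names)
CONFIDENCE_THRESHOLD = 3                 # on a hypothetical 1-5 scale

def ask(model: str, query: str) -> tuple[str, int]:
    """Stand-in for a real API call. A real version would append an
    instruction like 'Also rate your confidence from 1 (guessing) to
    5 (certain)' and parse the score from the model's response."""
    # Canned behaviour for demonstration only.
    if model == "gpt-4o-mini":
        return ("cheap answer", 2)   # low confidence -> escalate
    return ("expensive answer", 5)

def cascade(query: str) -> tuple[str, str]:
    """Return (answer, model_used), walking up MODEL_TIERS."""
    for model in MODEL_TIERS:
        answer, confidence = ask(model, query)
        if confidence >= CONFIDENCE_THRESHOLD:
            return answer, model
    # Every tier was unsure; fall back to the strongest model's answer.
    return answer, model

answer, model_used = cascade("some query")
```

In Bubble terms this would be a backend workflow: one API call to the cheap model, an "Only when" condition on the parsed confidence score, and a second conditional step that calls the bigger model.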
One problem might be the total time to answer a prompt, since multiple API calls have to complete in sequence, increasing response time (bad for user happiness :frowning: ) and also the WU consumed, potentially making it less efficient. Another problem is false confidence: cheap models can be overconfident, which makes it hard to map the self-reported confidence score to actual accuracy.
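One way to sanity-check the false-confidence worry before committing to a threshold: run the cheap model over a small labelled test set and bucket its accuracy by self-reported confidence. A rough Python sketch (the `records` data here is made up for illustration):

```python
from collections import defaultdict

# Hypothetical evaluation records: (self-reported confidence 1-5,
# was the cheap model's answer actually correct?). In practice you'd
# collect these by running the cheap model on queries with known answers.
records = [(5, True), (5, True), (4, True), (4, False),
           (3, False), (3, True), (2, False), (1, False)]

buckets = defaultdict(lambda: [0, 0])  # confidence -> [correct, total]
for confidence, correct in records:
    buckets[confidence][1] += 1
    if correct:
        buckets[confidence][0] += 1

# Accuracy per confidence level; pick the escalation threshold where
# accuracy drops below what your app can tolerate.
for conf in sorted(buckets):
    correct, total = buckets[conf]
    print(f"confidence {conf}: {correct}/{total} correct")
```

If accuracy barely changes across confidence levels, the score isn't calibrated and a threshold on it won't route queries well, no matter where you set it.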

Any thoughts or advice on this would be greatly appreciated.