Which part are you still having problems with, the authentication? I may be able to help you because I was able to get it working.
As far as the file format goes, you're correct that the API allows you to specify other file formats, but more work needs to be done in order to actually play the API response. The audio content comes back base64 encoded because it is a JSON response and JSON does not support binary data (all of those file formats are binary). I assumed this meant I was going to have to decode the data to binary, save it to an mp3 file, and feed the mp3 file to Bubble's audio player, but I actually found a much simpler method.
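The simpler method is to put a short prefix (the media type plus the word base64) in front of the encoded string. The result is called a data URI, and it will save successfully to the database and work as an audio source. The implementation of this for Google Text to Speech could look like this: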
data:audio/mp3;base64, Google Text to Speech API Call's - audio content
Where “Google Text to Speech API Call’s - audio content” is a Bubble API call with your parameters returning the base64 encoded response.
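In case it helps anyone building the same string outside of Bubble, here is a rough Python sketch of the idea. It is not what Bubble runs internally, just an illustration under a few assumptions: the function name `build_audio_data_uri` is made up for this example, the JSON field is assumed to be `audioContent`, and the `audio/mp3` media type is simply copied from the expression above.

```python
import base64
import json


def build_audio_data_uri(response_body: str, mime_type: str = "audio/mp3") -> str:
    """Build a data URI from a Text to Speech JSON response body.

    Assumes the base64 audio sits in a top-level "audioContent" field.
    """
    payload = json.loads(response_body)
    audio_b64 = payload["audioContent"]

    # Sanity check: raises binascii.Error if the field is not valid base64.
    base64.b64decode(audio_b64, validate=True)

    # No decoding to a file is needed; the encoded text is used as-is
    # after the "data:<media type>;base64," prefix.
    return f"data:{mime_type};base64,{audio_b64}"


# Example with stand-in bytes instead of a real API response.
fake_response = json.dumps(
    {"audioContent": base64.b64encode(b"not real audio").decode("ascii")}
)
audio_source = build_audio_data_uri(fake_response)
```

The returned string can be used anywhere an audio URL is expected, which is why there is no need to write an mp3 file first.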