Hi,
I’ve just released a new plugin, Avaturn Live. I’m pretty excited about this one because I’ve been waiting a while for it, and it’ll end up in my own site soon.
The tech is pretty cool: Avaturn have finally got photorealistic generative video avatars up and running, and the plugin brings them into Bubble. Instead of chatting with AI through normal text inputs, you can have natural voice conversations with an avatar.
It uses AVTR-1, a brand new neural avatar model which Avaturn publicly released just a few days ago. The setup combines OpenAI’s Realtime speech-to-speech API with AVTR-1 to create live, photorealistic digital humans that listen, react, and respond in real time. Every frame is generated live at low latency, and the avatar shows natural active-listening behaviours and conversational reactions too.
Demo and Instructions:
https://demo-plugins-13689.bubbleapps.io/version-test/avaturn-live
Plugin:
https://bubble.io/plugin/avaturn-live-1779881348922x707660614696960000
I’ve set up the demo page so that you can use your own API keys. If you’d rather use mine for testing, that’s fine, but you’ll have a limited number of sessions and each chat is capped at 60 seconds, so hopefully everyone can have a play with it. I think I’m more excited to have others try it than to advertise the plugin, but both would be a bonus!
Features
Core experience
- Real-time, voice-to-voice conversation: users speak out loud and the avatar listens, thinks, and replies in a natural voice.
- Photorealistic avatar streamed live into the browser, lip-synced to the spoken reply.
- Powered by the OpenAI Realtime API for low-latency, natural back-and-forth.
The avatar and its persona
- Configurable personality through a plain-English instructions field, with support for OpenAI stored prompts.
- Choice of avatar face.
- A dynamic voice field with a selection of voices, settable per user or per session.
- A language setting that anchors the spoken language from the very first word, so the avatar never opens in the wrong one.
- Transparent-background option, to composite the avatar over your own page design.
- Adjustable voice-activity detection, so you control how eagerly the avatar takes its turn.
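To give a feel for what "how eagerly the avatar takes its turn" means, here is a minimal Python sketch of end-of-turn detection. It is a toy illustration only, not the plugin's or OpenAI's implementation: the `TurnDetector` class, its field names, and the loudness-per-frame input are all assumptions made up for this example. The idea is that a shorter silence window makes the avatar jump in sooner, while a longer one lets the user pause without being interrupted.

```python
from dataclasses import dataclass


@dataclass
class TurnDetector:
    """Toy end-of-turn detector (hypothetical, for illustration only)."""
    threshold: float = 0.5   # loudness (0..1) at or above which a frame counts as speech
    silence_ms: int = 500    # trailing silence needed before the turn is considered over
    frame_ms: int = 20       # duration of each audio frame

    def end_of_turn_ms(self, levels):
        """Return the time in ms at which the user's turn ends, or None if it never does."""
        heard_speech = False
        quiet_ms = 0
        for i, level in enumerate(levels):
            if level >= self.threshold:
                heard_speech = True
                quiet_ms = 0          # speech resets the silence counter
            elif heard_speech:
                quiet_ms += self.frame_ms
                if quiet_ms >= self.silence_ms:
                    return (i + 1) * self.frame_ms
        return None


# 200 ms of speech followed by 1 s of silence, in 20 ms frames.
levels = [0.8] * 10 + [0.1] * 50
eager = TurnDetector(silence_ms=200)
patient = TurnDetector(silence_ms=800)
print(eager.end_of_turn_ms(levels), patient.end_of_turn_ms(levels))
```

With these made-up numbers the eager detector hands over the turn well before the patient one, which is the trade-off the setting controls.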
Audio and devices
- Microphone controls: mute, unmute, and toggle.
- Live device pickers for microphone and speaker, delivered as typed lists that drop straight into a repeating group.
- Option to auto-populate the device lists, with no manual refresh needed.
Security
- Session tokens are created server-side, so no long-lived API key ever reaches the browser.
- Optional per-call key override fields, for apps where each user supplies their own keys.
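The general pattern behind server-side session tokens can be sketched in Python as below. This is a generic illustration under stated assumptions, not the plugin's code: `FakeAvatarProvider` is a hypothetical stand-in and does not reflect Avaturn's real API, and `handle_create_session`, the field names, and the 60-second lifetime are invented for the example. The point is only that the long-lived key is used on the server and the browser receives nothing but a short-lived token.

```python
import secrets
import time


class FakeAvatarProvider:
    """Hypothetical stand-in for the avatar service; NOT Avaturn's real API."""

    def __init__(self, valid_api_key):
        self._valid_api_key = valid_api_key

    def create_session(self, api_key, ttl_s=60):
        if api_key != self._valid_api_key:
            raise PermissionError("invalid API key")
        return {
            "session_token": secrets.token_urlsafe(16),
            "expires_at": time.time() + ttl_s,
        }


def handle_create_session(provider, server_api_key, user_api_key=None):
    """Server-side handler: the API key is used here and never returned."""
    # Optional per-call override, for apps where each user supplies a key.
    key = user_api_key or server_api_key
    session = provider.create_session(key)
    # Only the short-lived token and its expiry go back to the browser.
    return {
        "session_token": session["session_token"],
        "expires_at": session["expires_at"],
    }


provider = FakeAvatarProvider("secret!key")
response = handle_create_session(provider, "secret!key")
print(sorted(response))
```

If the token leaks from the browser it expires on its own, whereas a leaked long-lived key would have to be rotated.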
Session control and limits
- Built-in session duration timer as a state, ideal for time-limited or metered sessions.
- Per-visitor connect limit backed by a cookie, with a configurable reset window, to cap how many times someone can start a session.
- Configurable maximum session duration and idle timeout.
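As a rough idea of how a cookie-backed connect limit with a reset window can work, here is a minimal Python sketch. The function name `register_connect`, the `count:window_start` cookie format, and the default values are assumptions for illustration; the plugin may store and check this differently.

```python
def register_connect(cookie, now, max_connects=3, reset_window_s=86400):
    """Return (allowed, new_cookie) for one connect attempt.

    The 'count:window_start' cookie format is an assumption for this sketch.
    """
    try:
        count_s, start_s = (cookie or "").split(":")
        count, start = int(count_s), float(start_s)
    except ValueError:
        count, start = 0, now   # missing or malformed cookie: start fresh
    if now - start >= reset_window_s:
        count, start = 0, now   # reset window has elapsed: clear the counter
    if count >= max_connects:
        return False, f"{count}:{start}"
    return True, f"{count + 1}:{start}"


# Two connects allowed per 100-second window (made-up numbers).
cookie = None
for t in (0, 10, 20, 100):
    allowed, cookie = register_connect(cookie, now=t, max_connects=2, reset_window_s=100)
    print(t, allowed)
```

Because the counter lives in a cookie, a visitor who clears their cookies starts again from zero, so this kind of limit works as a soft cap rather than a hard guarantee.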
Building your interface
- A full set of states and events for status and chat-style UIs, with no JSON parsing needed.
- Events fire when the avatar starts and stops speaking, when the mic is muted, when a session goes idle, when devices refresh, and more.
- Debug mode that logs the session lifecycle to the console while you’re setting up.
Legacy text-echo mode
- App-driven speech: send the avatar lines to speak for scripted greetings, announcements, or guided flows.
- Actions to make the avatar speak text, cancel speech mid-sentence, and switch voice.
Actions (ten in total)
- Server-side: Create Avatar Session, Terminate Avatar Session.
- Live avatar: Connect, Disconnect.
- Audio: Set Microphone, Refresh Devices, Set Device.
- Text-echo: Speak Text, Cancel Speech, Change Voice.
I’ll be supporting the plugin and actively watching for updates from Avaturn, so if you have any questions or run into any issues, just give me a shout.
I hope it is useful.
Paul

