•   3 months ago

Can we use the stable gemini-live-2.5-flash-native-audio model on Vertex AI?

Hi everyone,

I’m currently building a real-time voice agent using the Agent Development Kit (google.adk.agents), which builds on the Google GenAI SDK.

I started development with the default preview model via AI Studio (gemini-2.5-flash-native-audio-preview-12-2025). The core integration works great, but because it is an experimental preview model, I sometimes hit server-side WebSocket disconnects mid-conversation (specifically close codes 1008, Policy Violation, and 1011, Internal Error).

To mitigate these API connection drops, I built an auto-reconnect system. However, this causes a context loss issue. When the WebSocket reconnects, it establishes a brand-new session with the Gemini Live servers, and the AI agent forgets the conversational history of the previous interactions.
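For context, here is a rough sketch of the reconnect pattern, with a locally kept transcript passed to every new session so it can be replayed. It is not my actual code and it does not call the real SDK: `SessionClosed`, `open_session`, and `run_with_reconnect` are hypothetical names, and how the history is fed back into a new Live session is left to the caller.

```python
import asyncio
from typing import Awaitable, Callable, List

# Close codes mentioned above: 1008 (Policy Violation), 1011 (Internal Error).
RETRYABLE_CLOSE_CODES = {1008, 1011}


class SessionClosed(Exception):
    """Hypothetical error raised when the server closes the WebSocket."""

    def __init__(self, code: int):
        super().__init__(f"session closed with code {code}")
        self.code = code


async def run_with_reconnect(
    open_session: Callable[[List[str]], Awaitable[None]],
    history: List[str],
    max_attempts: int = 5,
    base_delay: float = 0.5,
) -> int:
    """Call open_session(history) until it finishes without a retryable close.

    open_session is a hypothetical coroutine that opens a live session,
    seeds it with the turns already in `history`, and appends new turns to
    `history` as they happen. Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            await open_session(history)
            return attempt
        except SessionClosed as exc:
            # Give up on non-retryable codes or when attempts are exhausted.
            if exc.code not in RETRYABLE_CLOSE_CODES or attempt == max_attempts:
                raise
            # Exponential backoff before opening a brand-new session.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

Because the transcript lives outside the session, a new connection at least starts from the previous turns. Replaying text history is only an approximation for a native-audio conversation, so the disconnects themselves are still the thing I want to avoid.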

Because of this context loss, I would like to switch my backend to use the General Availability (GA) enterprise version of the model hosted on Google Cloud Vertex AI to ensure stability: gemini-live-2.5-flash-native-audio
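Concretely, the switch I have in mind is mostly configuration. The sketch below assumes the environment variables that, as far as I understand, the GenAI SDK and ADK read to choose a backend (`GOOGLE_GENAI_USE_VERTEXAI`, `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION`, `GOOGLE_API_KEY`); please check the current docs before relying on them. `backend_env` and `MODEL_BY_BACKEND` are hypothetical helpers, and the project and location values are placeholders.

```python
import os
from typing import Dict, Optional

# Model IDs taken from this post; one per backend.
MODEL_BY_BACKEND = {
    "ai_studio": "gemini-2.5-flash-native-audio-preview-12-2025",
    "vertex": "gemini-live-2.5-flash-native-audio",
}


def backend_env(
    use_vertex: bool,
    project: Optional[str] = None,
    location: Optional[str] = None,
    api_key: Optional[str] = None,
) -> Dict[str, str]:
    """Build the environment variables for the chosen backend.

    Assumption: the SDK picks its backend from these variables.
    """
    if use_vertex:
        if not project or not location:
            raise ValueError("Vertex AI needs a project and a location")
        return {
            "GOOGLE_GENAI_USE_VERTEXAI": "TRUE",
            "GOOGLE_CLOUD_PROJECT": project,
            "GOOGLE_CLOUD_LOCATION": location,
        }
    if not api_key:
        raise ValueError("AI Studio needs an API key")
    return {"GOOGLE_GENAI_USE_VERTEXAI": "FALSE", "GOOGLE_API_KEY": api_key}


if __name__ == "__main__":
    # Placeholder values; set these before the client or agent is created.
    os.environ.update(backend_env(True, "my-project", "us-central1"))
    print(MODEL_BY_BACKEND["vertex"])
```

The agent code itself would stay the same; only the model ID and these variables change.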

My Question: Am I allowed to use the GA stable gemini-live-2.5-flash-native-audio model via Vertex AI for this challenge? Or does the challenge strictly require me to use the experimental models via AI Studio?

Thanks!

  • 1 comment

  • Manager   •   3 months ago

    Hi there,

    Thanks for reaching out. The Rules say "Entrants must develop a NEW next-generation AI Agent that utilizes multimodal inputs and outputs. The project must move beyond simple text-in/text-out interactions. It should leverage Google’s Live API with the creative power of video/image generation to solve complex problems or create entirely new user experiences. All projects must leverage a Gemini model, agents must be built using Google GenAI SDK or Agent Development Kit, and projects must use at least one Google Cloud service." There is no strict requirement to use a specific model, so you can make this change for more stability. Good luck!
