OpenAI today announced that its GPT-Realtime voice model is now available with support for image input, marking the Realtime API’s move from beta to general availability for production voice agents. The release positions GPT-Realtime as a low-latency speech-to-speech model that can hold two-way voice conversations while grounding its responses in images supplied during a session. OpenAI describes gpt-realtime […]