
GPT-4o mini Realtime Preview

Input: $60/M
Output: $240/M
GPT-4o mini Realtime Preview is a real-time multimodal model for interactive voice and visual experiences. It handles speech, text, and images with streaming input and output, plus tool/function calling for grounded actions. Typical uses include voice assistants, live call handling, real-time captioning, and visual question answering based on camera or screen content. Technical highlights include bidirectional audio, visual understanding, streamed responses, and structured output via functions.
Commercial use

Technical Specifications of gpt-4o-mini-realtime-preview

Specification: Details
Model ID: gpt-4o-mini-realtime-preview
Provider: OpenAI via CometAPI
Modalities: Text, audio, image
Input types: Streaming audio, text messages, image inputs
Output types: Streaming text, synthesized/streamed audio, structured function calls
Core strengths: Low-latency interaction, multimodal understanding, real-time conversation, tool use
Best for: Voice assistants, live support calls, captioning, visual Q&A, interactive agents
Function calling: Supported
Streaming: Supported
Realtime sessions: Supported
Typical interaction pattern: Continuous bidirectional session with incremental input and output

What is gpt-4o-mini-realtime-preview?

gpt-4o-mini-realtime-preview is a real-time multimodal model designed for fast, interactive experiences where users speak, type, or share visual input and expect immediate responses. It is well suited for applications that need live back-and-forth communication rather than standard single-turn request/response workflows.

The model can process speech, text, and images within the same experience, making it useful for assistants that listen to a caller, inspect on-screen or camera content, and respond in natural language or audio. Because it supports streaming input and output, developers can build systems that feel responsive during ongoing interactions instead of waiting for a full completion.

It also supports tool or function calling, which allows the model to trigger structured actions such as looking up data, calling backend services, or executing workflow steps. This makes gpt-4o-mini-realtime-preview a strong choice for grounded, action-oriented agents in customer support, operations, productivity, and multimodal assistant scenarios.
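
As a minimal sketch of how tool calling might be wired into a session, the snippet below builds the events involved. It assumes CometAPI relays the OpenAI Realtime beta event shapes unchanged (a `tools` array in `session.update`, and a `function_call_output` item sent back with `conversation.item.create`). The `get_order_status` tool and the helper names `buildToolSessionUpdate` and `handleFunctionCall` are hypothetical and exist only for illustration.

```javascript
// Hypothetical tool definition; the name and parameters are illustrative only.
const orderStatusTool = {
  type: "function",
  name: "get_order_status",
  description: "Look up the status of an order by its ID.",
  parameters: {
    type: "object",
    properties: { order_id: { type: "string" } },
    required: ["order_id"],
  },
};

// Build the session.update event that registers tools for the session.
function buildToolSessionUpdate(tools) {
  return { type: "session.update", session: { tools, tool_choice: "auto" } };
}

// Given a completed function_call item from the server, run the matching
// local handler and return the events to send back: the tool result,
// followed by a request for the model to continue responding.
function handleFunctionCall(item, handlers) {
  const handler = handlers[item.name];
  if (!handler) {
    throw new Error(`No handler registered for tool: ${item.name}`);
  }
  const args = JSON.parse(item.arguments || "{}");
  const output = handler(args);
  return [
    {
      type: "conversation.item.create",
      item: {
        type: "function_call_output",
        call_id: item.call_id,
        output: JSON.stringify(output),
      },
    },
    { type: "response.create" },
  ];
}
```

In a message handler, one plausible trigger is an event of type `response.output_item.done` whose `item.type` is `"function_call"`: pass `event.item` to `handleFunctionCall` and send each returned event with `ws.send(JSON.stringify(event))`.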

Main features of gpt-4o-mini-realtime-preview

  • Real-time multimodal interaction: Accepts and responds across speech, text, and images for fluid live experiences.
  • Bidirectional audio: Supports conversational voice interfaces where audio can be streamed in and responses can be streamed back out.
  • Streaming responses: Delivers partial outputs incrementally, reducing perceived latency and improving responsiveness.
  • Vision understanding: Interprets visual inputs such as camera frames, screenshots, or other images during a live session.
  • Function and tool calling: Produces structured calls that let your application connect the model to business logic, databases, or external tools.
  • Interactive agent behavior: Works well for assistants that must maintain turn-by-turn context during active sessions.
  • Live call handling: Useful for phone or web-call scenarios involving fast speech understanding and immediate replies.
  • Real-time captioning and transcription workflows: Can support experiences that convert ongoing speech into usable text in near real time.
  • Structured outputs for actions: Helps applications turn conversational intent into reliable machine-readable instructions.
  • Low-latency user experiences: Optimized for scenarios where responsiveness matters, such as support, coaching, monitoring, and guided workflows.
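
To illustrate the bidirectional audio feature, here is a rough sketch of how captured audio could be split into upload events. It assumes the OpenAI Realtime beta convention of base64-encoded audio in `input_audio_buffer.append` events and raw PCM16 input held in a Node.js `Buffer`. The helper name `buildAudioAppendEvents` and the 8192-byte chunk size are illustrative choices, not documented limits.

```javascript
// Split a Node.js Buffer of raw PCM16 audio into input_audio_buffer.append
// events. chunkBytes is an arbitrary illustrative size.
function buildAudioAppendEvents(pcmBuffer, chunkBytes = 8192) {
  const events = [];
  for (let offset = 0; offset < pcmBuffer.length; offset += chunkBytes) {
    // subarray returns a view on the same memory, so no audio is copied here.
    const chunk = pcmBuffer.subarray(offset, offset + chunkBytes);
    events.push({
      type: "input_audio_buffer.append",
      audio: chunk.toString("base64"),
    });
  }
  return events;
}
```

Each event would be sent with `ws.send(JSON.stringify(event))` as audio arrives. If server-side voice activity detection is turned off for the session, the client is expected to mark the end of the turn itself, for example with an `input_audio_buffer.commit` event followed by `response.create`.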

How to access and integrate gpt-4o-mini-realtime-preview

Step 1: Sign Up for an API Key

First, create an account on CometAPI and generate your API key from the dashboard. This key is required to authenticate every request. Store it securely and avoid exposing it in client-side code or public repositories.

Step 2: Connect to the gpt-4o-mini-realtime-preview API

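The Realtime API uses WebSocket connections. Connect to CometAPI's WebSocket endpoint from a server-side Node.js process. The example uses the ws package, because the browser WebSocket constructor does not accept custom headers:

// Requires the "ws" package: npm install ws
const WebSocket = require("ws");

const ws = new WebSocket(
  "wss://api.cometapi.com/v1/realtime?model=gpt-4o-mini-realtime-preview",
  {
    headers: {
      Authorization: "Bearer " + process.env.COMETAPI_API_KEY,
      "OpenAI-Beta": "realtime=v1",
    },
  }
);

// Configure the session once the connection is open.
ws.on("open", () => {
  ws.send(JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      instructions: "You are a helpful assistant.",
    },
  }));
});

// Each message is a JSON-encoded server event.
ws.on("message", (data) => {
  console.log(JSON.parse(data.toString()));
});

// Surface connection and authentication failures.
ws.on("error", (err) => {
  console.error("WebSocket error:", err.message);
});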
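
Once the socket is open, a text turn can be sent over it. The sketch below builds the two events involved, assuming the OpenAI Realtime beta shapes (`conversation.item.create` with an `input_text` part, then `response.create`). The helper name `buildTextTurn` is hypothetical.

```javascript
// Build the two events for a text turn: add a user message to the
// conversation, then ask the model to generate a text-only response.
function buildTextTurn(text) {
  return [
    {
      type: "conversation.item.create",
      item: {
        type: "message",
        role: "user",
        content: [{ type: "input_text", text }],
      },
    },
    { type: "response.create", response: { modalities: ["text"] } },
  ];
}
```

A caller would send each event in order, for example `buildTextTurn("Hello").forEach((e) => ws.send(JSON.stringify(e)))`, where `ws` is an open connection like the one above.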

Step 3: Retrieve and Verify Results

The Realtime API streams responses over the same WebSocket connection as JSON-encoded server events. Listen for response.audio.delta events for audio output and response.text.delta events for text. To verify the setup, confirm that a session.created event arrives after connecting and that delta events keep arriving until response.done.
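
One way to check that streaming works is to accumulate the deltas as they arrive. The collector below is a sketch under the assumption that CometAPI relays the OpenAI Realtime beta event shapes unchanged (`delta` holds a text fragment or a base64 audio chunk). `createResponseCollector` is a hypothetical helper name.

```javascript
// Accumulates streamed output and tracks whether the session was
// acknowledged and the response finished.
function createResponseCollector() {
  const state = {
    sessionReady: false,
    text: "",
    audioChunks: [],
    done: false,
    errors: [],
  };

  function handle(event) {
    switch (event.type) {
      case "session.created":
      case "session.updated":
        state.sessionReady = true;
        break;
      case "response.text.delta":
        state.text += event.delta;
        break;
      case "response.audio.delta":
        // Audio deltas are base64 strings; decode them to raw bytes.
        state.audioChunks.push(Buffer.from(event.delta, "base64"));
        break;
      case "response.done":
        state.done = true;
        break;
      case "error":
        state.errors.push(event.error);
        break;
      default:
        break; // Ignore event types this sketch does not track.
    }
    return state;
  }

  return { handle, state };
}
```

In the message handler, each parsed event would be passed to `collector.handle(event)`. After `state.done` becomes true, `state.text` holds the full text and `Buffer.concat(state.audioChunks)` holds the audio bytes.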

Features of GPT-4o mini Realtime Preview

Explore the key features of GPT-4o mini Realtime Preview, designed to improve performance and usability. Learn how these capabilities can benefit your projects and improve the user experience.

Pricing for GPT-4o mini Realtime Preview

Explore competitive pricing for GPT-4o mini Realtime Preview, designed to fit a range of budgets and usage needs. Flexible plans mean you pay only for what you use, which makes it easy to scale as your requirements grow. Learn how GPT-4o mini Realtime Preview can improve your projects while keeping costs manageable.
Comet price (USD / M tokens): Input $60/M, Output $240/M
Official price (USD / M tokens): Input $75/M, Output $300/M
Discount: -20%
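
The listed rates translate into a per-session cost with simple arithmetic. The sketch below uses only the prices shown on this page; the helper names `estimateCostUsd` and `discountPercent` are hypothetical, and actual billing may count tokens differently across text and audio.

```javascript
// Prices in USD per million tokens, copied from the table on this page.
const PRICES_USD_PER_M = {
  comet: { input: 60, output: 240 },
  official: { input: 75, output: 300 },
};

// Cost in USD for a given number of input and output tokens.
function estimateCostUsd(inputTokens, outputTokens, prices) {
  return (inputTokens * prices.input + outputTokens * prices.output) / 1_000_000;
}

// Discount of the Comet price relative to the official price, in percent.
function discountPercent(cometPrice, officialPrice) {
  return (1 - cometPrice / officialPrice) * 100;
}
```

For example, a session with 50,000 input tokens and 10,000 output tokens costs 50,000 × 60 / 1,000,000 + 10,000 × 240 / 1,000,000 = 3.00 + 2.40 = 5.40 USD at the Comet rate, and 60 / 75 = 0.8 confirms the stated 20% discount.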

Sample code and API for GPT-4o mini Realtime Preview

Access complete sample code and API resources for GPT-4o mini Realtime Preview to simplify integration. The detailed documentation provides step-by-step guidance to help you make full use of GPT-4o mini Realtime Preview in your projects.

Versions of GPT-4o mini Realtime Preview

GPT-4o mini Realtime Preview may have multiple snapshots for several reasons: output can change after updates, so earlier snapshots are kept for consistency; developers get a transition period to adapt and migrate; and different snapshots may correspond to global or regional endpoints to optimize the user experience. For detailed differences between versions, refer to the official documentation.
Version
gpt-4o-mini-realtime-preview
gpt-4o-mini-realtime-preview-2024-12-17
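
To select a specific version, the model ID goes in the `model` query parameter of the WebSocket URL. The helper below is a sketch that only accepts the two IDs listed on this page; `buildRealtimeUrl` is a hypothetical name, and the endpoint is the one shown in Step 2.

```javascript
const REALTIME_BASE_URL = "wss://api.cometapi.com/v1/realtime";

// Model IDs listed on this page.
const KNOWN_MODEL_IDS = [
  "gpt-4o-mini-realtime-preview",
  "gpt-4o-mini-realtime-preview-2024-12-17",
];

// Build the WebSocket URL for a given model ID, rejecting unlisted IDs.
function buildRealtimeUrl(modelId) {
  if (!KNOWN_MODEL_IDS.includes(modelId)) {
    throw new Error(`Unknown model ID: ${modelId}`);
  }
  const url = new URL(REALTIME_BASE_URL);
  url.searchParams.set("model", modelId);
  return url.toString();
}
```

Using the dated ID, for example `buildRealtimeUrl("gpt-4o-mini-realtime-preview-2024-12-17")`, is one way to keep behavior stable if the undated ID is later pointed at a different snapshot.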
