ModellerPriserVirksomhed
500+ AI Model API, Alt I Én API. Kun I CometAPI
Modeller API
Udvikler
Hurtig StartDokumentationAPI Dashboard
Virksomhed
Om osVirksomhed
Ressourcer
AI-modellerBlogÆndringslogSupport
ServicevilkårPrivatlivspolitik
© 2026 CometAPI · All rights reserved
Home/Models/OpenAI/gpt-realtime-mini
O

gpt-realtime-mini

Indtast:$0.48/M
Output:$0.96/M
En omkostningseffektiv version af real-time GPT—i stand til at svare på lyd- og tekstinput i realtid via WebRTC-, WebSocket- eller SIP-forbindelser.
Ny
Kommersiel brug
Oversigt
Funktioner
Priser
API
Versioner

Technical Specifications of gpt-realtime-mini

SpecificationDetails
Model IDgpt-realtime-mini
Model typeRealtime multimodal model
DescriptionAn economical version of the real-time GPT—capable of responding to audio and text inputs in realtime via WebRTC, WebSocket, or SIP connections.
Input modalitiesText, audio, image
Output modalitiesText, audio
Context window32,000 tokens
Max output tokens4,096 tokens
Supported interfacesWebRTC, WebSocket, SIP
Supported featuresFunction calling supported; structured outputs, fine-tuning, distillation, and predicted outputs not supported
Recommended useLow-latency voice agents, realtime multimodal applications, and cost-sensitive interactive experiences

What is gpt-realtime-mini?

gpt-realtime-mini is a cost-efficient realtime model designed for applications that need fast, natural interaction with users through live audio and text. It is intended for low-latency multimodal experiences, allowing developers to build assistants that can listen, respond, and stream output in realtime rather than relying on slower multi-step pipelines.

Compared with larger realtime variants, gpt-realtime-mini is positioned as the economical option for developers who want realtime speech and text capabilities while managing cost and maintaining responsive performance. It works across browser, server, and telephony-style connection patterns through WebRTC, WebSocket, and SIP.

Main features of gpt-realtime-mini

  • Realtime audio and text interaction: Supports low-latency conversations with streaming input and output, making it suitable for live assistants, voice bots, and interactive agents.
  • Cost-efficient deployment: Positioned as an economical version of the realtime model family, making it attractive for high-volume or budget-sensitive applications.
  • Multiple connection methods: Can be integrated through WebRTC for browser clients, WebSocket for server-side systems, and SIP for telephony or VoIP scenarios.
  • Multimodal input support: Accepts text, audio, and image input, enabling richer user interactions and more flexible application design.
  • Speech-capable output: Produces both text and audio output, which is useful for conversational interfaces and spoken response systems.
  • Function calling support: Supports function calling, allowing applications to connect the model to tools, workflows, or backend actions during realtime sessions.
  • Built for voice agents: Well suited for speech-to-speech assistants and realtime customer interaction experiences where interruption handling and fast turn-taking matter.

How to access and integrate gpt-realtime-mini

Step 1: Sign Up for API Key

To get started, sign up on CometAPI and generate your API key from the dashboard. Once you have your key, keep it secure and store it in your environment variables for server-side use.

Step 2: Connect to gpt-realtime-mini API

The Realtime API uses WebSocket connections. Connect to CometAPI's WebSocket endpoint:

const ws = new WebSocket(
  "wss://api.cometapi.com/v1/realtime?model=gpt-realtime-mini",
  {
    headers: {
      "Authorization": "Bearer " + process.env.COMETAPI_API_KEY,
      "OpenAI-Beta": "realtime=v1"
    }
  }
);

ws.on("open", () => {
  ws.send(JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      instructions: "You are a helpful assistant."
    }
  }));
});

ws.on("message", (data) => {
  console.log(JSON.parse(data));
});

Step 3: Retrieve and Verify Results

The Realtime API streams responses through the WebSocket connection as server-sent events. Listen for response.audio.delta events for audio output and response.text.delta for text. Verify the session is established and responses are streaming correctly.

Priser for gpt-realtime-mini

Udforsk konkurrencedygtige priser for gpt-realtime-mini, designet til at passe til forskellige budgetter og brugsbehov. Vores fleksible planer sikrer, at du kun betaler for det, du bruger, hvilket gør det nemt at skalere, efterhånden som dine krav vokser. Opdag hvordan gpt-realtime-mini kan forbedre dine projekter, mens omkostningerne holdes håndterbare.
Comet-pris (USD / M Tokens)Officiel Pris (USD / M Tokens)Rabat
Indtast:$0.48/M
Output:$0.96/M
Indtast:$0.6/M
Output:$1.2/M
-20%

Eksempelkode og API til gpt-realtime-mini

Få adgang til omfattende eksempelkode og API-ressourcer for gpt-realtime-mini for at strømline din integrationsproces. Vores detaljerede dokumentation giver trin-for-trin vejledning, der hjælper dig med at udnytte det fulde potentiale af gpt-realtime-mini i dine projekter.

Versioner af gpt-realtime-mini

Årsagen til, at gpt-realtime-mini har flere øjebliksbilleder kan omfatte potentielle faktorer såsom variationer i output efter opdateringer, der kræver ældre øjebliksbilleder for konsistens, at give udviklere en overgangsperiode til tilpasning og migration, og at forskellige øjebliksbilleder svarer til globale eller regionale slutpunkter for at optimere brugeroplevelsen. For detaljerede forskelle mellem versioner, henvises der til den officielle dokumentation.
version
gpt-realtime-mini