
GPT-4o mini Realtime Preview

Input: $60/M
Output: $240/M
GPT-4o mini Realtime Preview is a real-time multimodal model for interactive voice and vision experiences. The model processes speech, text, and images with streaming input and output, and supports tool/function calling to carry out grounded actions. Typical use cases include voice assistants, live call handling, real-time captioning, and question answering over visual content from a camera or screen. Technical highlights include bidirectional audio, vision understanding, streaming responses, and structured output through functions.
Commercial use

Technical Specifications of gpt-4o-mini-realtime-preview

| Specification | Details |
|---|---|
| Model ID | gpt-4o-mini-realtime-preview |
| Provider | OpenAI via CometAPI |
| Modalities | Text, audio, image |
| Input types | Streaming audio, text messages, image inputs |
| Output types | Streaming text, synthesized/streamed audio, structured function calls |
| Core strengths | Low-latency interaction, multimodal understanding, real-time conversation, tool use |
| Best for | Voice assistants, live support calls, captioning, visual Q&A, interactive agents |
| Function calling | Supported |
| Streaming | Supported |
| Realtime sessions | Supported |
| Typical interaction pattern | Continuous bidirectional session with incremental input and output |

What is gpt-4o-mini-realtime-preview?

gpt-4o-mini-realtime-preview is a real-time multimodal model designed for fast, interactive experiences where users speak, type, or share visual input and expect immediate responses. It is well suited for applications that need live back-and-forth communication rather than standard single-turn request/response workflows.

The model can process speech, text, and images within the same experience, making it useful for assistants that listen to a caller, inspect on-screen or camera content, and respond in natural language or audio. Because it supports streaming input and output, developers can build systems that feel responsive during ongoing interactions instead of waiting for a full completion.

It also supports tool or function calling, which allows the model to trigger structured actions such as looking up data, calling backend services, or executing workflow steps. This makes gpt-4o-mini-realtime-preview a strong choice for grounded, action-oriented agents in customer support, operations, productivity, and multimodal assistant scenarios.
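As a rough illustration of how a tool might be registered, the sketch below builds a session.update event that declares one function. It assumes CometAPI passes through the OpenAI Realtime beta event shapes (a `tools` array of `type: "function"` entries plus `tool_choice`); the `lookup_order` tool and its parameters are hypothetical names invented for this example.

```javascript
// Hypothetical tool definition: "lookup_order" is an illustrative name,
// not something provided by CometAPI or OpenAI.
const lookupOrderTool = {
  type: "function",
  name: "lookup_order",
  description: "Look up the status of an order by its ID.",
  parameters: {
    type: "object",
    properties: {
      order_id: { type: "string", description: "The order identifier." }
    },
    required: ["order_id"]
  }
};

// Build a session.update event that registers the given tools.
// Assumption: the session accepts the OpenAI Realtime beta fields
// `tools` and `tool_choice`.
function buildToolSessionUpdate(tools) {
  return {
    type: "session.update",
    session: { tools, tool_choice: "auto" }
  };
}

const toolEvent = buildToolSessionUpdate([lookupOrderTool]);
console.log(JSON.stringify(toolEvent, null, 2));
```

The resulting object would be sent with `ws.send(JSON.stringify(toolEvent))` on an open connection; your application remains responsible for actually executing any function the model asks for.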

Main features of gpt-4o-mini-realtime-preview

  • Real-time multimodal interaction: Accepts and responds across speech, text, and images for fluid live experiences.
  • Bidirectional audio: Supports conversational voice interfaces where audio can be streamed in and responses can be streamed back out.
  • Streaming responses: Delivers partial outputs incrementally, reducing perceived latency and improving responsiveness.
  • Vision understanding: Interprets visual inputs such as camera frames, screenshots, or other images during a live session.
  • Function and tool calling: Produces structured calls that let your application connect the model to business logic, databases, or external tools.
  • Interactive agent behavior: Works well for assistants that must maintain turn-by-turn context during active sessions.
  • Live call handling: Useful for phone or web-call scenarios involving fast speech understanding and immediate replies.
  • Real-time captioning and transcription workflows: Can support experiences that convert ongoing speech into usable text in near real time.
  • Structured outputs for actions: Helps applications turn conversational intent into reliable machine-readable instructions.
  • Low-latency user experiences: Optimized for scenarios where responsiveness matters, such as support, coaching, monitoring, and guided workflows.
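For the bidirectional audio feature listed above, a minimal sketch of the two encoding steps might look like this. It assumes the OpenAI Realtime beta conventions, where audio travels as base64 strings inside `input_audio_buffer.append` (client to server) and `response.audio.delta` (server to client) events; the helper names are made up for illustration.

```javascript
// Wrap a chunk of raw audio bytes in an input_audio_buffer.append event.
// Assumption: the server expects base64-encoded audio in the `audio` field.
function buildAudioAppendEvent(audioChunk) {
  return {
    type: "input_audio_buffer.append",
    audio: Buffer.from(audioChunk).toString("base64")
  };
}

// Decode the base64 payload of a response.audio.delta event back to bytes.
function decodeAudioDelta(event) {
  return Buffer.from(event.delta, "base64");
}

// Four zero bytes stand in for real microphone data in this example.
const silence = Buffer.alloc(4);
const appendEvent = buildAudioAppendEvent(silence);
console.log(appendEvent.type, appendEvent.audio);
```

In a real client, each microphone chunk would be passed through `buildAudioAppendEvent` and sent over the socket, while decoded deltas would be queued for playback.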

How to access and integrate gpt-4o-mini-realtime-preview

Step 1: Sign Up for an API Key

First, create an account on CometAPI and generate your API key from the dashboard. This key is required to authenticate every request. Store it securely and avoid exposing it in client-side code or public repositories.
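One way to keep the key out of source code is to read it from an environment variable on the server and fail early when it is missing. The sketch below assumes the variable is named COMETAPI_API_KEY; `loadApiKey` is a hypothetical helper, not part of any CometAPI SDK.

```javascript
// Read the API key from an environment-like object and fail fast if absent.
// `loadApiKey` is an illustrative helper name.
function loadApiKey(env = process.env) {
  const key = env.COMETAPI_API_KEY;
  if (!key || key.trim() === "") {
    throw new Error("COMETAPI_API_KEY is not set");
  }
  return key.trim();
}

// Example with an explicit object so the snippet runs without a real key.
const demoKey = loadApiKey({ COMETAPI_API_KEY: "sk-example" });
console.log("API key loaded:", demoKey.length > 0);
```

Calling `loadApiKey()` with no argument reads from `process.env`, which is the form a server process would normally use.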

Step 2: Connect to the gpt-4o-mini-realtime-preview API

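The Realtime API uses WebSocket connections. Connect to CometAPI's WebSocket endpoint from a Node.js server process:

// Requires the "ws" package (npm install ws). The browser WebSocket
// constructor does not accept custom headers, so run this server-side.
const WebSocket = require("ws");

const ws = new WebSocket(
  "wss://api.cometapi.com/v1/realtime?model=gpt-4o-mini-realtime-preview",
  {
    headers: {
      // Read the key from the environment; never hard-code it.
      "Authorization": "Bearer " + process.env.COMETAPI_API_KEY,
      "OpenAI-Beta": "realtime=v1"
    }
  }
);

// Once connected, configure the session.
ws.on("open", () => {
  ws.send(JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      instructions: "You are a helpful assistant."
    }
  }));
});

// Each message is one JSON-encoded server event.
ws.on("message", (data) => {
  console.log(JSON.parse(data.toString()));
});

// Surface connection failures, such as an invalid API key.
ws.on("error", (err) => {
  console.error("WebSocket error:", err.message);
});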

Step 3: Retrieve and Verify Results

The Realtime API streams responses over the WebSocket connection as JSON-encoded server events. Listen for response.audio.delta events for audio output and response.text.delta events for text. To verify the setup, confirm that a session.created event arrives after connecting and that delta events follow once a response is in progress.
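The event handling described here can be sketched as a small collector that stitches text deltas into a full reply. This is a minimal sketch assuming the OpenAI Realtime beta event names (`response.text.delta`, `response.done`, `error`); `createTranscriptCollector` is a hypothetical helper, and the events fed to it below are hand-written samples rather than real server output.

```javascript
// Accumulate streamed text deltas and track when the response finishes.
function createTranscriptCollector() {
  let text = "";
  let done = false;
  return {
    handle(event) {
      switch (event.type) {
        case "response.text.delta":
          text += event.delta; // append the next text fragment
          break;
        case "response.done":
          done = true; // the model has finished this response
          break;
        case "error":
          throw new Error(event.error?.message ?? "Realtime API error");
        default:
          break; // ignore events this sketch does not handle
      }
    },
    get text() { return text; },
    get done() { return done; }
  };
}

// Hand-written sample events standing in for parsed WebSocket messages.
const collector = createTranscriptCollector();
[
  { type: "session.created" },
  { type: "response.text.delta", delta: "Hel" },
  { type: "response.text.delta", delta: "lo" },
  { type: "response.done" }
].forEach((event) => collector.handle(event));

console.log(collector.text, collector.done); // prints "Hello true"
```

In a live client, `collector.handle(JSON.parse(data.toString()))` would be called from the WebSocket message handler.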

Features of GPT-4o mini Realtime Preview

Explore the key features of GPT-4o mini Realtime Preview, designed to improve performance and usability. See how these capabilities can benefit your projects and improve the user experience.

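Pricing for GPT-4o mini Realtime Preview

Explore competitive pricing for GPT-4o mini Realtime Preview, designed to fit a range of budgets and usage needs. Flexible plans mean you pay only for what you use, so it is easy to scale as your requirements grow. See how GPT-4o mini Realtime Preview can strengthen your projects while keeping costs under control.

| | Comet price (USD / M tokens) | Official price (USD / M tokens) | Discount |
|---|---|---|---|
| Input | $60 | $75 | -20% |
| Output | $240 | $300 | -20% |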
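To make the per-million-token prices concrete, the following sketch estimates the cost of a session and checks the listed discount. The rates are the ones quoted on this page; the token counts (10,000 input, 2,000 output) are arbitrary example values, and the helper names are illustrative.

```javascript
// Prices in USD per million tokens, as listed on this page.
const COMET_PRICE = { input: 60, output: 240 };
const OFFICIAL_PRICE = { input: 75, output: 300 };

// Estimate the cost in USD for a given number of input and output tokens.
function estimateCostUsd(price, inputTokens, outputTokens) {
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Discount of the Comet rate relative to the official rate, in whole percent.
function discountPercent(cometRate, officialRate) {
  return Math.round((1 - cometRate / officialRate) * 100);
}

// Example usage with arbitrary token counts.
const cometCost = estimateCostUsd(COMET_PRICE, 10_000, 2_000);
const officialCost = estimateCostUsd(OFFICIAL_PRICE, 10_000, 2_000);
console.log(cometCost, officialCost); // prints "1.08 1.35"
console.log(discountPercent(COMET_PRICE.input, OFFICIAL_PRICE.input)); // prints "20"
```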

Sample Code and API for GPT-4o mini Realtime Preview

Access sample code and API resources for GPT-4o mini Realtime Preview to streamline integration. The detailed documentation provides step-by-step guidance to help you get the most out of GPT-4o mini Realtime Preview in your projects.

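Versions of GPT-4o mini Realtime Preview

GPT-4o mini Realtime Preview may have multiple snapshots for several reasons: outputs can change after an update, so older snapshots are kept for consistency; developers need a transition period to adapt and migrate; and different snapshots may correspond to global or regional endpoints to optimize the user experience. For detailed differences between versions, refer to the official documentation.

Version
gpt-4o-mini-realtime-preview
gpt-4o-mini-realtime-preview-2024-12-17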
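If consistent behavior matters more than picking up updates, one option is to request the dated snapshot explicitly. The sketch below builds the connection URL for either version, assuming the same `model` query parameter shown in Step 2; `realtimeUrl` is a hypothetical helper.

```javascript
// Base endpoint taken from the connection example in Step 2.
const REALTIME_BASE = "wss://api.cometapi.com/v1/realtime";

// Build the WebSocket URL for a given model ID or snapshot.
function realtimeUrl(model) {
  const url = new URL(REALTIME_BASE);
  url.searchParams.set("model", model);
  return url.toString();
}

// The alias versus the dated snapshot listed above.
const aliasUrl = realtimeUrl("gpt-4o-mini-realtime-preview");
const pinnedUrl = realtimeUrl("gpt-4o-mini-realtime-preview-2024-12-17");
console.log(aliasUrl);
console.log(pinnedUrl);
```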
