
Whisper-1

Input: $24/M
Output: $24/M
Speech to text, creating translations
Commercial use

Technical Specifications of whisper-1

Model ID: whisper-1
Model type: Speech-to-text and speech translation
Primary use cases: Audio transcription, multilingual speech recognition, speech translation into English
Input modality: Audio
Output modality: Text
Supported endpoints: /v1/audio/transcriptions, /v1/audio/translations
Streaming support: Not supported for whisper-1
Prompting support: Yes, with limited prompt control for formatting, punctuation, and style
Language capability: Multilingual speech recognition and language identification
Typical integration format: File upload via multipart form data
Common audio formats: m4a, mp3, mp4, mpeg, mpga, wav, webm
Best fit for: Converting spoken content into readable text or English translations
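Because the endpoints only accept the audio formats listed above, it can be worth validating files before upload. A minimal sketch (the format set simply mirrors the specification table; the helper name is illustrative, not part of any SDK):

```python
from pathlib import Path

# Formats listed in the whisper-1 specification above.
SUPPORTED_FORMATS = {"m4a", "mp3", "mp4", "mpeg", "mpga", "wav", "webm"}

def is_uploadable(path: str) -> bool:
    """Return True if the file extension matches a supported audio format."""
    return Path(path).suffix.lstrip(".").lower() in SUPPORTED_FORMATS

print(is_uploadable("meeting.mp3"))  # True
print(is_uploadable("notes.flac"))   # False: flac is not in the list
```

Running a check like this before issuing the API call avoids wasted uploads for formats the endpoint would reject.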

What is whisper-1?

whisper-1 is a speech recognition model available through CometAPI for turning audio into text and creating translations from spoken audio into English. It is designed for developers who need reliable transcription for recorded speech, interviews, meetings, voice notes, subtitles, and multilingual audio workflows.

The model is well suited for applications that need automatic speech recognition across multiple languages. It can transcribe audio in the original language or translate spoken content into English, making it useful for global products, media processing pipelines, support tools, and accessibility solutions.

Because whisper-1 works on uploaded audio files and returns text output, it fits naturally into backend automation, content indexing, caption generation, search enrichment, and analytics pipelines.

Main features of whisper-1

  • Speech-to-text transcription: Converts spoken audio into written text for documents, captions, archives, and application workflows.
  • Speech translation: Creates English text translations from non-English spoken audio, simplifying multilingual content processing.
  • Multilingual recognition: Supports recognition across many languages, making it practical for international and cross-region deployments.
  • Prompt-assisted formatting: Accepts prompts that can help guide punctuation, capitalization, terminology, and transcript style.
  • File-based API workflow: Works well with uploaded audio files, making it easy to integrate into batch jobs, media systems, and backend services.
  • Language identification support: Can be used in workflows where detecting or handling multiple spoken languages is important.
  • Strong fit for content operations: Useful for subtitle generation, searchable transcript creation, customer call logging, interview processing, and voice-note conversion.
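For the subtitle-generation use case above, a segment-level transcription response can be converted into SRT captions. A sketch assuming a verbose, segment-level response shape where each segment carries start and end times in seconds plus its text (that shape is an assumption here, not guaranteed by this page):

```python
def to_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments: list[dict]) -> str:
    """Convert segment dicts ({'start', 'end', 'text'}) into SRT caption blocks."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{to_timestamp(seg['start'])} --> {to_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

sample = [{"start": 0.0, "end": 2.5, "text": " Hello and welcome."}]
print(segments_to_srt(sample))
```

The output can be written to a `.srt` file and loaded by most video players and caption pipelines.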

How to access and integrate whisper-1

Step 1: Sign Up for API Key

To start using whisper-1, first create an account on CometAPI and generate your API key from the dashboard. After logging in, go to the API management section, create a new key, and store it securely. This key will be required to authenticate every request you send to the whisper-1 API.

Step 2: Send Requests to whisper-1 API

Once you have your API key, you can send requests to the CometAPI endpoint using the whisper-1 model ID. Include your API key in the Authorization header and specify whisper-1 as the target model. For speech workflows, send an audio file to the appropriate transcription or translation endpoint.

curl --request POST \
  --url https://api.cometapi.com/v1/audio/transcriptions \
  --header "Authorization: Bearer YOUR_COMETAPI_KEY" \
  --header "Content-Type: multipart/form-data" \
  --form "model=whisper-1" \
  --form "file=@/path/to/audio.mp3"

For translation workflows, use the translation endpoint with the same model ID:

curl --request POST \
  --url https://api.cometapi.com/v1/audio/translations \
  --header "Authorization: Bearer YOUR_COMETAPI_KEY" \
  --header "Content-Type: multipart/form-data" \
  --form "model=whisper-1" \
  --form "file=@/path/to/audio.mp3"
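Both curl commands above rely on `--form` to assemble a multipart/form-data body. For integrations that build the request themselves, the same body can be constructed with the Python standard library; a minimal sketch (the helper name and boundary scheme are illustrative):

```python
import uuid

def build_multipart(model: str, filename: str, audio_bytes: bytes) -> tuple[bytes, str]:
    """Assemble a multipart/form-data body with a model field and a file part."""
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n{model}\r\n'
    )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    )
    body = (
        model_part.encode()
        + file_header.encode()
        + audio_bytes
        + f"\r\n--{boundary}--\r\n".encode()
    )
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type

body, ctype = build_multipart("whisper-1", "audio.mp3", b"\x00\x01")
# body and ctype can then be sent (e.g. with urllib.request) to
# https://api.cometapi.com/v1/audio/transcriptions together with the
# Authorization: Bearer header shown above.
```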

Step 3: Retrieve and Verify Results

After the request is processed, CometAPI will return the generated text result for your whisper-1 job. Review the response to confirm transcript quality, language handling, punctuation, and completeness. If needed, refine your audio preprocessing or prompting approach and resend the request to improve output consistency for your production use case.
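As part of that verification step, a small completeness check on the returned JSON can catch silent failures early. A sketch assuming the default JSON response shape with a top-level `text` field (an assumption based on the default transcription format, not stated on this page):

```python
import json

def extract_transcript(raw_response: str) -> str:
    """Parse the API response and return a non-empty, whitespace-trimmed transcript."""
    payload = json.loads(raw_response)
    text = payload.get("text", "").strip()
    if not text:
        raise ValueError("empty transcript: check audio quality or file format")
    return text

print(extract_transcript('{"text": " Hello, world. "}'))  # Hello, world.
```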

Features of Whisper-1

Learn about Whisper-1's core capabilities to improve performance, usability, and the overall experience.

Pricing of Whisper-1

See Whisper-1's competitive pricing for different budgets and usage needs, with flexible plans that scale as your requirements grow.

CometAPI price (USD / M tokens): Input $24/M, Output $24/M
Official price (USD / M tokens): Input $30/M, Output $30/M
Discount: -20%

Sample Code and API for Whisper-1

Get complete sample code and API resources to streamline Whisper-1 integration. We provide step-by-step guidance to help you unlock the model's full potential.

More models

  • gpt-realtime-1.5 (Input: $3.2/M, Output: $12.8/M): The best speech model for audio input and audio output.
  • gpt-audio-1.5 (Input: $2/M, Output: $8/M): The best speech model for audio input and audio output in Chat Completions.
  • TTS (Input: $12/M, Output: $12/M): OpenAI Text-to-Speech.
  • Kling TTS (Per request: $0.006608): [Speech Synthesis] Newly launched: text-to-broadcast audio online, with preview function. Can simultaneously generate an audio_id, usable with any Keling API.
  • Kling video-to-audio (Per request: $0.03304)
  • Kling text-to-audio (Per request: $0.03304)