
text-embedding-3-large

Input: $0.104/M
Output: $0.104/M
A large text embedding model for a wide range of natural language processing tasks.
New
Commercial use

Technical Specifications of text-embedding-3-large

Specification | Details
Model ID | text-embedding-3-large
Model Type | Text embedding model
Primary Function | Converts text into dense numerical vectors for semantic search, clustering, classification, retrieval, recommendation, and similarity analysis
Embedding Size | Up to 3072 dimensions by default, with support for shortening via the dimensions parameter
Input Format | String or array of strings/token arrays for batch embedding requests
Maximum Input Length | Up to 8192 tokens per input; total tokens across all inputs in one request can be up to 300,000
Output Format | Embedding vectors returned as float by default, with base64 also supported via encoding_format
API Endpoint Compatibility | Embeddings API-compatible workflows
Common Use Cases | Semantic search, retrieval-augmented generation, deduplication, recommendation systems, document ranking, topic grouping, and text similarity

What is text-embedding-3-large?

text-embedding-3-large is a large text embedding model designed to transform natural language into high-dimensional vector representations that preserve semantic meaning. It is well suited for applications where measuring similarity between pieces of text is important, such as search, recommendation, clustering, classification, and retrieval pipelines. Its larger embedding size makes it useful for teams that need stronger semantic representation quality across a wide range of natural language processing tasks.

Unlike generative models that produce text, text-embedding-3-large specializes in encoding text into vectors that downstream systems can compare mathematically. These embeddings can then be stored in vector databases, used in ranking systems, or supplied to analytics and machine learning workflows for more accurate text understanding.
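"Compare mathematically" in practice usually means cosine similarity: the closer two embedding vectors point in the same direction, the more semantically similar their source texts. A minimal sketch in plain Python (the 4-dimensional vectors are toy stand-ins for real 3072-dimensional embeddings, not actual model output):

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (|a| * |b|); ranges from -1 to 1,
    # where values near 1 mean the underlying texts are semantically close.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings.
doc = [0.1, 0.3, 0.5, 0.1]
query_close = [0.1, 0.28, 0.52, 0.09]
query_far = [0.9, -0.2, 0.0, 0.4]

# The "closer" query scores higher against the document vector.
print(cosine_similarity(doc, query_close) > cosine_similarity(doc, query_far))
```

A vector database typically performs this same comparison at scale, returning the stored embeddings with the highest similarity to a query embedding.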

Main features of text-embedding-3-large

  • High-dimensional semantic embeddings: Produces rich vector representations, with 3072 dimensions by default, for nuanced understanding of meaning and similarity across texts.
  • Flexible dimensionality control: Supports the dimensions parameter, allowing developers to reduce vector size when optimizing for storage, latency, or infrastructure cost.
  • Batch input support: Accepts single strings or arrays of inputs, making it practical for indexing documents, knowledge bases, and large-scale corpora efficiently.
  • Multiple encoding formats: Returns embeddings in float format by default and can also provide base64, depending on integration needs.
  • Wide NLP applicability: Can be used for semantic search, clustering, ranking, recommendation, duplicate detection, and retrieval-augmented systems built on vector similarity.
  • Longer input handling: Supports inputs up to 8192 tokens per item, which is useful for embedding larger passages and structured text segments.
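The dimensions parameter asks the API for shorter vectors directly; an equivalent client-side approach is to truncate full-size vectors and re-normalize them to unit length before similarity comparisons, since truncation alone changes vector magnitudes. A sketch of that client-side variant (the sample vector is made up; 3072 is the default size from the table above):

```python
import math

def shorten_embedding(vec, dims):
    # Keep the first `dims` components, then re-normalize to unit length
    # so cosine and dot-product comparisons remain meaningful.
    cut = vec[:dims]
    norm = math.sqrt(sum(x * x for x in cut))
    return [x / norm for x in cut]

full = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0]  # stand-in for a 3072-dim embedding
short = shorten_embedding(full, 4)

print(len(short))                                   # 4 dimensions kept
print(abs(sum(x * x for x in short) - 1.0) < 1e-9)  # unit length again
```

Shorter vectors cut storage and lookup cost in a vector database at some cost in representation quality, which is the trade-off the dimensions parameter exists to let you tune.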

How to access and integrate text-embedding-3-large

Step 1: Sign Up for API Key

To get started, first register on CometAPI and generate your API key from the dashboard. This key is required to authenticate all requests and connect your application to the text-embedding-3-large model.

Step 2: Send Requests to text-embedding-3-large API

Once you have your API key, send a request to the embeddings-compatible API endpoint using text-embedding-3-large as the model name. Include the text you want to convert into embeddings in the request body.

curl https://api.cometapi.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $COMETAPI_API_KEY" \
  -d '{
    "model": "text-embedding-3-large",
    "input": "The quick brown fox jumped over the lazy dog"
  }'

Step 3: Retrieve and Verify Results

After the request is processed, the API returns a structured response containing the embedding vector data, model identifier, and token usage. Verify that the model field is text-embedding-3-large, confirm the embedding payload is present, and then store or forward the vector for use in search, ranking, clustering, or retrieval workflows.
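A small helper can perform those checks before vectors are stored. The sketch below assumes the response has already been decoded from JSON into a dict; the sample payload is fabricated to mirror the documented structure, with tiny 2-dimensional vectors in place of real embeddings:

```python
def verify_and_extract(response):
    # Confirm the model field, then pull out the embedding vectors in the
    # order given by each item's index field.
    assert response["model"].startswith("text-embedding-3-large"), "unexpected model"
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

# Fabricated example payload mirroring the response structure.
sample = {
    "object": "list",
    "model": "text-embedding-3-large",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.2, 0.1]},
        {"object": "embedding", "index": 0, "embedding": [0.4, 0.3]},
    ],
    "usage": {"prompt_tokens": 12, "total_tokens": 12},
}

vectors = verify_and_extract(sample)
print(vectors[0])  # the item with index 0 comes first
```

Sorting by index matters for batch requests: it keeps each returned vector aligned with the input string it was computed from.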

More models