Using Embeddings
Embeddings API endpoints, semantic search, and practical usage patterns.
This page covers the runtime API for working with embeddings — the generated endpoints, the search service, and practical usage patterns.
Embeddings API
When any feature has embeddings configured, the compiler generates these endpoints:
Generate Embedding
`POST /api/v1/embeddings`

Enqueue an embedding job for arbitrary content:

```bash
curl -X POST https://api.example.com/api/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Cookie: better-auth.session_token=..." \
  -d '{
    "content": "Text to generate an embedding for",
    "model": "@cf/baai/bge-base-en-v1.5",
    "table": "claims",
    "id": "clm_123"
  }'
```

Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| `content` | string | Yes | Text to embed |
| `model` | string | No | Embedding model (defaults to table config or `@cf/baai/bge-base-en-v1.5`) |
| `table` | string | No | Table to store the embedding in |
| `id` | string | Conditional | Record ID (required when `table` is provided) |
Response:
```json
{
  "queued": true,
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "table": "claims",
  "id": "clm_123",
  "model": "@cf/baai/bge-base-en-v1.5"
}
```

This endpoint is useful for:
- Re-embedding existing records
- Embedding content that doesn't go through CRUD routes
- Batch embedding via scripts
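Because this is a plain HTTP endpoint, batch re-embedding can be a short script. A minimal sketch in TypeScript, assuming you supply the base URL, a valid session cookie, and the list of records to re-embed:

```ts
// Minimal batch re-embedding sketch. The base URL, session cookie, and the
// `records` array are placeholders; the body fields match the table above.
const BASE_URL = "https://api.example.com";
const SESSION_COOKIE = "better-auth.session_token=...";

const records = [
  { table: "claims", id: "clm_123", content: "Police arrested three suspects" },
  { table: "claims", id: "clm_124", content: "Flood warnings issued for the coast" },
];

for (const record of records) {
  const res = await fetch(`${BASE_URL}/api/v1/embeddings`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Cookie: SESSION_COOKIE,
    },
    body: JSON.stringify(record),
  });

  const job = await res.json();
  console.log(`queued ${record.id}: ${job.jobId}`);
}
```

Each call returns a `jobId` you can log for troubleshooting; the embedding work itself happens asynchronously in the queue consumer.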
List Embedding Tables
`GET /api/v1/embeddings/tables`

List tables that have embeddings configured:

```json
{
  "tables": [
    {
      "name": "claims",
      "embeddingColumn": "embedding",
      "model": "@cf/baai/bge-base-en-v1.5"
    }
  ]
}
```
Search Service

For typed similarity search with classification, use the `createEmbeddings()` service generated from `defineEmbedding()` configurations.
Basic Search
```ts
const embeddings = createEmbeddings(ctx.env);

const results = await embeddings.claimSimilarity.search(
  "Police arrested three suspects",
  {
    storyId: "story_123",
    limit: 5,
    threshold: 0.70,
  }
);
// Returns: [{ id, score, classification, metadata }]

for (const match of results) {
  console.log(`${match.classification}: ${match.id} (score: ${match.score})`);
}
```

Classification
Results are automatically classified based on similarity score:
| Classification | Default Threshold | Meaning |
|---|---|---|
| `DUPLICATE` | >= 0.90 | Near-identical content |
| `CONFIRMS` | >= 0.85 | Strongly supports same claim |
| `RELATED` | >= 0.75 | Topically related |
| `NEW` | < 0.75 | No significant match |
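Because the classification is returned on every match, callers can branch on it directly. A minimal sketch; the handling in each branch is illustrative, not framework behavior:

```ts
const embeddings = createEmbeddings(ctx.env);

const matches = await embeddings.claimSimilarity.search(
  "Police arrested three suspects",
  { storyId: "story_123", limit: 5 }
);

for (const match of matches) {
  switch (match.classification) {
    case "DUPLICATE":
      // Near-identical content: skip it.
      console.log(`skipping duplicate of ${match.id}`);
      break;
    case "CONFIRMS":
    case "RELATED":
      // Link the new claim to the existing record.
      console.log(`linking to ${match.id} (${match.classification})`);
      break;
    case "NEW":
      // No significant match: treat as a new claim.
      console.log("no match above threshold");
      break;
  }
}
```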
Gray Zone Detection
For cases where automatic classification isn't sufficient, use gray zone detection:
```ts
const results = await embeddings.claimSimilarity.findWithGrayZone(
  "Some claim text",
  { min: 0.60, max: 0.85 }
);

// results.high_confidence — Score >= 0.85 (auto-classified)
// results.gray_zone — 0.60 <= score < 0.85 (needs review)
```
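A sketch of consuming the two buckets, reusing the `embeddings` service from the earlier examples and assuming each match carries the same `id` and `score` fields as `search()` results; the review step is a placeholder:

```ts
const { high_confidence, gray_zone } = await embeddings.claimSimilarity.findWithGrayZone(
  "Some claim text",
  { min: 0.60, max: 0.85 }
);

for (const match of high_confidence) {
  console.log(`auto-classified: ${match.id} (score: ${match.score})`);
}

for (const match of gray_zone) {
  // Needs review: flag the record or hand it to a human pass.
  console.log(`needs review: ${match.id} (score: ${match.score})`);
}
```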
Raw Embedding

Generate an embedding vector without searching:
```ts
const vector = await embeddings.claimSimilarity.embed("Text to embed");
// Returns: number[] (768 dimensions for bge-base)
```
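Raw vectors are plain number arrays, so standard vector math applies. As an illustration, a small cosine-similarity helper (ordinary math, not part of the generated service), using the `embed()` call from above:

```ts
// Cosine similarity between two embedding vectors. Returns a value in [-1, 1];
// closer to 1 means more similar.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const v1 = await embeddings.claimSimilarity.embed("Police arrested three suspects");
const v2 = await embeddings.claimSimilarity.embed("Three suspects were taken into custody");
console.log(cosineSimilarity(v1, v2));
```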
Vectorize Queries

If your project uses a Vectorize index, you can query it directly for custom search logic:
```ts
// Generate embedding for query text
const queryVector = await embeddings.claimSimilarity.embed(searchText);

// Query Vectorize with metadata filters
const results = await ctx.env.VECTORIZE.query(queryVector, {
  topK: 10,
  filter: {
    storyId: "story_123",
    organizationId: ctx.activeOrgId,
  },
});
```

The metadata fields in your `defineTable` embeddings config are automatically included in the Vectorize index, enabling filtered searches.
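Vectorize returns match IDs and scores rather than full rows, so a common follow-up is to hydrate the matches from your database. A minimal sketch, assuming a D1 binding named `DB` on `ctx.env` and a `claims` table keyed by `id` (both are assumptions for illustration):

```ts
// Hydrate Vectorize matches from the database and pair each row with its score.
const ids = results.matches.map((m) => m.id);

if (ids.length > 0) {
  const placeholders = ids.map(() => "?").join(", ");
  const rows = await ctx.env.DB
    .prepare(`SELECT * FROM claims WHERE id IN (${placeholders})`)
    .bind(...ids)
    .all();

  const scoreById = new Map(results.matches.map((m) => [m.id, m.score] as const));
  for (const row of rows.results) {
    console.log(row.id, scoreById.get(row.id as string));
  }
}
```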
Auto-Embed vs Manual
| Trigger | How | Use Case |
|---|---|---|
| Auto (INSERT) | Compiler enqueues after successful create | Default behavior |
| Auto (UPDATE) | Compiler enqueues when watched fields change | Keeps embeddings fresh |
| Manual (API) | `POST /api/v1/embeddings` | Re-embedding, batch jobs |
| Manual (Queue) | `env.EMBEDDINGS_QUEUE.send(...)` | Custom pipelines |
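For the queue path, a producer call might look like the following. The payload shape here mirrors the HTTP endpoint's request body and is an assumption; check your project's queue consumer for the job format it expects:

```ts
// Enqueue an embedding job directly from a custom pipeline.
// The field names are assumed to match the HTTP endpoint's request body.
await ctx.env.EMBEDDINGS_QUEUE.send({
  content: "Text to generate an embedding for",
  model: "@cf/baai/bge-base-en-v1.5",
  table: "claims",
  id: "clm_123",
});
```

Keep in mind that queue messages bypass the HTTP-level checks listed below, so only enqueue jobs that have already passed your own validation.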
Security
- The embeddings API requires authentication
- Jobs are scoped to the user's `activeOrgId`
- Embedding jobs are only enqueued after all security checks pass (auth, firewall, guards)
- The queue consumer is an internal process — it trusts pre-validated jobs
Supported Models
| Model | Dimensions | Notes |
|---|---|---|
| `@cf/baai/bge-base-en-v1.5` | 768 | Default, good general-purpose |
| `@cf/baai/bge-small-en-v1.5` | 384 | Faster, smaller |
| `@cf/baai/bge-large-en-v1.5` | 1024 | Higher quality |
See Workers AI Models for the full list.
Cloudflare Only
Embeddings require Cloudflare Workers AI and Queues. They are not available with the Bun or Node runtimes.
See Also
- Automatic Embeddings — `defineTable()` config, `defineEmbedding()` search service, and Vectorize integration
- Queues — How embedding jobs are processed