Quickback Docs

Automatic Embeddings

Quickback can automatically generate embeddings for your data using Cloudflare Queues and Workers AI. When configured, INSERT and UPDATE operations automatically enqueue embedding jobs that are processed asynchronously.

Enabling Embeddings

Add an embeddings configuration to your resource definition:

// claims/resource.ts
import { defineResource } from "@quickback/core";
import { claims } from "./schema";

export default defineResource(claims, {
  firewall: { organization: {} },

  embeddings: {
    fields: ['content'],                // Fields to concatenate and embed
    model: '@cf/baai/bge-base-en-v1.5', // Embedding model (optional)
    onInsert: true,                     // Auto-embed on create (default: true)
    onUpdate: ['content'],              // Re-embed when these fields change
    embeddingColumn: 'embedding',       // Column to store embedding
    metadata: ['storyId'],              // Metadata for Vectorize index
  },

  crud: {
    create: { access: { roles: ['member'] } },
    update: { access: { roles: ['member'] } },
  },
});

Configuration Options

Option           Type                Default                      Description
fields           string[]            Required                     Fields to concatenate and embed
model            string              '@cf/baai/bge-base-en-v1.5'  Workers AI embedding model
onInsert         boolean             true                         Embed on INSERT operations
onUpdate         boolean | string[]  true                         Embed on UPDATE; array limits to specific fields
embeddingColumn  string              'embedding'                  Column to store the embedding vector
separator        string              ' '                          Separator for joining multiple fields
metadata         string[]            []                           Fields to include in Vectorize metadata

onUpdate Options

// Always re-embed on any update
onUpdate: true

// Never re-embed on update
onUpdate: false

// Only re-embed when specific fields change
onUpdate: ['content', 'title']
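The three onUpdate forms above amount to a small decision rule. A minimal sketch (the helper name is illustrative, not part of the generated code):

```typescript
// Hypothetical sketch of the onUpdate decision: re-embed always when
// `true`, never when `false`, and only when a watched field changed
// when an array of field names is given.
type OnUpdate = boolean | string[];

function shouldReembed(onUpdate: OnUpdate, changedFields: string[]): boolean {
  if (onUpdate === true) return true;   // any update triggers re-embedding
  if (onUpdate === false) return false; // updates never trigger re-embedding
  // Array form: only re-embed if one of the listed fields changed
  return onUpdate.some((field) => changedFields.includes(field));
}

console.log(shouldReembed(['content', 'title'], ['title'])); // true
console.log(shouldReembed(['content'], ['status']));         // false
```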

How It Works

┌─────────────────────────────────────────────────────────────────────┐
│                     Main API Worker                                 │
│                                                                     │
│  ┌─────────────────────┐    ┌────────────────────┐                 │
│  │ POST /claims        │    │ Queue Consumer     │                 │
│  │                     │    │                    │                 │
│  │ 1. Auth middleware  │    │ 1. Workers AI      │                 │
│  │ 2. Firewall         │    │    embed()         │                 │
│  │ 3. Guards           │    │ 2. D1 update()     │                 │
│  │ 4. Insert to D1     │    │ 3. Vectorize       │                 │
│  │ 5. Enqueue job ─────┼───▶│    upsert()        │                 │
│  └─────────────────────┘    └────────────────────┘                 │
│                                      ▲                              │
│                    ┌─────────────────┴──────────────────┐          │
│                    │      EMBEDDINGS_QUEUE              │          │
│                    └────────────────────────────────────┘          │
└─────────────────────────────────────────────────────────────────────┘
  1. API Request - POST/PATCH arrives and passes through auth, firewall, guards
  2. Database Insert - Record is created/updated in D1
  3. Enqueue Job - Embedding job is sent to Cloudflare Queue
  4. Queue Consumer - Processes job asynchronously:
    • Calls Workers AI to generate embedding
    • Updates D1 with embedding vector
    • Optionally upserts to Vectorize index
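The job that travels through the queue in step 3 can be pictured roughly as follows. The field names here are illustrative; the actual message shape is internal to the generated worker:

```typescript
// Illustrative shape of an embedding job message.
interface EmbeddingJob {
  table: string;                     // source table, e.g. "claims"
  id: string;                        // record ID to update with the vector
  content: string;                   // concatenated field values to embed
  model: string;                     // Workers AI model to use
  metadata: Record<string, string>;  // fields mirrored into Vectorize metadata
}

const job: EmbeddingJob = {
  table: "claims",
  id: "clm_123",
  content: "Police arrested three suspects in downtown robbery",
  model: "@cf/baai/bge-base-en-v1.5",
  metadata: { storyId: "story_123" },
};
// In the generated worker, an object like this is sent to the queue
// (e.g. via the EMBEDDINGS_QUEUE binding) after the D1 write succeeds.
console.log(job.table);
```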

Security Model

Security is enforced at enqueue time, not consume time:

  • Jobs are only enqueued after passing all security checks (auth, firewall, guards)
  • The queue consumer is an internal process that executes pre-validated jobs
  • If a user can't create a claim, they can't trigger an embedding job

Generic Embeddings API

In addition to automatic embeddings on CRUD operations, Quickback generates a generic embeddings API endpoint that lets you generate embeddings for arbitrary content.

POST /api/v1/embeddings

Generate an embedding for any text content:

curl -X POST https://your-api.workers.dev/api/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Cookie: better-auth.session_token=..." \
  -d '{
    "content": "Text to embed",
    "model": "@cf/baai/bge-base-en-v1.5",
    "table": "claims",
    "id": "clm_123"
  }'

Request Body

Field    Type    Required     Description
content  string  Yes          Text to generate the embedding for
model    string  No           Embedding model (defaults to table config or @cf/baai/bge-base-en-v1.5)
table    string  No           Table to store the embedding back to
id       string  Conditional  Record ID to update (required if table is specified)
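From application code, the same call can be assembled in TypeScript. The helper below only builds the request per the fields documented above; the base URL is a placeholder for your deployed worker:

```typescript
// Assemble a request to the generic embeddings endpoint.
// Enforces the documented rule: `id` is required when `table` is given.
function buildEmbeddingRequest(
  baseUrl: string,
  body: { content: string; model?: string; table?: string; id?: string }
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  if (body.table && !body.id) {
    throw new Error("id is required when table is specified");
  }
  return {
    url: `${baseUrl}/api/v1/embeddings`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const req = buildEmbeddingRequest("https://your-api.workers.dev", {
  content: "Text to embed",
  table: "claims",
  id: "clm_123",
});
// fetch(req.url, req.init) (with your session cookie attached)
// would then return the queued-job response shown below.
```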

Response

{
  "queued": true,
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "table": "claims",
  "id": "clm_123",
  "model": "@cf/baai/bge-base-en-v1.5"
}

GET /api/v1/embeddings/tables

List tables that have embeddings configured:

curl https://your-api.workers.dev/api/v1/embeddings/tables \
  -H "Cookie: better-auth.session_token=..."

Response:

{
  "tables": [
    {
      "name": "claims",
      "embeddingColumn": "embedding",
      "model": "@cf/baai/bge-base-en-v1.5"
    }
  ]
}

Authentication & Authorization

The generic embeddings API requires authentication and uses the activeOrgId from the user's context to enforce organization-level isolation. Embedding jobs are scoped to the user's current organization.

Use Cases

The generic embeddings API is useful for:

  • Batch embedding: Embed content without going through CRUD routes
  • Re-embedding: Force re-generation of embeddings for existing records
  • Preview embeddings: Test embedding generation before persisting
  • External content: Embed content that doesn't fit your defined schemas

Generated Files

When embeddings are configured, the compiler generates:

File                      Purpose
src/queue-consumer.ts     Queue consumer handler for processing embedding jobs
src/routes/embeddings.ts  Generic embeddings API routes
wrangler.toml             Queue producer/consumer bindings, AI binding
src/env.d.ts              EMBEDDINGS_QUEUE and AI binding types
src/index.ts              Exports queue handler, mounts /api/v1/embeddings

wrangler.toml additions

# Embeddings Queue
[[queues.producers]]
queue = "your-app-embeddings-queue"
binding = "EMBEDDINGS_QUEUE"

[[queues.consumers]]
queue = "your-app-embeddings-queue"
max_batch_size = 10
max_batch_timeout = 30
max_retries = 3

# Workers AI
[ai]
binding = "AI"

Multiple Fields

Embed multiple fields by concatenating them:

embeddings: {
  fields: ['title', 'content', 'summary'],  // Joined with spaces by default
  // ...
}

Generated embedding text: "${title} ${content} ${summary}"

Custom Separator

Use separator to control how fields are joined. This is useful when the embedding model benefits from clear sentence boundaries:

embeddings: {
  fields: ['title', 'summary'],
  separator: '. ',  // Join with period + space
  // ...
}

Generated code: [result[0].title, result[0].summary].filter(Boolean).join('. ')

The filter(Boolean) ensures null or empty fields are excluded cleanly — no trailing separator when a field is absent.
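A quick check of that join behavior, in plain TypeScript independent of the generated code:

```typescript
// filter(Boolean) drops null/undefined/empty-string fields before joining,
// so a missing summary produces no trailing ". " separator.
const row: { title: string; summary: string | null } = {
  title: "Quarterly results",
  summary: null,
};

const text = [row.title, row.summary].filter(Boolean).join(". ");
console.log(text); // "Quarterly results"
```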

Vectorize Integration

If you have a Vectorize index configured, embeddings are automatically upserted:

// quickback.config.ts
providers: {
  database: {
    config: {
      vectorizeIndexName: 'claims-embeddings',  // Your Vectorize index
      vectorizeBinding: 'VECTORIZE',
    },
  },
},

The queue consumer will:

  1. Generate the embedding via Workers AI
  2. Store the vector in D1 (JSON string)
  3. Upsert to Vectorize with metadata
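The record upserted in step 3 follows Vectorize's vector shape (id, values, optional metadata). A minimal sketch of how the consumer might assemble it (helper name is illustrative):

```typescript
// Vectorize vectors carry an id, the float values, and optional metadata.
interface VectorizeVector {
  id: string;
  values: number[];
  metadata?: Record<string, string>;
}

// Build the upsert payload from the record ID, the generated embedding,
// and the metadata fields selected in the embeddings config.
function toVectorizeVector(
  id: string,
  embedding: number[],
  metadata: Record<string, string>
): VectorizeVector {
  return { id, values: embedding, metadata };
}

const vec = toVectorizeVector("clm_123", [0.12, -0.04, 0.33], { storyId: "story_123" });
// env.VECTORIZE.upsert([vec]) would then store it for filtered queries.
console.log(vec.id);
```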

Vectorize Metadata

Include fields in Vectorize metadata for filtering:

embeddings: {
  fields: ['content'],
  metadata: ['storyId', 'organizationId', 'claimType'],
}

Enables queries like:

const results = await env.VECTORIZE.query(vector, {
  topK: 10,
  filter: { storyId: 'story_123' }
});

Schema Requirements

Your schema must include the embedding column:

// claims/schema.ts
import { sqliteTable, text } from "drizzle-orm/sqlite-core";

export const claims = sqliteTable("claims", {
  id: text("id").primaryKey(),
  content: text("content").notNull(),
  storyId: text("story_id"),
  organizationId: text("organization_id").notNull(),

  // Embedding column - stores JSON array of floats
  embedding: text("embedding"),
});
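Because D1 has no native vector type, the embedding is stored as a JSON string in that text column; reading it back is a JSON.parse away. A plain TypeScript illustration:

```typescript
// Serialize an embedding for the text column, then parse it on read.
const embedding: number[] = [0.12, -0.04, 0.33];

const stored: string = JSON.stringify(embedding); // what lands in the D1 column
const restored: number[] = JSON.parse(stored);    // what a SELECT gives back

console.log(restored.length); // 3
```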

Supported Models

Any Workers AI embedding model can be used:

Model                       Dimensions  Notes
@cf/baai/bge-base-en-v1.5   768         Default; good general-purpose choice
@cf/baai/bge-small-en-v1.5  384         Faster, smaller
@cf/baai/bge-large-en-v1.5  1024        Higher quality

See Workers AI Models for the full list.

Deployment

After compiling with embeddings:

  1. Create the queue:

     wrangler queues create your-app-embeddings-queue

  2. Deploy the worker:

     wrangler deploy

The single worker handles both HTTP requests and queue consumption.
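Structurally, that single worker is a module worker exporting both a fetch handler and a queue handler. A minimal sketch with simplified stand-in types (the generated src/index.ts wires in the real routes and embedding pipeline):

```typescript
// Simplified stand-ins for the Cloudflare runtime's message/batch types.
interface MessageLike<T> { body: T; ack(): void; retry(): void }
interface BatchLike<T> { messages: MessageLike<T>[] }

const worker = {
  // HTTP entry point: in the generated worker this dispatches to the API routes.
  async fetch(_request: unknown, _env: unknown): Promise<string> {
    return "ok";
  },
  // Queue entry point: in the generated worker each message runs the
  // embed -> D1 update -> Vectorize upsert pipeline before acking.
  async queue(batch: BatchLike<{ id: string }>, _env: unknown): Promise<void> {
    for (const message of batch.messages) {
      message.ack();
    }
  },
};

export default worker;
```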

Monitoring

Queue metrics are available in the Cloudflare dashboard:

  • Messages enqueued
  • Messages processed
  • Retry count
  • Consumer lag

Check queue health:

wrangler queues list
wrangler queues consumer your-app-embeddings-queue

Error Handling

Failed jobs are automatically retried (up to max_retries):

// In queue consumer
try {
  const embedding = await env.AI.run(job.model, { text: job.content });
  // ...
  message.ack();
} catch (error) {
  console.error('[Queue] Embedding job failed:', error);
  message.retry();  // Will retry up to 3 times
}

After max retries, the message is dead-lettered (if configured) or dropped.
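To preserve failed jobs instead of dropping them, a dead-letter queue can be added to the consumer binding (queue names below are examples; dead_letter_queue is the wrangler consumer option):

```toml
[[queues.consumers]]
queue = "your-app-embeddings-queue"
max_retries = 3
dead_letter_queue = "your-app-embeddings-dlq"
```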

Similarity Search Service

For applications that need typed similarity search with classification, you can define embedding search configurations using defineEmbedding. This generates a service layer with typed search functions.

Defining Search Configurations

Create a file in services/embeddings/:

// services/embeddings/claim-similarity.ts
import { defineEmbedding } from '@quickback/compiler';

export default defineEmbedding({
  name: 'claim-similarity',
  description: 'Find similar claims with classification',

  // Source configuration
  source: 'claims',              // Table name
  vectorIndex: 'VECTORIZE',      // Binding name
  model: '@cf/baai/bge-base-en-v1.5',

  // Search configuration
  search: {
    threshold: 0.60,             // Minimum similarity (default: 0.60)
    limit: 10,                   // Max results (default: 10)
    classify: {
      DUPLICATE: 0.90,           // Score >= 0.90 = DUPLICATE
      CONFIRMS: 0.85,            // Score >= 0.85 = CONFIRMS
      RELATED: 0.75,             // Score >= 0.75 = RELATED
    },
    filters: ['storyId', 'organizationId'],  // Filterable fields
  },

  // Generation triggers (beyond CRUD)
  triggers: {
    onQueueMessage: 'embed_claim',  // Listen for queue messages
  },
});

Generated Service Layer

After compilation, a createEmbeddings() helper is generated in src/lib/embeddings.ts:

import { createEmbeddings } from '../lib/embeddings';

export const execute: ActionExecutor = async ({ ctx, input }) => {
  const embeddings = createEmbeddings(ctx.env);

  // Search with automatic classification
  const similar = await embeddings.claimSimilarity.search(
    'Police arrested three suspects in downtown robbery',
    {
      storyId: 'story_123',
      limit: 5,
      threshold: 0.70,
    }
  );

  // Returns: [{ id, score: 0.87, classification: 'CONFIRMS', metadata }]
  for (const match of similar) {
    console.log(`${match.classification}: ${match.id} (${match.score})`);
  }

  return { similar };
};

Classification Thresholds

Results are automatically classified based on similarity score:

Classification  Default Threshold  Meaning
DUPLICATE       >= 0.90            Near-identical content
CONFIRMS        >= 0.85            Strongly supports the same claim
RELATED         >= 0.75            Topically related
NEW             < 0.75             No significant match
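The threshold mapping above amounts to a strictest-first scan. A sketch with the default values (the generated service's internals may differ):

```typescript
type Classification = "DUPLICATE" | "CONFIRMS" | "RELATED" | "NEW";

// Check thresholds from strictest to loosest; the first match wins,
// and anything below RELATED falls through to NEW.
function classify(
  score: number,
  thresholds = { DUPLICATE: 0.90, CONFIRMS: 0.85, RELATED: 0.75 }
): Classification {
  if (score >= thresholds.DUPLICATE) return "DUPLICATE";
  if (score >= thresholds.CONFIRMS) return "CONFIRMS";
  if (score >= thresholds.RELATED) return "RELATED";
  return "NEW";
}

console.log(classify(0.87)); // "CONFIRMS"
console.log(classify(0.50)); // "NEW"
```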

Customize thresholds per use case:

search: {
  classify: {
    DUPLICATE: 0.95,  // Stricter duplicate detection
    CONFIRMS: 0.88,
    RELATED: 0.70,    // Broader "related" category
  },
}

Gray Zone Detection

For cases where automatic classification isn't sufficient, use gray zone detection to get matches that need semantic evaluation:

const results = await embeddings.claimSimilarity.findWithGrayZone(
  'Some claim text',
  { min: 0.60, max: 0.85 }
);

// Returns structured results:
// {
//   high_confidence: [...],  // Score >= 0.85 (auto-classified)
//   gray_zone: [...]         // 0.60 <= score < 0.85 (needs review)
// }

// Process high confidence matches automatically
for (const match of results.high_confidence) {
  await markAsDuplicate(match.id);
}

// Queue gray zone for semantic review
for (const match of results.gray_zone) {
  await queueForReview(match.id, match.score);
}

Generate Embeddings Directly

Generate embeddings without searching:

const embeddings = createEmbeddings(ctx.env);

// Get raw embedding vector
const vector = await embeddings.claimSimilarity.embed(
  'Text to embed'
);
// Returns: number[] (768 dimensions for bge-base)

Multiple Search Configurations

Define different configurations for different use cases:

// services/embeddings/story-similarity.ts
export default defineEmbedding({
  name: 'story-similarity',
  source: 'stories',
  search: {
    threshold: 0.65,
    limit: 20,
    classify: {
      DUPLICATE: 0.92,
      CONFIRMS: 0.80,
      RELATED: 0.65,
    },
    filters: ['streamId'],
  },
});

// services/embeddings/material-similarity.ts
export default defineEmbedding({
  name: 'material-similarity',
  source: 'materials',
  search: {
    threshold: 0.70,
    limit: 5,
    classify: {
      DUPLICATE: 0.95,
      CONFIRMS: 0.90,
      RELATED: 0.80,
    },
    filters: ['sourceType', 'organizationId'],
  },
});

Usage:

const embeddings = createEmbeddings(ctx.env);

// Different search behaviors for different content types
const similarClaims = await embeddings.claimSimilarity.search(text, opts);
const similarStories = await embeddings.storySimilarity.search(text, opts);
const similarMaterials = await embeddings.materialSimilarity.search(text, opts);

Table-Level vs Service-Level

Feature                    Table-level (defineTable)  Service-level (defineEmbedding)
Auto-embed on INSERT       Yes                        No
Auto-embed on UPDATE       Yes                        No
Custom search functions    No                         Yes
Classification thresholds  No                         Yes
Gray zone detection        No                         Yes
Filterable searches        No                         Yes
Queue message triggers     No                         Yes

Use both together:

  • defineTable with embeddings config for automatic embedding generation
  • defineEmbedding for typed search functions with classification