Quickback Docs

Automatic Embeddings

Quickback can automatically generate embeddings for your data using Cloudflare Queues and Workers AI. When configured, INSERT and UPDATE operations automatically enqueue embedding jobs that are processed asynchronously.

Enabling Embeddings

Add an embeddings configuration to your resource definition:

// claims/resource.ts
import { defineResource } from "@quickback/core";
import { claims } from "./schema";

export default defineResource(claims, {
  firewall: { organization: {} },

  embeddings: {
    fields: ['content'],                // Fields to concatenate and embed
    model: '@cf/baai/bge-base-en-v1.5', // Embedding model (optional)
    onInsert: true,                     // Auto-embed on create (default: true)
    onUpdate: ['content'],              // Re-embed when these fields change
    embeddingColumn: 'embedding',       // Column to store embedding
    metadata: ['storyId'],              // Metadata for Vectorize index
  },

  crud: {
    create: { access: { roles: ['member'] } },
    update: { access: { roles: ['member'] } },
  },
});

Configuration Options

Option           Type                Default                      Description
fields           string[]            Required                     Fields to concatenate and embed
model            string              '@cf/baai/bge-base-en-v1.5'  Workers AI embedding model
onInsert         boolean             true                         Embed on INSERT operations
onUpdate         boolean | string[]  true                         Embed on UPDATE; array limits to specific fields
embeddingColumn  string              'embedding'                  Column to store the embedding vector
separator        string              ' '                          Separator for joining multiple fields
metadata         string[]            []                           Fields to include in Vectorize metadata

onUpdate Options

// Always re-embed on any update
onUpdate: true

// Never re-embed on update
onUpdate: false

// Only re-embed when specific fields change
onUpdate: ['content', 'title']
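The three onUpdate forms above amount to a small decision rule. A minimal sketch (the helper name is illustrative, not part of the generated code):

```typescript
// Hypothetical sketch of the onUpdate decision: re-embed always when
// `true`, never when `false`, and only when a watched field changed
// when an array of field names is given.
type OnUpdate = boolean | string[];

function shouldReembed(onUpdate: OnUpdate, changedFields: string[]): boolean {
  if (onUpdate === true) return true;   // any update triggers re-embedding
  if (onUpdate === false) return false; // updates never trigger re-embedding
  // Array form: only re-embed if one of the listed fields changed
  return onUpdate.some((field) => changedFields.includes(field));
}

console.log(shouldReembed(['content', 'title'], ['title'])); // true
console.log(shouldReembed(['content'], ['status']));         // false
```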

How It Works

┌─────────────────────────────────────────────────────────────────────┐
│                     Main API Worker                                 │
│                                                                     │
│  ┌─────────────────────┐    ┌────────────────────┐                 │
│  │ POST /claims        │    │ Queue Consumer     │                 │
│  │                     │    │                    │                 │
│  │ 1. Auth middleware  │    │ 1. Workers AI      │                 │
│  │ 2. Firewall         │    │    embed()         │                 │
│  │ 3. Guards           │    │ 2. D1 update()     │                 │
│  │ 4. Insert to D1     │    │ 3. Vectorize       │                 │
│  │ 5. Enqueue job ─────┼───▶│    upsert()        │                 │
│  └─────────────────────┘    └────────────────────┘                 │
│                                      ▲                              │
│                    ┌─────────────────┴──────────────────┐          │
│                    │      EMBEDDINGS_QUEUE              │          │
│                    └────────────────────────────────────┘          │
└─────────────────────────────────────────────────────────────────────┘
  1. API Request - POST/PATCH arrives and passes through auth, firewall, guards
  2. Database Insert - Record is created/updated in D1
  3. Enqueue Job - Embedding job is sent to Cloudflare Queue
  4. Queue Consumer - Processes job asynchronously:
    • Calls Workers AI to generate embedding
    • Updates D1 with embedding vector
    • Optionally upserts to Vectorize index
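The job that travels through the queue in step 3 can be pictured roughly as follows. The field names here are illustrative; the actual message shape is internal to the generated worker:

```typescript
// Illustrative shape of an embedding job message.
interface EmbeddingJob {
  table: string;                     // source table, e.g. "claims"
  id: string;                        // record ID to update with the vector
  content: string;                   // concatenated field values to embed
  model: string;                     // Workers AI model to use
  metadata: Record<string, string>;  // fields mirrored into Vectorize metadata
}

const job: EmbeddingJob = {
  table: "claims",
  id: "clm_123",
  content: "Police arrested three suspects in downtown robbery",
  model: "@cf/baai/bge-base-en-v1.5",
  metadata: { storyId: "story_123" },
};
// In the generated worker, an object like this is sent to the queue
// (e.g. via the EMBEDDINGS_QUEUE binding) after the D1 write succeeds.
console.log(job.table);
```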

Security Model

Security is enforced at enqueue time, not consume time:

  • Jobs are only enqueued after passing all security checks (auth, firewall, guards)
  • The queue consumer is an internal process that executes pre-validated jobs
  • If a user can't create a claim, they can't trigger an embedding job

Generic Embeddings API

In addition to automatic embeddings on CRUD operations, Quickback generates a generic embeddings API endpoint that lets you generate embeddings for arbitrary content.

POST /api/v1/embeddings

Generate an embedding for any text content:

curl -X POST https://your-api.workers.dev/api/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Cookie: better-auth.session_token=..." \
  -d '{
    "content": "Text to embed",
    "model": "@cf/baai/bge-base-en-v1.5",
    "table": "claims",
    "id": "clm_123"
  }'

Request Body

Field    Type    Required     Description
content  string  Yes          Text to generate the embedding for
model    string  No           Embedding model (defaults to table config or @cf/baai/bge-base-en-v1.5)
table    string  No           Table to store the embedding back to
id       string  Conditional  Record ID to update (required if table is specified)
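From application code, the same call can be assembled in TypeScript. The helper below only builds the request per the fields documented above; the base URL is a placeholder for your deployed worker:

```typescript
// Assemble a request to the generic embeddings endpoint.
// Enforces the documented rule: `id` is required when `table` is given.
function buildEmbeddingRequest(
  baseUrl: string,
  body: { content: string; model?: string; table?: string; id?: string }
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  if (body.table && !body.id) {
    throw new Error("id is required when table is specified");
  }
  return {
    url: `${baseUrl}/api/v1/embeddings`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const req = buildEmbeddingRequest("https://your-api.workers.dev", {
  content: "Text to embed",
  table: "claims",
  id: "clm_123",
});
// fetch(req.url, req.init) (with your session cookie attached)
// would then return the queued-job response shown below.
```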

Response

{
  "queued": true,
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "table": "claims",
  "id": "clm_123",
  "model": "@cf/baai/bge-base-en-v1.5"
}

GET /api/v1/embeddings/tables

List tables that have embeddings configured:

curl https://your-api.workers.dev/api/v1/embeddings/tables \
  -H "Cookie: better-auth.session_token=..."

Response:

{
  "tables": [
    {
      "name": "claims",
      "embeddingColumn": "embedding",
      "model": "@cf/baai/bge-base-en-v1.5"
    }
  ]
}

Authentication & Authorization

The generic embeddings API requires authentication and uses the activeOrgId from the user's context to enforce organization-level isolation. Embedding jobs are scoped to the user's current organization.

Use Cases

The generic embeddings API is useful for:

  • Batch embedding: Embed content without going through CRUD routes
  • Re-embedding: Force re-generation of embeddings for existing records
  • Preview embeddings: Test embedding generation before persisting
  • External content: Embed content that doesn't fit your defined schemas

Generated Files

When embeddings are configured, the compiler generates:

File                      Purpose
src/queue-consumer.ts     Queue consumer handler for processing embedding jobs
src/routes/embeddings.ts  Generic embeddings API routes
wrangler.toml             Queue producer/consumer bindings, AI binding
src/env.d.ts              EMBEDDINGS_QUEUE and AI binding types
src/index.ts              Exports queue handler, mounts /api/v1/embeddings

wrangler.toml additions

# Embeddings Queue
[[queues.producers]]
queue = "your-app-embeddings-queue"
binding = "EMBEDDINGS_QUEUE"

[[queues.consumers]]
queue = "your-app-embeddings-queue"
max_batch_size = 10
max_batch_timeout = 30
max_retries = 3

# Workers AI
[ai]
binding = "AI"

Multiple Fields

Embed multiple fields by concatenating them:

embeddings: {
  fields: ['title', 'content', 'summary'],  // Joined with spaces by default
  // ...
}

Generated embedding text: "${title} ${content} ${summary}"

Custom Separator

Use separator to control how fields are joined. This is useful when the embedding model benefits from clear sentence boundaries:

embeddings: {
  fields: ['title', 'summary'],
  separator: '. ',  // Join with period + space
  // ...
}

Generated code: [result[0].title, result[0].summary].filter(Boolean).join('. ')

The filter(Boolean) ensures null or empty fields are excluded cleanly — no trailing separator when a field is absent.
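A quick check of that join behavior, in plain TypeScript independent of the generated code:

```typescript
// filter(Boolean) drops null/undefined/empty-string fields before joining,
// so a missing summary produces no trailing ". " separator.
const row: { title: string; summary: string | null } = {
  title: "Quarterly results",
  summary: null,
};

const text = [row.title, row.summary].filter(Boolean).join(". ");
console.log(text); // "Quarterly results"
```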

Vectorize Integration

If you have a Vectorize index configured, embeddings are automatically upserted:

// quickback.config.ts
providers: {
  database: {
    config: {
      vectorizeIndexName: 'claims-embeddings',  // Your Vectorize index
      vectorizeBinding: 'VECTORIZE',
    },
  },
},

The queue consumer will:

  1. Generate the embedding via Workers AI
  2. Store the vector in D1 (JSON string)
  3. Upsert to Vectorize with metadata
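The record upserted in step 3 follows Vectorize's vector shape (id, values, optional metadata). A minimal sketch of how the consumer might assemble it (helper name is illustrative):

```typescript
// Vectorize vectors carry an id, the float values, and optional metadata.
interface VectorizeVector {
  id: string;
  values: number[];
  metadata?: Record<string, string>;
}

// Build the upsert payload from the record ID, the generated embedding,
// and the metadata fields selected in the embeddings config.
function toVectorizeVector(
  id: string,
  embedding: number[],
  metadata: Record<string, string>
): VectorizeVector {
  return { id, values: embedding, metadata };
}

const vec = toVectorizeVector("clm_123", [0.12, -0.04, 0.33], { storyId: "story_123" });
// env.VECTORIZE.upsert([vec]) would then store it for filtered queries.
console.log(vec.id);
```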

Vectorize Metadata

Include fields in Vectorize metadata for filtering:

embeddings: {
  fields: ['content'],
  metadata: ['storyId', 'organizationId', 'claimType'],
}

Enables queries like:

const results = await env.VECTORIZE.query(vector, {
  topK: 10,
  filter: { storyId: 'story_123' }
});

Schema Requirements

Your schema must include the embedding column:

// claims/schema.ts
import { sqliteTable, text } from "drizzle-orm/sqlite-core";

export const claims = sqliteTable("claims", {
  id: text("id").primaryKey(),
  content: text("content").notNull(),
  storyId: text("story_id"),
  organizationId: text("organization_id").notNull(),

  // Embedding column - stores JSON array of floats
  embedding: text("embedding"),
});
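Because D1 has no native vector type, the embedding is stored as a JSON string in that text column; reading it back is a JSON.parse away. A plain TypeScript illustration:

```typescript
// Serialize an embedding for the text column, then parse it on read.
const embedding: number[] = [0.12, -0.04, 0.33];

const stored: string = JSON.stringify(embedding); // what lands in the D1 column
const restored: number[] = JSON.parse(stored);    // what a SELECT gives back

console.log(restored.length); // 3
```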

Supported Models

Any Workers AI embedding model can be used:

Model                       Dimensions  Notes
@cf/baai/bge-base-en-v1.5   768         Default; good general-purpose choice
@cf/baai/bge-small-en-v1.5  384         Faster, smaller
@cf/baai/bge-large-en-v1.5  1024        Higher quality

See Workers AI Models for the full list.

Deployment

After compiling with embeddings:

  1. Create the queue:

     wrangler queues create your-app-embeddings-queue

  2. Deploy the worker:

     wrangler deploy

The single worker handles both HTTP requests and queue consumption.
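Structurally, that single worker is a module worker exporting both a fetch handler and a queue handler. A minimal sketch with simplified stand-in types (the generated src/index.ts wires in the real routes and embedding pipeline):

```typescript
// Simplified stand-ins for the Cloudflare runtime's message/batch types.
interface MessageLike<T> { body: T; ack(): void; retry(): void }
interface BatchLike<T> { messages: MessageLike<T>[] }

const worker = {
  // HTTP entry point: in the generated worker this dispatches to the API routes.
  async fetch(_request: unknown, _env: unknown): Promise<string> {
    return "ok";
  },
  // Queue entry point: in the generated worker each message runs the
  // embed -> D1 update -> Vectorize upsert pipeline before acking.
  async queue(batch: BatchLike<{ id: string }>, _env: unknown): Promise<void> {
    for (const message of batch.messages) {
      message.ack();
    }
  },
};

export default worker;
```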

Monitoring

Queue metrics are available in the Cloudflare dashboard:

  • Messages enqueued
  • Messages processed
  • Retry count
  • Consumer lag

Check queue health:

wrangler queues list
wrangler queues consumer your-app-embeddings-queue

Error Handling

Failed jobs are automatically retried (up to max_retries):

// In queue consumer
try {
  const embedding = await env.AI.run(job.model, { text: job.content });
  // ...
  message.ack();
} catch (error) {
  console.error('[Queue] Embedding job failed:', error);
  message.retry();  // Will retry up to 3 times
}

After max retries, the message is dead-lettered (if configured) or dropped.
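To preserve failed jobs instead of dropping them, a dead-letter queue can be added to the consumer binding (queue names below are examples; dead_letter_queue is the wrangler consumer option):

```toml
[[queues.consumers]]
queue = "your-app-embeddings-queue"
max_retries = 3
dead_letter_queue = "your-app-embeddings-dlq"
```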

Similarity Search Service

For applications that need typed similarity search with classification, you can define embedding search configurations using defineEmbedding. This generates a service layer with typed search functions.

Defining Search Configurations

Create a file in services/embeddings/:

// services/embeddings/claim-similarity.ts
import { defineEmbedding } from '@quickback/compiler';

export default defineEmbedding({
  name: 'claim-similarity',
  description: 'Find similar claims with classification',

  // Source configuration
  source: 'claims',              // Table name
  vectorIndex: 'VECTORIZE',      // Binding name
  model: '@cf/baai/bge-base-en-v1.5',

  // Search configuration
  search: {
    threshold: 0.60,             // Minimum similarity (default: 0.60)
    limit: 10,                   // Max results (default: 10)
    classify: {
      DUPLICATE: 0.90,           // Score >= 0.90 = DUPLICATE
      CONFIRMS: 0.85,            // Score >= 0.85 = CONFIRMS
      RELATED: 0.75,             // Score >= 0.75 = RELATED
    },
    filters: ['storyId', 'organizationId'],  // Filterable fields
  },

  // Generation triggers (beyond CRUD)
  triggers: {
    onQueueMessage: 'embed_claim',  // Listen for queue messages
  },
});

Generated Service Layer

After compilation, a createEmbeddings() helper is generated in src/lib/embeddings.ts:

import { createEmbeddings } from '../lib/embeddings';

export const execute: ActionExecutor = async ({ ctx, input }) => {
  const embeddings = createEmbeddings(ctx.env);

  // Search with automatic classification
  const similar = await embeddings.claimSimilarity.search(
    'Police arrested three suspects in downtown robbery',
    {
      storyId: 'story_123',
      limit: 5,
      threshold: 0.70,
    }
  );

  // Returns: [{ id, score: 0.87, classification: 'CONFIRMS', metadata }]
  for (const match of similar) {
    console.log(`${match.classification}: ${match.id} (${match.score})`);
  }

  return { similar };
};

Classification Thresholds

Results are automatically classified based on similarity score:

Classification  Default Threshold  Meaning
DUPLICATE       >= 0.90            Near-identical content
CONFIRMS        >= 0.85            Strongly supports the same claim
RELATED         >= 0.75            Topically related
NEW             < 0.75             No significant match
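The threshold mapping above amounts to a strictest-first scan. A sketch with the default values (the generated service's internals may differ):

```typescript
type Classification = "DUPLICATE" | "CONFIRMS" | "RELATED" | "NEW";

// Check thresholds from strictest to loosest; the first match wins,
// and anything below RELATED falls through to NEW.
function classify(
  score: number,
  thresholds = { DUPLICATE: 0.90, CONFIRMS: 0.85, RELATED: 0.75 }
): Classification {
  if (score >= thresholds.DUPLICATE) return "DUPLICATE";
  if (score >= thresholds.CONFIRMS) return "CONFIRMS";
  if (score >= thresholds.RELATED) return "RELATED";
  return "NEW";
}

console.log(classify(0.87)); // "CONFIRMS"
console.log(classify(0.50)); // "NEW"
```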

Customize thresholds per use case:

search: {
  classify: {
    DUPLICATE: 0.95,  // Stricter duplicate detection
    CONFIRMS: 0.88,
    RELATED: 0.70,    // Broader "related" category
  },
}

Gray Zone Detection

For cases where automatic classification isn't sufficient, use gray zone detection to get matches that need semantic evaluation:

const results = await embeddings.claimSimilarity.findWithGrayZone(
  'Some claim text',
  { min: 0.60, max: 0.85 }
);

// Returns structured results:
// {
//   high_confidence: [...],  // Score >= 0.85 (auto-classified)
//   gray_zone: [...]         // 0.60 <= score < 0.85 (needs review)
// }

// Process high confidence matches automatically
for (const match of results.high_confidence) {
  await markAsDuplicate(match.id);
}

// Queue gray zone for semantic review
for (const match of results.gray_zone) {
  await queueForReview(match.id, match.score);
}

Generate Embeddings Directly

Generate embeddings without searching:

const embeddings = createEmbeddings(ctx.env);

// Get raw embedding vector
const vector = await embeddings.claimSimilarity.embed(
  'Text to embed'
);
// Returns: number[] (768 dimensions for bge-base)

Multiple Search Configurations

Define different configurations for different use cases:

// services/embeddings/story-similarity.ts
export default defineEmbedding({
  name: 'story-similarity',
  source: 'stories',
  search: {
    threshold: 0.65,
    limit: 20,
    classify: {
      DUPLICATE: 0.92,
      CONFIRMS: 0.80,
      RELATED: 0.65,
    },
    filters: ['streamId'],
  },
});

// services/embeddings/material-similarity.ts
export default defineEmbedding({
  name: 'material-similarity',
  source: 'materials',
  search: {
    threshold: 0.70,
    limit: 5,
    classify: {
      DUPLICATE: 0.95,
      CONFIRMS: 0.90,
      RELATED: 0.80,
    },
    filters: ['sourceType', 'organizationId'],
  },
});

Usage:

const embeddings = createEmbeddings(ctx.env);

// Different search behaviors for different content types
const similarClaims = await embeddings.claimSimilarity.search(text, opts);
const similarStories = await embeddings.storySimilarity.search(text, opts);
const similarMaterials = await embeddings.materialSimilarity.search(text, opts);

Table-Level vs Service-Level

Feature                    Table-level (defineTable)  Service-level (defineEmbedding)
Auto-embed on INSERT       Yes                        No
Auto-embed on UPDATE       Yes                        No
Custom search functions    No                         Yes
Classification thresholds  No                         Yes
Gray zone detection        No                         Yes
Filterable searches        No                         Yes
Queue message triggers     No                         Yes

Use both together:

  • defineTable with embeddings config for automatic embedding generation
  • defineEmbedding for typed search functions with classification