HarperFast · heskew · Jun 11, 2026 · Jun 10, 2026
@@ -48,6 +48,14 @@ Prefer plain ASCII characters in Markdown unless a typographic character is genu
 - The `<VersionBadge>` component is globally registered — no import needed in `.md`/`.mdx` files.
 - See the complete repository organization in `CONTRIBUTING.md`
 
+## Versioning Content
+
+- Tag minor-version availability inline: `<VersionBadge version="vX.Y.0" />` for new surface, `<VersionBadge type="changed" version="vX.Y.0" />` for behavior changes to existing surface.
+- Derive the version from the core release the change ships in, stripping prerelease suffixes (`5.1.0-beta.1` → `v5.1.0`).
+- Each minor release gets a file under `release-notes/<major-codename>/` (e.g. `release-notes/v5-lincoln/5.1.md`); the sidebar picks it up automatically.
+- Absolute links from `release-notes/` (or `learn/`) into current reference docs use the versioned path `/reference/v5/...` — the reference plugin maps the current version to the `v5` URL path.
+- When documenting a change from a core/pro PR, cross-link the feature PR and the docs PR in both descriptions.
+
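The prerelease-stripping rule above can be expressed as a small helper. This is only a sketch: `toBadgeVersion` is a hypothetical name, not part of the docs tooling, and it assumes release strings follow `MAJOR.MINOR.PATCH` with an optional prerelease or build suffix.

```javascript
// Hypothetical helper (not part of the docs tooling): turns a core release
// string into the value used in <VersionBadge version="..." />.
function toBadgeVersion(release) {
	// Keep MAJOR.MINOR.PATCH and drop any "-beta.1" or "+build" suffix.
	const match = /^v?(\d+\.\d+\.\d+)(?:[-+].*)?$/.exec(release.trim());
	if (!match) throw new Error(`Unrecognized release version: ${release}`);
	return `v${match[1]}`;
}

console.log(toBadgeVersion('5.1.0-beta.1')); // v5.1.0
```
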
 ## Testing
 
 - There is no automated test suite. Verification is done by running the dev server or build.

@@ -227,6 +227,34 @@ If the field value is an array, each element in the array is individually indexe
 
 Null values are indexed by default (added in v4.3.0), enabling queries like `GET /Product/?category=null`.
 
+### `@embed`
+
+<VersionBadge version="v5.1.0" />
+
+Automatically computes an embedding vector for the attribute whenever the source field is written, using a configured [embedding model](../models/overview):
+
+```graphql
+type Document @table {
+	id: Long @primaryKey
+	text: String
+	embedding: [Float] @embed(source: "text", model: "default")
+}
+```
+
+- `source` — the name of the field to embed. Must be a declared field on the same type, passed as a string literal.
+- `model` — the logical name of a configured embedding model, passed as a string literal.
+
+The attribute type must be `[Float]`. The attribute is automatically indexed with an [HNSW vector index](#vector-indexing), so it is immediately searchable by similarity. An explicit `@indexed` on the same attribute is allowed only if its `type` is also `"HNSW"`.
+
+Write semantics:
+
+- Creating a record with the source field, or updating the source field, computes the vector (using `inputType: 'document'`) before the write commits. If the embedding cannot be computed, the write fails.
+- An update that does not touch the source field leaves the vector unchanged.
+- Setting the source field to `null` sets the vector to `null`.
+- Replicated writes and audit-log replays do not re-embed — the vector travels with the record, and only the node that accepted the original write calls the model.
+
+Multiple `@embed` attributes on one type are computed concurrently.
+
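The write semantics above can be modeled in a few lines of plain JavaScript. This is an illustrative sketch of the rules, not Harper's implementation: `applyEmbedOnWrite`, `embedFn`, and the `source`/`target` option names are all hypothetical, and `embedFn` stands in for whatever embedding model is configured.

```javascript
// Illustrative model of the write rules above; not Harper's implementation.
// `embedFn` is a hypothetical stand-in for the configured embedding model.
async function applyEmbedOnWrite(existing, update, { source, target, embedFn }) {
	const next = { ...existing, ...update };
	// Source field untouched: the stored vector is left as-is.
	if (!(source in update)) return next;
	// Source set to null: the vector is cleared too.
	if (update[source] === null) {
		next[target] = null;
		return next;
	}
	// Source created or changed: compute the vector before returning ("committing").
	// If embedFn rejects, the rejection propagates and the write fails.
	next[target] = await embedFn(update[source], { inputType: 'document' });
	return next;
}
```

Calling it with `existing` set to `undefined` models a create; passing a previously returned record models an update.
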
 ### `@createdTime`
 
 Automatically assigns a creation timestamp (Unix epoch milliseconds) to the attribute when a record is created.
@@ -393,6 +421,8 @@ type Document @table {
 }
 ```
 
+Embedding vectors can also be computed automatically at write time from a text field with the [`@embed` directive](#embed), which creates the HNSW index implicitly.
+
 Query by nearest neighbors using the `sort` parameter:
 
 ```javascript
@@ -443,26 +473,62 @@ let results = Document.search({
 
 `$distance` is available in both `sort`-based ranking and `conditions`-based threshold queries.
 
+### Per-Query Search Options
+
+The `sort` descriptor (and threshold condition) accepts options that tune an individual query:
+
+```javascript
+let results = Document.search({
+	sort: { attribute: 'textEmbeddings', target: searchVector, distance: 'dotProduct', ef: 200 },
+	limit: 5,
+});
+```
+
+- `distance` — overrides the index's distance function for this query: `"cosine"`, `"euclidean"`, or `"dotProduct"` (`dotProduct` <VersionBadge version="v5.1.0" />).
+- `ef` <VersionBadge version="v5.1.0" /> — overrides the search exploration budget for this query. Higher values improve recall at the cost of latency.
+
+<VersionBadge type="changed" version="v5.1.0" /> — When a query passes no `ef` and the index does not explicitly configure `efConstructionSearch` (or `efConstruction`), the search budget auto-scales with the size of the index, so recall holds as the table grows instead of decaying with a fixed budget.
+
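To see why overriding `distance` per query can change which record ranks first, the sketch below compares the three functions on two made-up vectors. These are textbook definitions written so that a smaller value means "closer"; negating the dot product is an assumption that mirrors the "negative cosine similarity" convention, and Harper's exact formulas may differ. The `rank` helper and the sample vectors are hypothetical.

```javascript
// Textbook definitions, written so that a smaller value means "closer".
// Negating the dot product is an assumption that mirrors the cosine convention.
const dot = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);
const norm = (a) => Math.sqrt(dot(a, a));

const distances = {
	cosine: (a, b) => -dot(a, b) / (norm(a) * norm(b)),
	euclidean: (a, b) => Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0)),
	dotProduct: (a, b) => -dot(a, b),
};

// Order candidate vectors by distance to the target, nearest first.
function rank(candidates, target, distance) {
	const d = distances[distance];
	return [...candidates].sort((x, y) => d(x.v, target) - d(y.v, target));
}

const target = [1, 0];
const candidates = [
	{ id: 'short-aligned', v: [0.9, 0] },
	{ id: 'long-diagonal', v: [5, 5] },
];
console.log(rank(candidates, target, 'cosine')[0].id); // short-aligned
console.log(rank(candidates, target, 'dotProduct')[0].id); // long-diagonal
```

Cosine ignores magnitude and favors the vector pointing the same way as the target, while the dot product also rewards length, so the two functions disagree on the nearest neighbor here.
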
 ### HNSW Parameters
 
-| Parameter              | Default           | Description                                                                                         |
-| ---------------------- | ----------------- | --------------------------------------------------------------------------------------------------- |
-| `distance`             | `"cosine"`        | Distance function: `"euclidean"` or `"cosine"` (negative cosine similarity)                         |
-| `efConstruction`       | `100`             | Max nodes explored during index construction. Higher = better recall, lower = better performance    |
-| `M`                    | `16`              | Preferred connections per graph layer. Higher = more space, better recall for high-dimensional data |
-| `optimizeRouting`      | `0.5`             | Heuristic aggressiveness for omitting redundant connections (0 = off, 1 = most aggressive)          |
-| `mL`                   | computed from `M` | Normalization factor for level generation                                                           |
-| `efSearchConstruction` | `50`              | Max nodes explored during search                                                                    |
+| Parameter              | Default           | Description                                                                                                                                              |
+| ---------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `distance`             | `"cosine"`        | Distance function: `"cosine"` (negative cosine similarity), `"euclidean"`, or `"dotProduct"` (added in v5.1.0)                                           |
+| `efConstruction`       | `100`             | Max nodes explored during index construction. Higher = better recall, lower = better performance                                                         |
+| `M`                    | `16`              | Preferred connections per graph layer. Higher = more space, better recall for high-dimensional data                                                      |
+| `optimizeRouting`      | `0.5`             | Heuristic aggressiveness for omitting redundant connections (0 = off, 1 = most aggressive)                                                               |
+| `mL`                   | computed from `M` | Normalization factor for level generation                                                                                                                |
+| `efConstructionSearch` | auto-scaled       | Max nodes explored during search. When unset, auto-scales with index size (see above); setting it (or `efConstruction`, which seeds it) fixes the budget |
+| `quantization`         | —                 | `"int8"` stores vectors quantized to int8 (added in v5.1.0, see below)                                                                                   |
 
 Example with custom parameters:
 
 ```graphql
 type Document @table {
 	id: Long @primaryKey
-	textEmbeddings: [Float] @indexed(type: "HNSW", distance: "euclidean", optimizeRouting: 0, efSearchConstruction: 100)
+	textEmbeddings: [Float] @indexed(type: "HNSW", distance: "euclidean", optimizeRouting: 0, efConstructionSearch: 100)
+}
+```
+
+Note: this parameter was previously documented as `efSearchConstruction`; the option name Harper reads is `efConstructionSearch`.
+
+<VersionBadge type="changed" version="v5.1.0" /> — Changing `efConstructionSearch` on an existing index no longer triggers a rebuild; it only affects searches. Structural parameters (`distance`, `M`, `efConstruction`, `quantization`) still rebuild the index when changed.
+
+### Vector Quantization
+
+<VersionBadge version="v5.1.0" />
+
+`quantization: "int8"` stores the index's vectors quantized to 8-bit integers, substantially reducing index size and memory traffic:
+
+```graphql
+type Document @table {
+	id: Long @primaryKey
+	textEmbeddings: [Float] @indexed(type: "HNSW", quantization: "int8")
 }
 ```
 
+Graph navigation runs on the quantized (approximate) distances. For nearest-neighbor `sort` queries, Harper re-ranks the results against the full-precision vectors stored on the records, restoring exact ordering and exact `$distance` values. Distance-threshold (`lt`/`le`) queries currently filter on the approximate distance, so a record whose true distance lies very close to the threshold may be included or excluded incorrectly.
+
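The approximate-then-exact flow described above can be illustrated with a generic scalar quantizer. This sketch assumes a simple symmetric int8 scheme (one scale per vector) and a brute-force scan in place of HNSW graph navigation; Harper's actual quantization and search are not shown here and may differ. All function names and the `oversample` parameter are hypothetical.

```javascript
// Generic symmetric int8 quantization; Harper's actual scheme may differ.
function quantizeInt8(vector) {
	const maxAbs = Math.max(...vector.map(Math.abs), Number.EPSILON);
	const scale = maxAbs / 127; // size of one int8 step, in original units
	return { scale, values: Int8Array.from(vector, (v) => Math.round(v / scale)) };
}

const dequantize = ({ scale, values }) => Array.from(values, (q) => q * scale);

const euclidean = (a, b) => Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));

// Pick candidates on approximate distances, then re-rank them on the
// full-precision vectors so ordering and $distance are exact.
function nearest(records, target, limit, oversample = 2) {
	return records
		.map((record) => ({ record, approx: euclidean(dequantize(record.quantized), target) }))
		.sort((a, b) => a.approx - b.approx)
		.slice(0, limit * oversample)
		.map(({ record }) => ({ id: record.id, $distance: euclidean(record.vector, target) }))
		.sort((a, b) => a.$distance - b.$distance)
		.slice(0, limit);
}
```

Each record is expected to carry both `vector` (full precision) and `quantized` (the output of `quantizeInt8`), which mirrors the idea that the index holds the compact form while the record keeps the original.
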
 ## Field Types
 
 Harper supports the following field types:

@@ -0,0 +1,50 @@
+---
+id: analytics
+title: Analytics
+---
+
+<!-- Source: harper resources/models/analyticsTable.ts, resources/models/Models.ts (v5.1) -->
+
+<VersionBadge version="v5.1.0" />
+
+Every model call is recorded for observability and usage accounting, at two levels of granularity: a per-call log table for forensics, and aggregate counters in Harper's [general analytics](../analytics/overview) for dashboards and trends.
+
+## Per-call log: `hdb_model_calls`
+
+Each `embed()`, `generate()`, and `generateStream()` call writes one row to the `hdb_model_calls` system table — on success and on failure. With `toolMode: 'auto'`, each backend round inside the loop records its own row (the outer loop itself does not add one).
+
+| Field               | Description                                                                                                     |
+| ------------------- | --------------------------------------------------------------------------------------------------------------- |
+| `tenant`            | Tenant identifier, when the call carried one                                                                    |
+| `app`               | Resource path of the calling resource, when called from one                                                     |
+| `model`             | Logical model name the caller used                                                                              |
+| `backend`           | Backend that served the call (`ollama`, `openai`, …); `unknown` for pre-dispatch failures                       |
+| `method`            | `embed`, `generate`, or `generateStream`                                                                        |
+| `prompt_tokens`     | Prompt token count, when the backend reported usage                                                             |
+| `completion_tokens` | Completion token count, when the backend reported usage                                                         |
+| `embedding_tokens`  | Embedding token count, when the backend reported usage                                                          |
+| `latency_ms`        | Wall-clock call duration                                                                                        |
+| `success`           | Whether the call completed                                                                                      |
+| `error_code`        | On failure: `backend_error`, `aborted`, `capability_unsupported`, `backend_not_found`, or `pending_unsupported` |
+
+Rows are buffered in memory and flushed every 10 seconds, or immediately once 1,000 rows accumulate; rows older than 90 days are purged. Buffered rows may be lost on abrupt shutdown — treat the table as operational telemetry, not an audit log.
+
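The two flush triggers described above (a timer and a row-count threshold) follow a common buffering pattern, sketched below. This is not Harper's code: `CallLogBuffer` and `writeRows` are hypothetical names, and the defaults simply reuse the 10-second and 1,000-row figures from the text.

```javascript
// Illustrative buffer with the two flush triggers described above; not Harper's code.
class CallLogBuffer {
	constructor(writeRows, { intervalMs = 10_000, maxRows = 1_000 } = {}) {
		this.writeRows = writeRows; // hypothetical sink, e.g. a batched table write
		this.maxRows = maxRows;
		this.rows = [];
		this.timer = setInterval(() => this.flush(), intervalMs);
		this.timer.unref?.(); // do not keep the process alive just for telemetry
	}

	record(row) {
		this.rows.push(row);
		// Size trigger: flush immediately once the threshold is reached.
		if (this.rows.length >= this.maxRows) this.flush();
	}

	flush() {
		if (this.rows.length === 0) return;
		const batch = this.rows;
		this.rows = [];
		this.writeRows(batch);
	}

	// Graceful shutdown: stop the timer and write whatever is still buffered.
	stop() {
		clearInterval(this.timer);
		this.flush();
	}
}
```

Anything still in `rows` when the process is killed without `stop()` being called is lost, which is the trade-off the paragraph above warns about.
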
+Query it like any table, for example through the operations API:
+
+```json
+{
+	"operation": "search_by_conditions",
+	"database": "system",
+	"table": "hdb_model_calls",
+	"conditions": [{ "search_attribute": "success", "search_type": "equals", "search_value": false }]
+}
+```
+
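Once rows have been fetched, they can be summarized client-side. The sketch below groups rows by `backend` and computes a failure rate and average latency. `summarizeByBackend` is a hypothetical helper and the sample rows are made up; only the field names come from the table above.

```javascript
// Sample rows are made up for illustration; field names follow the table above.
function summarizeByBackend(rows) {
	const summary = {};
	for (const row of rows) {
		const entry = (summary[row.backend] ??= { calls: 0, failures: 0, totalLatencyMs: 0 });
		entry.calls += 1;
		if (!row.success) entry.failures += 1;
		entry.totalLatencyMs += row.latency_ms;
	}
	// Derive per-backend rates once all rows are counted.
	for (const entry of Object.values(summary)) {
		entry.failureRate = entry.failures / entry.calls;
		entry.avgLatencyMs = entry.totalLatencyMs / entry.calls;
	}
	return summary;
}

const rows = [
	{ backend: 'ollama', method: 'embed', latency_ms: 40, success: true },
	{ backend: 'ollama', method: 'embed', latency_ms: 60, success: false, error_code: 'backend_error' },
	{ backend: 'openai', method: 'generate', latency_ms: 900, success: true },
];
console.log(summarizeByBackend(rows).ollama.failureRate); // 0.5
```
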
+## Aggregate metrics
+
+Each call also increments Harper's aggregate analytics (visible in `hdb_raw_analytics` alongside the other [analytics metrics](../analytics/overview)):
+
+- `model-embed`, `model-generate`, `model-generateStream` — call counts
+- `model-embed-tokens`, `model-generate-tokens`, `model-generateStream-tokens` — token totals
+
+Metrics are broken down by backend name, so usage can be charted per provider.