Merged
8 changes: 8 additions & 0 deletions AGENTS.md
@@ -48,6 +48,14 @@ Prefer plain ASCII characters in Markdown unless a typographic character is genu
- The `<VersionBadge>` component is globally registered — no import needed in `.md`/`.mdx` files.
- See the complete repository organization in `CONTRIBUTING.md`

## Versioning Content

- Tag minor-version availability inline: `<VersionBadge version="vX.Y.0" />` for new surface, `<VersionBadge type="changed" version="vX.Y.0" />` for behavior changes to existing surface.
- Derive the version from the core release the change ships in, stripping prerelease suffixes (`5.1.0-beta.1` → `v5.1.0`).
- Each minor release gets a file under `release-notes/<major-codename>/` (e.g. `release-notes/v5-lincoln/5.1.md`); the sidebar picks it up automatically.
- Absolute links from `release-notes/` (or `learn/`) into current reference docs use the versioned path `/reference/v5/...` — the reference plugin maps the current version to the `v5` URL path.
- When documenting a change from a core/pro PR, cross-link the feature PR and the docs PR in both descriptions.
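
The version-derivation and file-naming rules above can be sketched as follows. This is a minimal illustration, not tooling from the repository: the helper names `toBadgeVersion` and `releaseNotesPath` are hypothetical, and the codename directory is passed in by the caller because it cannot be derived from the version string.

```javascript
// Hypothetical helpers illustrating the rules above; not part of the docs repo.

// Strip any prerelease suffix and add the "v" prefix used by <VersionBadge>.
function toBadgeVersion(coreVersion) {
  const [release] = coreVersion.split('-'); // '5.1.0-beta.1' -> '5.1.0'
  return `v${release}`;
}

// Build the release-notes path for a minor release.
// `majorCodename` (e.g. 'v5-lincoln') must be supplied by the caller.
function releaseNotesPath(coreVersion, majorCodename) {
  const [major, minor] = coreVersion.split('-')[0].split('.');
  return `release-notes/${majorCodename}/${major}.${minor}.md`;
}

console.log(toBadgeVersion('5.1.0-beta.1')); // v5.1.0
console.log(releaseNotesPath('5.1.0-beta.1', 'v5-lincoln')); // release-notes/v5-lincoln/5.1.md
```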

## Testing

- There is no automated test suite. Verification is done by running the dev server or build.
84 changes: 75 additions & 9 deletions reference/database/schema.md
@@ -227,6 +227,34 @@ If the field value is an array, each element in the array is individually indexe

Null values are indexed by default (added in v4.3.0), enabling queries like `GET /Product/?category=null`.

### `@embed`

<VersionBadge version="v5.1.0" />

Automatically computes an embedding vector for the attribute whenever the source field is written, using a configured [embedding model](../models/overview):

```graphql
type Document @table {
  id: Long @primaryKey
  text: String
  embedding: [Float] @embed(source: "text", model: "default")
}
```

Review comment (Contributor): Is there an equivalent that automatically embeds on the search side of the equation, for example when calling `Document.search()`? I'm not seeing one in the docs, but maybe I missed it.

Reply (Member Author): Sorry Dawson, I missed this before merging. No, there isn't, but the need is now tracked at HarperFast/harper#1277. Great callout.

- `source` — the name of the field to embed. Must be a declared field on the same type, passed as a string literal.
- `model` — the logical name of a configured embedding model, passed as a string literal.

The attribute type must be `[Float]`. The attribute is automatically indexed with an [HNSW vector index](#vector-indexing), so it is immediately searchable by similarity; an explicit `@indexed` on the same attribute is allowed only if it is also HNSW.

Write semantics:

- Creating a record with the source field, or updating the source field, computes the vector before the write commits (with `inputType: 'document'`). A failure to compute the embedding fails the write.
- An update that does not touch the source field leaves the vector unchanged.
- Setting the source field to `null` sets the vector to `null`.
- Replicated writes and audit-log replays do not re-embed — the vector travels with the record, and only the node that accepted the original write calls the model.

Multiple `@embed` attributes on one type are computed concurrently.
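
The write semantics listed above can be illustrated with a small simulation. This is a sketch under stated assumptions, not Harper's implementation: `applyEmbed` and the `directive` object are hypothetical names, and the stand-in `embedFn` is synchronous and returns a toy vector, whereas a real model call would be asynchronous.

```javascript
// Illustrative simulation of the write semantics above; not Harper's implementation.
// `embedFn` stands in for the configured model call and is synchronous here
// only to keep the sketch short.
function applyEmbed(existing, update, { source, target, embedFn }) {
  const next = { ...existing, ...update };
  if (!(source in update)) return next; // source untouched: vector unchanged
  if (update[source] === null) {
    next[target] = null; // null source clears the vector
    return next;
  }
  // A throw from embedFn propagates, so the write fails as a whole.
  next[target] = embedFn(update[source]);
  return next;
}

// Toy embedding (assumption): a one-dimensional vector holding the text length.
const directive = { source: 'text', target: 'embedding', embedFn: (t) => [t.length] };

const created = applyEmbed({}, { id: 1, text: 'hello' }, directive);
const renamed = applyEmbed(created, { title: 'greeting' }, directive);
const cleared = applyEmbed(renamed, { text: null }, directive);
console.log(created.embedding, renamed.embedding, cleared.embedding);
```

The three calls exercise, in order, a create with the source field, an update that does not touch it, and an update that sets it to `null`.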

### `@createdTime`

Automatically assigns a creation timestamp (Unix epoch milliseconds) to the attribute when a record is created.
@@ -393,6 +421,8 @@ type Document @table {
}
```

Embedding vectors can also be computed automatically at write time from a text field with the [`@embed` directive](#embed), which creates the HNSW index implicitly.

Query by nearest neighbors using the `sort` parameter:

```javascript
@@ -443,6 +473,62 @@ let results = Document.search({

`$distance` is available in both `sort`-based ranking and `conditions`-based threshold queries.

### Per-Query Search Options

The `sort` descriptor (and threshold condition) accepts options that tune an individual query:

```javascript
let results = Document.search({
  sort: { attribute: 'textEmbeddings', target: searchVector, distance: 'dotProduct', ef: 200 },
  limit: 5,
});
```

- `distance` — overrides the index's distance function for this query: `"cosine"`, `"euclidean"`, or `"dotProduct"` (`dotProduct` <VersionBadge version="v5.1.0" />).
- `ef` <VersionBadge version="v5.1.0" /> — overrides the search exploration budget for this query. Higher values improve recall at the cost of latency.

<VersionBadge type="changed" version="v5.1.0" /> — When a query passes no `ef` and the index does not explicitly configure `efConstructionSearch` (or `efConstruction`), the search budget auto-scales with the size of the index, so recall holds as the table grows instead of decaying with a fixed budget.
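
For readers comparing the three `distance` options, the underlying measures can be sketched with their textbook definitions. The function names are illustrative, and the sign and scaling conventions Harper applies internally may differ (the parameter table describes `"cosine"` as negative cosine similarity, for instance), so treat this as background rather than a description of the index.

```javascript
// Textbook definitions; sign and scaling conventions inside Harper may differ.
function dotProduct(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function euclidean(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

function cosineSimilarity(a, b) {
  return dotProduct(a, b) / (Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b)));
}

console.log(dotProduct([1, 2, 3], [4, 5, 6])); // 32
console.log(euclidean([0, 0], [3, 4])); // 5
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Cosine similarity ignores vector magnitude, while the dot product does not, so the two only rank results identically when the vectors are normalized to unit length.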

### HNSW Parameters

Removed by this diff (previous table):

| Parameter | Default | Description |
| ---------------------- | ----------------- | --------------------------------------------------------------------------------------------------- |
| `distance` | `"cosine"` | Distance function: `"euclidean"` or `"cosine"` (negative cosine similarity) |
| `efConstruction` | `100` | Max nodes explored during index construction. Higher = better recall, lower = better performance |
| `M` | `16` | Preferred connections per graph layer. Higher = more space, better recall for high-dimensional data |
| `optimizeRouting` | `0.5` | Heuristic aggressiveness for omitting redundant connections (0 = off, 1 = most aggressive) |
| `mL` | computed from `M` | Normalization factor for level generation |
| `efSearchConstruction` | `50` | Max nodes explored during search |

Added by this diff (replacement table):

| Parameter | Default | Description |
| ---------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `distance` | `"cosine"` | Distance function: `"cosine"` (negative cosine similarity), `"euclidean"`, or `"dotProduct"` (added in v5.1.0) |
| `efConstruction` | `100` | Max nodes explored during index construction. Higher = better recall, lower = better performance |
| `M` | `16` | Preferred connections per graph layer. Higher = more space, better recall for high-dimensional data |
| `optimizeRouting` | `0.5` | Heuristic aggressiveness for omitting redundant connections (0 = off, 1 = most aggressive) |
| `mL` | computed from `M` | Normalization factor for level generation |
| `efConstructionSearch` | auto-scaled | Max nodes explored during search. When unset, auto-scales with index size (see above); setting it (or `efConstruction`, which seeds it) fixes the budget |
| `quantization` | — | `"int8"` stores vectors quantized to int8 (added in v5.1.0, see below) |

Example with custom parameters:

```graphql
type Document @table {
  id: Long @primaryKey
  # Removed by this diff (old option name):
  # textEmbeddings: [Float] @indexed(type: "HNSW", distance: "euclidean", optimizeRouting: 0, efSearchConstruction: 100)
  # Added by this diff:
  textEmbeddings: [Float] @indexed(type: "HNSW", distance: "euclidean", optimizeRouting: 0, efConstructionSearch: 100)
}
```

Note: this parameter was previously documented as `efSearchConstruction`; the option name Harper reads is `efConstructionSearch`.

<VersionBadge type="changed" version="v5.1.0" /> — Changing `efConstructionSearch` on an existing index no longer triggers a rebuild; it only affects searches. Structural parameters (`distance`, `M`, `efConstruction`, `quantization`) still rebuild the index when changed.
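
The rebuild rule above can be expressed as a small check. This is a hypothetical helper for reasoning about a schema change, not an API that Harper exposes. It covers only the four parameters the text names as structural; the text does not say how other parameters behave.

```javascript
// Parameters the text above lists as structural (changing them rebuilds the index).
const STRUCTURAL_PARAMS = ['distance', 'M', 'efConstruction', 'quantization'];

// Hypothetical helper: returns the structural parameters that differ
// between two index configurations. An empty result means no rebuild
// is expected on account of these four parameters.
function changedStructuralParams(before, after) {
  return STRUCTURAL_PARAMS.filter((name) => before[name] !== after[name]);
}

const current = { distance: 'cosine', M: 16, efConstructionSearch: 100 };
console.log(changedStructuralParams(current, { ...current, efConstructionSearch: 200 })); // []
console.log(changedStructuralParams(current, { ...current, quantization: 'int8' })); // ['quantization']
```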

### Vector Quantization

<VersionBadge version="v5.1.0" />

`quantization: "int8"` stores the index's vectors quantized to 8-bit integers, substantially reducing index size and memory traffic:

```graphql
type Document @table {
  id: Long @primaryKey
  textEmbeddings: [Float] @indexed(type: "HNSW", quantization: "int8")
}
```

Graph navigation runs on the quantized (approximate) distances. For nearest-neighbor `sort` queries, Harper re-ranks the results against the full-precision vectors stored on the records, restoring exact ordering and exact `$distance` values. Distance-threshold (`lt`/`le`) queries currently filter on the approximate distance.
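
The two-stage idea described above (approximate search on quantized vectors, then exact re-ranking) can be sketched as follows. Several assumptions apply: the symmetric max-abs quantization scheme is chosen for illustration and is not confirmed to be what Harper uses, a brute-force scan stands in for HNSW graph navigation, and all function names are hypothetical.

```javascript
// Symmetric max-abs int8 quantization: an assumed scheme for illustration only.
function quantizeInt8(vector) {
  const maxAbs = vector.reduce((m, v) => Math.max(m, Math.abs(v)), 0) || 1;
  const codes = Int8Array.from(vector, (v) => Math.round((v / maxAbs) * 127));
  return { codes, scale: maxAbs / 127 };
}

function dequantize({ codes, scale }) {
  return Array.from(codes, (c) => c * scale);
}

function euclidean(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

// Stage 1: rank by approximate distance on quantized vectors, keep the top `k`
// (a linear scan here; the real index navigates a graph instead).
// Stage 2: re-rank that shortlist against the full-precision vectors.
function searchWithRerank(query, records, k) {
  const shortlist = records
    .map((record) => ({ record, approx: euclidean(query, dequantize(record.quantized)) }))
    .sort((a, b) => a.approx - b.approx)
    .slice(0, k);
  return shortlist
    .map(({ record }) => ({ id: record.id, $distance: euclidean(query, record.vector) }))
    .sort((a, b) => a.$distance - b.$distance);
}

const records = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]].map((vector, id) => ({
  id,
  vector,
  quantized: quantizeInt8(vector),
}));
console.log(searchWithRerank([1, 0], records, 2));
```

The `$distance` values in the result come from the full-precision vectors, which mirrors the exact ordering described for `sort` queries. A threshold filter applied in stage 1 would instead see the approximate distance.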

## Field Types

Harper supports the following field types:
50 changes: 50 additions & 0 deletions reference/models/analytics.md
@@ -0,0 +1,50 @@
---
id: analytics
title: Analytics
---

<!-- Source: harper resources/models/analyticsTable.ts, resources/models/Models.ts (v5.1) -->

<VersionBadge version="v5.1.0" />

Every model call is recorded for observability and usage accounting, at two levels of granularity: a per-call log table for forensics, and aggregate counters in Harper's [general analytics](../analytics/overview) for dashboards and trends.

## Per-call log: `hdb_model_calls`

Each `embed()`, `generate()`, and `generateStream()` call writes one row to the `hdb_model_calls` system table — on success and on failure. With `toolMode: 'auto'`, each backend round inside the loop records its own row (the outer loop itself does not add one).

| Field | Description |
| ------------------- | --------------------------------------------------------------------------------------------------------------- |
| `tenant` | Tenant identifier, when the call carried one |
| `app` | Resource path of the calling resource, when called from one |
| `model` | Logical model name the caller used |
| `backend` | Backend that served the call (`ollama`, `openai`, …); `unknown` for pre-dispatch failures |
| `method` | `embed`, `generate`, or `generateStream` |
| `prompt_tokens` | Prompt token count, when the backend reported usage |
| `completion_tokens` | Completion token count, when the backend reported usage |
| `embedding_tokens` | Embedding token count, when the backend reported usage |
| `latency_ms` | Wall-clock call duration |
| `success` | Whether the call completed |
| `error_code` | On failure: `backend_error`, `aborted`, `capability_unsupported`, `backend_not_found`, or `pending_unsupported` |

Rows are buffered in memory and flushed every 10 seconds, or immediately once 1,000 rows accumulate; rows older than 90 days are purged. Buffered rows may be lost on abrupt shutdown — treat the table as operational telemetry, not an audit log.
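
The flush policy above (a time trigger and a size trigger) can be sketched as a small buffer. This is an illustration of the policy only, not Harper's code: the class name, the `onFlush` callback, and the explicit `tick()` method used in place of a real timer are all assumptions made to keep the example deterministic.

```javascript
// Illustrative buffer with the thresholds described above; not Harper's code.
class CallLogBuffer {
  constructor(onFlush, { maxRows = 1000, intervalMs = 10000, now = Date.now() } = {}) {
    this.onFlush = onFlush;
    this.maxRows = maxRows;
    this.intervalMs = intervalMs;
    this.rows = [];
    this.lastFlush = now;
  }

  add(row, now = Date.now()) {
    this.rows.push(row);
    if (this.rows.length >= this.maxRows) this.drain(now); // size trigger
  }

  // Called periodically (a real implementation would use a timer).
  tick(now = Date.now()) {
    if (now - this.lastFlush >= this.intervalMs) this.drain(now); // time trigger
  }

  drain(now) {
    if (this.rows.length > 0) this.onFlush(this.rows.splice(0));
    this.lastFlush = now;
  }
}

const written = [];
const buffer = new CallLogBuffer((rows) => written.push(...rows), { now: 0 });
buffer.add({ method: 'embed', success: true }, 1);
buffer.tick(5000); // too early: nothing is written yet
buffer.tick(10000); // interval elapsed: the buffered row is flushed
console.log(written.length);
```

Anything still sitting in `rows` when the process stops has never reached `onFlush`, which is why an abrupt shutdown can lose recent rows.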

Query it like any table, for example through the operations API:

```json
{
  "operation": "search_by_conditions",
  "database": "system",
  "table": "hdb_model_calls",
  "conditions": [{ "search_attribute": "success", "search_type": "equals", "search_value": false }]
}
```
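
Once rows have been retrieved, they can be summarized on the client. The sketch below groups rows by `backend` using only fields from the table above. The `summarizeCalls` name and the sample rows are hypothetical, and the shape of the result is a design choice for this example, not an output format Harper defines.

```javascript
// Summarize rows shaped like hdb_model_calls records (fields from the table above).
function summarizeCalls(rows) {
  const byBackend = {};
  for (const row of rows) {
    if (!byBackend[row.backend]) {
      byBackend[row.backend] = { calls: 0, failures: 0, totalLatencyMs: 0, errorCodes: {} };
    }
    const stats = byBackend[row.backend];
    stats.calls += 1;
    stats.totalLatencyMs += row.latency_ms;
    if (!row.success) {
      stats.failures += 1;
      stats.errorCodes[row.error_code] = (stats.errorCodes[row.error_code] || 0) + 1;
    }
  }
  return byBackend;
}

// Hypothetical sample rows for illustration.
const sampleRows = [
  { backend: 'ollama', method: 'embed', latency_ms: 40, success: true },
  { backend: 'ollama', method: 'embed', latency_ms: 60, success: false, error_code: 'backend_error' },
  { backend: 'openai', method: 'generate', latency_ms: 900, success: false, error_code: 'aborted' },
];
console.log(summarizeCalls(sampleRows));
```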

## Aggregate metrics

Each call also increments Harper's aggregate analytics (visible in `hdb_raw_analytics` alongside the other [analytics metrics](../analytics/overview)):

- `model-embed`, `model-generate`, `model-generateStream` — call counts
- `model-embed-tokens`, `model-generate-tokens`, `model-generateStream-tokens` — token totals

Metrics are broken down by backend name, so usage can be charted per provider.