Hindsight

Hindsight is the vector-memory system running on vanessa. It is assembled from four cooperating pieces plus a Cloudflare Worker:

  • @luxus/hindsight-api — the Hindsight server (containerized via Podman), port 8888.
  • @luxus/postgres — PostgreSQL with pgvector, port 5432, holding the hindsight database.
  • @luxus/hindsight-monitoring — Grafana otel-lgtm observability stack, Grafana on port 3000, OTLP on 4317/4318.
  • @luxus/cloudflared — Cloudflare tunnel exposing h1.luxus.ai → http://localhost:8888.
  • hindsight-mcp-oauth-worker — a Cloudflare Worker at mcp.luxus.ai that adds GitHub OAuth → JWT in front of the Hindsight MCP endpoint.

This is reference documentation for the running system; the code is the source of truth.

flowchart LR
client["MCP / API client"]
worker["Cloudflare Worker<br/>mcp.luxus.ai<br/>GitHub OAuth → JWT"]
cf["cloudflared tunnel<br/>h1.luxus.ai"]
api["hindsight-api<br/>(Podman) :8888"]
pg[("PostgreSQL :5432<br/>pgvector · db: hindsight")]
deepseek["DeepSeek LLM API"]
graf["Grafana otel-lgtm<br/>:3000 · OTLP :4318"]
client -->|"Bearer JWT"| worker
worker -->|"injects API key, /mcp"| cf
client -->|"Bearer API key"| cf
cf --> api
api -->|"DATABASE_URL"| pg
api -->|"completions"| deepseek
api -->|"OTLP traces :4318"| graf
graf -->|"scrape /metrics :8888"| api

hindsight-api (services/hindsight-api/default.nix)

  • Runs the official image ghcr.io/vectorize-io/hindsight:0.6.2@sha256:f0f9e9a… via Podman + podman-compose (Podman is preferred over Docker; dockerCompat provides a docker CLI shim).
  • Listens on host port 8888 (→ container 8888) and also publishes 9999.
  • Managed by the hindsight-api systemd oneshot service (RemainAfterExit), ordered after and requiring podman.service. ExecStart runs podman-compose up -d against the Nix-pinned image; to upgrade, bump the digest in the flake.
  • Data persists under /var/lib/hindsight-data (owned by uid/gid 1000).
  • Firewall: opens 8888 and 9999 when openFirewall = true.

Environment is assembled in the service preStart into /run/hindsight-api/hindsight.env (consumed via compose env_file):

  • LLM provider is DeepSeek (OpenAI-compatible): HINDSIGHT_API_LLM_PROVIDER=openai, HINDSIGHT_API_LLM_BASE_URL=https://api.deepseek.com, HINDSIGHT_API_LLM_MODEL=deepseek-v4-flash. The key comes from /run/secrets/vars/deepseek-api-key/key.
  • Database URL points at the host PostgreSQL via postgresql://postgres:<urlencoded-pw>@host.containers.internal:5432/hindsight. The password is read from the postgres-password Clan var and percent-encoded with jq … @uri. Without this, Hindsight falls back to its embedded pg0, which is not wanted here.
  • OTel tracing (enabled via enableOtelTracing = true): exports to HINDSIGHT_API_OTEL_EXPORTER_OTLP_ENDPOINT=http://host.containers.internal:4318 with service name hindsight-api, environment homelab.
  • API-key auth for external access: the hindsight-api-key Clan var is decoded (sops decrypt, falling back to plain cat) and set as HINDSIGHT_API_TENANT_API_KEY, HINDSIGHT_API_MCP_AUTH_TOKEN, and HINDSIGHT_CP_DATAPLANE_API_KEY, enabling the ApiKeyTenantExtension. Without it, external access would be unprotected.
  • Cohere embeddings/reranker are intentionally disabled on cost grounds; Hindsight uses the embeddings bundled in the image.
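
The DATABASE_URL assembly described in the list above can be sketched in TypeScript. This is only an illustration: the service itself does this in its shell preStart with jq … @uri. The function names encodePassword and buildDatabaseUrl and the sample password are hypothetical, and the extra replace step is an attempt to approximate the character set jq's @uri escapes.

```typescript
// Illustrative only: the service builds this URL in a shell preStart script.
// Percent-encode a password so characters like "@", ":" and "/" cannot break the URL.
function encodePassword(password: string): string {
  // encodeURIComponent leaves ! ' ( ) * untouched; escape them as well so the
  // result is closer to what jq's @uri filter produces.
  return encodeURIComponent(password).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase(),
  );
}

// Host, port, user, and database name are taken from the description above.
function buildDatabaseUrl(password: string): string {
  return `postgresql://postgres:${encodePassword(password)}@host.containers.internal:5432/hindsight`;
}

// Sample password is made up for the demonstration.
console.log(buildDatabaseUrl("p@ss:w/rd!"));
```

Without the encoding step, a password containing "@" or "/" would change how the URL is parsed, which is presumably why the preStart script encodes it before writing the env file.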

PostgreSQL (services/postgres/default.nix)

  • services.postgresql on port 5432, listen_addresses = "*" for LAN access.
  • enablePgvector = true → adds the pgvector extension, sets shared_preload_libraries = "vector", and runs CREATE EXTENSION IF NOT EXISTS vector on each database in postStart.
  • Databases created on startup: hindsight, app_data.
  • The superuser (postgres) password is a Clan var (postgres-password) applied in postStart via a safe :'pw' substitution.
  • Authentication: trust for the local socket and 127.0.0.1/::1; md5 (password required) for all other IPv4/IPv6. Firewall opens 5432 on the LAN.

Hindsight reaches Postgres from inside its container via host.containers.internal, which Podman maps back to the host.

hindsight-monitoring (services/hindsight-monitoring/default.nix)

  • Runs grafana/otel-lgtm:0.9.3@sha256:325a40d… (Grafana + Prometheus/Mimir + Tempo + Loki) via Podman as a oneshot systemd service, ordered after podman.service and hindsight-api.service, with hindsight-api.service pulled in as a wants dependency.
  • Published ports: Grafana 3000, OTLP gRPC 4317, OTLP HTTP 4318.
  • Grafana runs with anonymous admin access and the login form disabled (GF_AUTH_ANONYMOUS_ENABLED=true, GF_AUTH_ANONYMOUS_ORG_ROLE=Admin, GF_AUTH_DISABLE_LOGIN_FORM=true).
  • Prometheus scrapes host.containers.internal:8888/metrics every 5s and accepts OTLP push; bundled dashboards (RED metrics, JVM, plus Hindsight Operations / LLM / API Service from ./dashboards/) are provisioned read-only.
  • Local network only — firewall opens just the Grafana port; do not route it through the Cloudflare tunnel.

This stack receives the API’s OTel traces on :4318 and scrapes the API’s metrics from :8888.

cloudflared (services/cloudflared/default.nix)

  • Runs a custom cloudflared-tunnel systemd service (the NixOS services.cloudflared module is disabled in favor of an explicit unit).

  • Tunnel ID + credentials JSON come from the cloudflare-tunnel-creds Clan var generator (tunnel-id non-secret, credentials-json secret).

  • Ingress is generated from the ingress attrset into a YAML config with a trailing http_status:404 catch-all. For Hindsight the relevant mapping (set in flake.nix) is:

    | Hostname | Backend |
    | --- | --- |
    | h1.luxus.ai | http://localhost:8888 |

    (Other ingress entries: hb/bb/gy.luxus.ai, a1.luxus.ai → SSH.)

So h1.luxus.ai exposes the raw Hindsight API; clients that can send Authorization: Bearer <api-key> use it directly.
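
As a rough illustration of the ingress generation described above, the sketch below renders a hostname-to-backend map into the ingress section of a cloudflared config with the trailing catch-all. The real generator lives in Nix; the name renderIngress and the TypeScript form are assumptions made for this example only.

```typescript
// Illustrative only: the real config is generated by Nix from the ingress attrset.
type Ingress = Record<string, string>; // hostname -> backend service URL

function renderIngress(ingress: Ingress): string {
  const lines: string[] = ["ingress:"];
  for (const hostname of Object.keys(ingress)) {
    lines.push(`  - hostname: ${hostname}`);
    lines.push(`    service: ${ingress[hostname]}`);
  }
  // The final rule has no hostname, so it matches everything else and answers 404.
  lines.push("  - service: http_status:404");
  return lines.join("\n") + "\n";
}

console.log(renderIngress({ "h1.luxus.ai": "http://localhost:8888" }));
```

Keeping the catch-all last means any hostname that is not listed explicitly gets a 404 from cloudflared instead of reaching a local service.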

MCP OAuth Worker (services/hindsight-mcp-oauth-worker/)

A Cloudflare Worker (TypeScript, deployed with wrangler) that provides an OAuth facade so browser-based MCP clients (ChatGPT, Perplexity, etc.) can authenticate without ever seeing the internal Hindsight API key.

Routing (wrangler.jsonc / README.md):

  • Route mcp.luxus.ai/*, ISSUER = https://mcp.luxus.ai.
  • Upstream HINDSIGHT_ORIGIN = https://h1.luxus.ai, MCP path /mcp.

OAuth + JWT flow (src/index.ts):

  1. Standard OAuth metadata is served at /.well-known/oauth-authorization-server and /.well-known/oauth-protected-resource. PKCE (S256) is required; token_endpoint_auth_method is none (public clients).
  2. Dynamic client registration at POST /register (stored in the OAUTH_KV namespace). Redirect URIs must be https: or localhost.
  3. GET /authorize stores the pending request in KV and redirects to GitHub (scope=read:user user:email read:org).
  4. GET /callback exchanges the GitHub code, fetches the user, and enforces allow-lists: GITHUB_ALLOWED_USERS (luxus) and GITHUB_ALLOWED_ORGS (luxusAi). On success it mints a short-lived auth code in KV.
  5. POST /token verifies the PKCE challenge and issues an HS256 JWT signed with JWT_SECRET (1-hour access_token, aud = <issuer>/mcp).
  6. Requests to /mcp are proxied by proxyMcp: it verifies the Bearer JWT, then forwards to https://h1.luxus.ai/mcp after injecting Authorization: Bearer <HINDSIGHT_API_KEY> and identity headers (x-bank-id, x-forwarded-user, x-forwarded-email). On invalid/missing JWT it returns 401 with a WWW-Authenticate resource-metadata pointer.
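
Steps 4 and 5 above can be sketched as two small pure functions: an allow-list check and the construction of the access-token claims. This is a sketch under stated assumptions, not the Worker's code. The type and function names are hypothetical, the iss, sub, and iat claims are assumed (the text only confirms aud and the one-hour lifetime), and the sketch assumes that a user passes if either the login or one of the orgs is allow-listed, compared case-insensitively. Signing with HS256 is left out.

```typescript
// Hypothetical shapes; the Worker's real types and claim names may differ.
interface GithubIdentity {
  login: string;
  orgs: string[];
}

interface AccessClaims {
  iss: string; // issuer (assumed claim)
  sub: string; // GitHub login (assumed claim)
  aud: string; // "<issuer>/mcp", as described in step 5
  iat: number; // issued-at, seconds since the Unix epoch (assumed claim)
  exp: number; // expiry, one hour after iat
}

const ISSUER = "https://mcp.luxus.ai";
const ACCESS_TOKEN_TTL_SECONDS = 60 * 60;

// Assumption: login OR any org membership is enough, compared case-insensitively.
function isAllowed(
  user: GithubIdentity,
  allowedUsers: string[],
  allowedOrgs: string[],
): boolean {
  const users = allowedUsers.map((u) => u.toLowerCase());
  const orgs = allowedOrgs.map((o) => o.toLowerCase());
  if (users.indexOf(user.login.toLowerCase()) !== -1) {
    return true;
  }
  return user.orgs.some((org) => orgs.indexOf(org.toLowerCase()) !== -1);
}

// Builds the unsigned claim set for a one-hour access token.
function buildAccessClaims(login: string, nowSeconds: number): AccessClaims {
  return {
    iss: ISSUER,
    sub: login,
    aud: `${ISSUER}/mcp`,
    iat: nowSeconds,
    exp: nowSeconds + ACCESS_TOKEN_TTL_SECONDS,
  };
}

const now = Math.floor(Date.now() / 1000);
if (isAllowed({ login: "luxus", orgs: [] }, ["luxus"], ["luxusAi"])) {
  console.log(buildAccessClaims("luxus", now));
}
```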

Bank mapping: DEFAULT_BANK_ID = global-luxus (all OAuth users share one bank). Setting DEFAULT_BANK_ID=github-login maps each login to github-<login> via the X-Bank-Id header.
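
The bank mapping rule can be expressed as a one-line function. The name resolveBankId is hypothetical, and the sketch assumes that any value other than github-login is used verbatim as the shared bank ID.

```typescript
// Hypothetical helper: maps the DEFAULT_BANK_ID setting and a GitHub login to a bank ID.
function resolveBankId(defaultBankId: string, githubLogin: string): string {
  // "github-login" is the special value that switches to one bank per user.
  return defaultBankId === "github-login" ? `github-${githubLogin}` : defaultBankId;
}

console.log(resolveBankId("global-luxus", "luxus")); // shared bank
console.log(resolveBankId("github-login", "luxus")); // per-user bank
```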

Worker secrets (set with wrangler secret put, not in source): HINDSIGHT_API_KEY (matches the Hindsight tenant API key), GITHUB_CLIENT_ID, GITHUB_CLIENT_SECRET, JWT_SECRET. State lives in the OAUTH_KV namespace.

| Port | Component | Exposure |
| --- | --- | --- |
| 8888 | hindsight-api | LAN (firewall) + h1.luxus.ai via tunnel |
| 9999 | hindsight-api (secondary) | LAN |
| 5432 | PostgreSQL (pgvector) | LAN only |
| 3000 | Grafana (otel-lgtm) | LAN only |
| 4317 / 4318 | OTLP gRPC / HTTP ingest | host/container-local |

  1. A client either talks to h1.luxus.ai directly with the Hindsight API key, or to mcp.luxus.ai/mcp with a GitHub-OAuth-issued JWT.
  2. The Worker validates the JWT, swaps in the internal API key, and proxies to h1.luxus.ai/mcp; the cloudflared tunnel delivers it to localhost:8888.
  3. hindsight-api authenticates via its tenant API key, stores/queries vectors in the hindsight database on PostgreSQL (pgvector), and calls DeepSeek for LLM completions.
  4. The API exports traces over OTLP to the otel-lgtm container (:4318), which also scrapes its /metrics (:8888); Grafana on :3000 visualizes both.
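
Step 2 of the flow above, where the Worker swaps the JWT for the internal API key, might look roughly like the following decision function. It is a sketch, not the Worker's proxyMcp: the names decideProxy, VerifiedClaims, and ProxyDecision are hypothetical, the sub and email claim names are assumptions, signature verification is out of scope, and the exact WWW-Authenticate value is an assumption modeled on the resource_metadata parameter from OAuth protected resource metadata.

```typescript
// Hypothetical claim shape; only aud and the expiry are described in the text.
interface VerifiedClaims {
  sub: string; // GitHub login (assumed claim name)
  email: string; // assumed claim name
  aud: string;
  exp: number; // seconds since the Unix epoch
}

type ProxyDecision =
  | { kind: "reject"; status: 401; headers: Record<string, string> }
  | { kind: "forward"; url: string; headers: Record<string, string> };

const ISSUER = "https://mcp.luxus.ai";
const HINDSIGHT_ORIGIN = "https://h1.luxus.ai";

// `claims` is null when the Bearer token was missing or failed signature
// verification; that verification is not part of this sketch.
function decideProxy(
  claims: VerifiedClaims | null,
  apiKey: string,
  bankId: string,
  nowSeconds: number,
): ProxyDecision {
  if (claims === null || claims.aud !== `${ISSUER}/mcp` || claims.exp <= nowSeconds) {
    return {
      kind: "reject",
      status: 401,
      headers: {
        // Assumed format for the resource-metadata pointer.
        "www-authenticate": `Bearer resource_metadata="${ISSUER}/.well-known/oauth-protected-resource"`,
      },
    };
  }
  return {
    kind: "forward",
    url: `${HINDSIGHT_ORIGIN}/mcp`,
    headers: {
      authorization: `Bearer ${apiKey}`, // replaces the client's JWT
      "x-bank-id": bankId,
      "x-forwarded-user": claims.sub,
      "x-forwarded-email": claims.email,
    },
  };
}

const sample: VerifiedClaims = {
  sub: "luxus",
  email: "luxus@example.com", // made-up address for the demonstration
  aud: "https://mcp.luxus.ai/mcp",
  exp: 2000,
};
console.log(decideProxy(sample, "internal-api-key", "global-luxus", 1000));
```

The client's JWT never reaches h1.luxus.ai in this model; only the injected API key and the identity headers do.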

| Secret | Source | Consumed by |
| --- | --- | --- |
| postgres-password | Clan var | hindsight-api DATABASE_URL, Postgres superuser |
| deepseek-api-key | Clan var (/run/secrets/vars/...) | hindsight-api LLM key |
| hindsight-api-key | Clan var (sops) | hindsight-api tenant/MCP/dataplane auth |
| cloudflare-tunnel-creds | Clan var (id + credentials JSON) | cloudflared tunnel |
| HINDSIGHT_API_KEY | Worker secret (wrangler) | Worker → upstream injection |
| GITHUB_CLIENT_ID / GITHUB_CLIENT_SECRET | Worker secret | GitHub OAuth |
| JWT_SECRET | Worker secret | HS256 JWT signing/verification |

| Concern | File |
| --- | --- |
| Service inventory & tunnel ingress | flake.nix |
| Hindsight API container & env | services/hindsight-api/default.nix |
| PostgreSQL + pgvector | services/postgres/default.nix |
| Observability stack | services/hindsight-monitoring/default.nix |
| Cloudflare tunnel | services/cloudflared/default.nix |
| MCP OAuth facade | services/hindsight-mcp-oauth-worker/ |