Architecture

ClawKeep Cloud runs entirely on Cloudflare's edge network. Zero-knowledge encryption means the server stores only ciphertext — we can't read your data even if we wanted to.

Infrastructure

| Layer    | Technology                     | Purpose                                          |
|----------|--------------------------------|--------------------------------------------------|
| API      | Cloudflare Workers (Hono)      | Auth, workspaces, billing, credential generation |
| Frontend | Next.js 14 on Cloudflare Pages | Dashboard, marketing, docs                       |
| Database | Cloudflare D1 (SQLite)         | Users, workspaces, keys, usage, audit log        |
| Storage  | Cloudflare R2 (S3-compatible)  | Encrypted backup chunks ($0 egress)              |
| Cache    | Cloudflare KV                  | Rate limiting, session data                      |
| Payments | Stripe                         | Subscriptions, usage-based billing               |

Zero-knowledge encryption

All encryption happens on your machine before data leaves it. The server stores only encrypted blobs.

  1. Key derivation — Your password is used to derive an encryption key via Argon2id. The key never leaves your device.
  2. Chunking — Files are split into content-defined chunks for deduplication.
  3. Encryption — Each chunk is encrypted with AES-256-GCM. A random nonce is generated per chunk.
  4. Manifest — A manifest file records the mapping of original files to encrypted chunks. The manifest itself is encrypted.
  5. Upload — Only ciphertext reaches the server. Chunk names are hashes, not file names.

Lost password = lost data. There is no recovery mechanism by design. This is the tradeoff for actual privacy.
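The per-chunk encryption step (3) can be sketched with Node's built-in crypto module. This is a minimal illustration, not the CLI's actual wire format: it assumes the 32-byte key has already been derived (e.g. via Argon2id, step 1), and it packs the nonce, GCM auth tag, and ciphertext into one blob for storage.

```typescript
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Encrypt one chunk with AES-256-GCM under a pre-derived 32-byte key.
// A fresh 12-byte nonce is generated per chunk, as described above.
export function encryptChunk(key: Buffer, chunk: Buffer): Buffer {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, nonce);
  const ciphertext = Buffer.concat([cipher.update(chunk), cipher.final()]);
  // Illustrative layout: nonce (12 bytes) || auth tag (16 bytes) || ciphertext
  return Buffer.concat([nonce, cipher.getAuthTag(), ciphertext]);
}

export function decryptChunk(key: Buffer, blob: Buffer): Buffer {
  const nonce = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const ciphertext = blob.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, nonce);
  decipher.setAuthTag(tag); // decryption throws if the tag doesn't verify
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

Because GCM authenticates as well as encrypts, a tampered chunk fails to decrypt rather than silently producing garbage.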

R2 storage layout

```
clawkeep-prod/
└── workspaces/
    └── ws_01jkxyz.../
        ├── manifest.enc        # Encrypted manifest
        ├── chunk-000001.enc    # Encrypted data chunks
        ├── chunk-000002.enc
        └── ...
```

Direct upload (no proxy)

The CLI does not proxy uploads through the API. Instead:

  1. CLI requests scoped S3 credentials via GET /api/workspaces/:id/credentials
  2. API returns time-limited credentials restricted to that workspace's R2 prefix
  3. CLI uploads/downloads directly to R2 using standard S3 protocol

This means backups are fast (direct to storage, no middleman) and the API never touches your data.
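As a sketch of how that prefix scoping can work, the helpers below build object keys matching the R2 layout shown earlier and check that a requested key stays inside a workspace's prefix. The function names are illustrative, not the real API.

```typescript
// Every workspace's objects live under one prefix; the scoped S3
// credentials are restricted to it.
export function workspacePrefix(workspaceId: string): string {
  return `workspaces/${workspaceId}/`;
}

// chunk-000001.enc, chunk-000002.enc, ... as in the R2 layout above.
export function chunkKey(workspaceId: string, index: number): string {
  return `${workspacePrefix(workspaceId)}chunk-${String(index).padStart(6, "0")}.enc`;
}

// The API can reject any key that escapes the workspace prefix before
// minting credentials, mirroring the restriction enforced on the R2 side.
export function isInScope(workspaceId: string, key: string): boolean {
  return key.startsWith(workspacePrefix(workspaceId)) && !key.includes("..");
}
```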

Authentication

Web dashboard (JWT)

1. POST /api/auth/login → { access_token, refresh_token }
2. access_token stored in memory (15min expiry)
3. refresh_token set as httpOnly cookie (7 day expiry)
4. On 401 → POST /api/auth/refresh → new access_token
5. Retry original request

Access tokens are JWTs signed with HS256. The payload contains user ID, email, and plan. Refresh tokens are stored as SHA-256 hashes in D1.
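The token shape can be sketched with Node's built-in HMAC. A real deployment would use a vetted JWT library; the claim names here (`sub` for user ID, `email`, `plan`) are assumptions based on the payload description above.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const b64url = (buf: Buffer): string =>
  buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");

// Mint an HS256 access token; expiry defaults to 15 minutes.
export function signAccessToken(
  secret: string,
  payload: { sub: string; email: string; plan: string },
  ttlSeconds = 15 * 60,
): string {
  const header = b64url(Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })));
  const now = Math.floor(Date.now() / 1000);
  const body = b64url(
    Buffer.from(JSON.stringify({ ...payload, iat: now, exp: now + ttlSeconds })),
  );
  const sig = b64url(createHmac("sha256", secret).update(`${header}.${body}`).digest());
  return `${header}.${body}.${sig}`;
}

// Constant-time signature check, then expiry check.
export function verifyAccessToken(secret: string, token: string): boolean {
  const [header, body, sig] = token.split(".");
  if (!header || !body || !sig) return false;
  const expected = b64url(createHmac("sha256", secret).update(`${header}.${body}`).digest());
  if (expected.length !== sig.length) return false;
  if (!timingSafeEqual(Buffer.from(expected), Buffer.from(sig))) return false;
  const claims = JSON.parse(Buffer.from(body, "base64").toString());
  return claims.exp > Math.floor(Date.now() / 1000);
}
```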

CLI (API key)

1. User creates key in dashboard → ck_live_xxxxxxxxxxxxxxxxxxxx
2. Full key shown once, bcrypt hash stored in DB
3. CLI sends key in Authorization header
4. API extracts prefix → looks up in DB → bcrypt compare
5. Request authenticated

API keys use the ck_live_ prefix + 48 random hex chars. They never expire but can be revoked from the dashboard.
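The key format and lookup flow can be sketched as follows. SHA-256 stands in for bcrypt so the example needs no third-party dependency (production hashes with bcrypt as described above), and the stored prefix length is an assumption.

```typescript
import { randomBytes, createHash } from "node:crypto";

// Generate a key in the documented format: "ck_live_" + 48 random hex chars.
export function generateApiKey(): string {
  return `ck_live_${randomBytes(24).toString("hex")}`; // 24 bytes -> 48 hex chars
}

// The DB stores only a hash plus a short prefix used as the lookup index.
export function storableRecord(key: string) {
  return {
    prefix: key.slice(0, 16), // assumed prefix length for the DB lookup
    hash: createHash("sha256").update(key).digest("hex"), // bcrypt in production
  };
}

// Look up by prefix, then compare the full-key hash.
export function verifyApiKey(key: string, record: { prefix: string; hash: string }): boolean {
  return (
    key.startsWith(record.prefix) &&
    createHash("sha256").update(key).digest("hex") === record.hash
  );
}
```

Storing a plaintext prefix alongside the hash is what lets the API find the candidate row without being able to reconstruct the key itself.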

ID generation

All entity IDs use prefixed ULIDs — sortable, unique, and type-identifiable:

| Prefix   | Entity        | Example                 |
|----------|---------------|-------------------------|
| usr_     | User          | usr_01h5x3kj7p4m2n6r    |
| ws_      | Workspace     | ws_01jkxyz9a8b7c6d5     |
| key_     | API Key       | key_01h5abc123def456    |
| rt_      | Refresh Token | rt_01h5token789xyz      |
| cred_    | Credential    | cred_01h5cred456abc     |
| usage_   | Usage Record  | usage_01h5usage123      |
| audit_   | Audit Log     | audit_01h5audit789      |
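A prefixed ULID generator might look like the sketch below: a standard 26-character ULID body (48-bit millisecond timestamp plus 80 random bits, Crockford base32), lowercased to match the examples above.

```typescript
import { randomBytes } from "node:crypto";

// Crockford base32 alphabet (no I, L, O, U); ascending, so encoded
// timestamps sort lexicographically in time order.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// 48-bit millisecond timestamp -> 10 fixed-width base32 chars.
function encodeTime(time: number): string {
  let out = "";
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD[time % 32] + out;
    time = Math.floor(time / 32);
  }
  return out;
}

// 80 random bits -> 16 base32 chars (5 bits per char).
function encodeRandom(bytes: Buffer): string {
  let acc = 0, bits = 0, out = "";
  for (const b of bytes) {
    acc = (acc << 8) | b;
    bits += 8;
    while (bits >= 5) {
      bits -= 5;
      out += CROCKFORD[(acc >> bits) & 31];
      acc &= (1 << bits) - 1; // keep only the undrained low bits
    }
  }
  return out;
}

export function newId(prefix: string, time = Date.now()): string {
  return (prefix + encodeTime(time) + encodeRandom(randomBytes(10))).toLowerCase();
}
```

Because the timestamp comes first and is fixed-width, IDs with the same prefix sort by creation time.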

Database schema

Cloudflare D1 (SQLite at the edge). 9 tables total.

| Table                     | Purpose                                                  |
|---------------------------|----------------------------------------------------------|
| users                     | Accounts, plan info, Stripe customer ID                  |
| workspaces                | One per project. Tracks storage, chunk count, last sync  |
| api_keys                  | Hashed keys with prefix for lookup                       |
| refresh_tokens            | SHA-256 hashed tokens with expiry                        |
| workspace_credentials     | Scoped S3 credentials (24h expiry)                       |
| usage_daily               | Daily aggregated usage per user                          |
| audit_log                 | User action log (90-day retention)                       |
| email_verification_tokens | Email verification                                       |
| password_reset_tokens     | Password reset                                           |

All timestamps are Unix epoch integers. Migrations live in infra/migrations/.

Worker bindings

```toml
# wrangler.toml
[[d1_databases]]
binding = "DB"

[[r2_buckets]]
binding = "STORAGE"

[[kv_namespaces]]
binding = "KV"

# Secrets (set via wrangler secret put)
# JWT_SECRET
# STRIPE_SECRET_KEY
# STRIPE_WEBHOOK_SECRET
```