Architecture

ClawKeep Cloud runs entirely on Cloudflare's edge network. Zero-knowledge encryption means the server stores only ciphertext — we can't read your data even if we wanted to.

Infrastructure

| Layer    | Technology                     | Purpose                                          |
|----------|--------------------------------|--------------------------------------------------|
| API      | Cloudflare Workers (Hono)      | Auth, workspaces, billing, credential generation |
| Frontend | Next.js 14 on Cloudflare Pages | Dashboard, marketing, docs                       |
| Database | Cloudflare D1 (SQLite)         | Users, workspaces, keys, usage, audit log        |
| Storage  | Cloudflare R2 (S3-compatible)  | Encrypted backup chunks ($0 egress)              |
| Cache    | Cloudflare KV                  | Rate limiting, session data                      |
| Payments | Stripe                         | Subscriptions, usage-based billing               |

Zero-knowledge encryption

All encryption happens on your machine before data leaves it. The server stores only encrypted blobs.

  1. Key derivation — Your password is used to derive an encryption key via Argon2id. The key never leaves your device.
  2. Chunking — Files are split into content-defined chunks for deduplication.
  3. Encryption — Each chunk is encrypted with AES-256-GCM. A random nonce is generated per chunk.
  4. Manifest — A manifest file records the mapping of original files to encrypted chunks. The manifest itself is encrypted.
  5. Upload — Only ciphertext reaches the server. Chunk names are hashes, not file names.

Lost password = lost data. There is no recovery mechanism by design. This is the tradeoff for actual privacy.
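The per-chunk encryption step (3) can be sketched with Node's built-in crypto module. This is a minimal illustration, not the CLI's actual wire format: it assumes the 32-byte key has already been derived (e.g. via Argon2id, step 1), and it packs the nonce, GCM auth tag, and ciphertext into one blob for storage.

```typescript
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Encrypt one chunk with AES-256-GCM under a pre-derived 32-byte key.
// A fresh 12-byte nonce is generated per chunk, as described above.
export function encryptChunk(key: Buffer, chunk: Buffer): Buffer {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, nonce);
  const ciphertext = Buffer.concat([cipher.update(chunk), cipher.final()]);
  // Illustrative layout: nonce (12 bytes) || auth tag (16 bytes) || ciphertext
  return Buffer.concat([nonce, cipher.getAuthTag(), ciphertext]);
}

export function decryptChunk(key: Buffer, blob: Buffer): Buffer {
  const nonce = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const ciphertext = blob.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, nonce);
  decipher.setAuthTag(tag); // decryption throws if the tag doesn't verify
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

Because GCM authenticates as well as encrypts, a tampered chunk fails to decrypt rather than silently producing garbage.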

R2 storage layout

```
clawkeep-prod/
└── workspaces/
    └── ws_01jkxyz.../
        ├── manifest.enc        # Encrypted manifest
        ├── chunk-000001.enc    # Encrypted data chunks
        ├── chunk-000002.enc
        └── ...
```

Direct upload (no proxy)

The CLI does not proxy uploads through the API. Instead:

  1. CLI requests scoped S3 credentials via GET /api/workspaces/:id/credentials
  2. API returns time-limited credentials restricted to that workspace's R2 prefix
  3. CLI uploads/downloads directly to R2 using standard S3 protocol

This means backups are fast (direct to storage, no middleman) and the API never touches your data.
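As a sketch of how that prefix scoping can work, the helpers below build object keys matching the R2 layout shown earlier and check that a requested key stays inside a workspace's prefix. The function names are illustrative, not the real API.

```typescript
// Every workspace's objects live under one prefix; the scoped S3
// credentials are restricted to it.
export function workspacePrefix(workspaceId: string): string {
  return `workspaces/${workspaceId}/`;
}

// chunk-000001.enc, chunk-000002.enc, ... as in the R2 layout above.
export function chunkKey(workspaceId: string, index: number): string {
  return `${workspacePrefix(workspaceId)}chunk-${String(index).padStart(6, "0")}.enc`;
}

// The API can reject any key that escapes the workspace prefix before
// minting credentials, mirroring the restriction enforced on the R2 side.
export function isInScope(workspaceId: string, key: string): boolean {
  return key.startsWith(workspacePrefix(workspaceId)) && !key.includes("..");
}
```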

Authentication

Web dashboard (JWT)

1. POST /api/auth/login → { access_token, refresh_token }
2. access_token stored in memory (15min expiry)
3. refresh_token set as httpOnly cookie (7 day expiry)
4. On 401 → POST /api/auth/refresh → new access_token
5. Retry original request

Access tokens are JWTs signed with HS256. The payload contains user ID, email, and plan. Refresh tokens are stored as SHA-256 hashes in D1.
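The token shape can be sketched with Node's built-in HMAC. A real deployment would use a vetted JWT library; the claim names here (`sub` for user ID, `email`, `plan`) are assumptions based on the payload description above.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const b64url = (buf: Buffer): string =>
  buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");

// Mint an HS256 access token; expiry defaults to 15 minutes.
export function signAccessToken(
  secret: string,
  payload: { sub: string; email: string; plan: string },
  ttlSeconds = 15 * 60,
): string {
  const header = b64url(Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })));
  const now = Math.floor(Date.now() / 1000);
  const body = b64url(
    Buffer.from(JSON.stringify({ ...payload, iat: now, exp: now + ttlSeconds })),
  );
  const sig = b64url(createHmac("sha256", secret).update(`${header}.${body}`).digest());
  return `${header}.${body}.${sig}`;
}

// Constant-time signature check, then expiry check.
export function verifyAccessToken(secret: string, token: string): boolean {
  const [header, body, sig] = token.split(".");
  if (!header || !body || !sig) return false;
  const expected = b64url(createHmac("sha256", secret).update(`${header}.${body}`).digest());
  if (expected.length !== sig.length) return false;
  if (!timingSafeEqual(Buffer.from(expected), Buffer.from(sig))) return false;
  const claims = JSON.parse(Buffer.from(body, "base64").toString());
  return claims.exp > Math.floor(Date.now() / 1000);
}
```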

CLI (API key)

1. User creates key in dashboard → ck_live_xxxxxxxxxxxxxxxxxxxx
2. Full key shown once, bcrypt hash stored in DB
3. CLI sends key in Authorization header
4. API extracts prefix → looks up in DB → bcrypt compare
5. Request authenticated

API keys use the ck_live_ prefix + 48 random hex chars. They never expire but can be revoked from the dashboard.
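The key format and lookup flow can be sketched as follows. SHA-256 stands in for bcrypt so the example needs no third-party dependency (production hashes with bcrypt as described above), and the stored prefix length is an assumption.

```typescript
import { randomBytes, createHash } from "node:crypto";

// Generate a key in the documented format: "ck_live_" + 48 random hex chars.
export function generateApiKey(): string {
  return `ck_live_${randomBytes(24).toString("hex")}`; // 24 bytes -> 48 hex chars
}

// The DB stores only a hash plus a short prefix used as the lookup index.
export function storableRecord(key: string) {
  return {
    prefix: key.slice(0, 16), // assumed prefix length for the DB lookup
    hash: createHash("sha256").update(key).digest("hex"), // bcrypt in production
  };
}

// Look up by prefix, then compare the full-key hash.
export function verifyApiKey(key: string, record: { prefix: string; hash: string }): boolean {
  return (
    key.startsWith(record.prefix) &&
    createHash("sha256").update(key).digest("hex") === record.hash
  );
}
```

Storing a plaintext prefix alongside the hash is what lets the API find the candidate row without being able to reconstruct the key itself.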

ID generation

All entity IDs use prefixed ULIDs — sortable, unique, and type-identifiable:

| Prefix   | Entity        | Example                 |
|----------|---------------|-------------------------|
| usr_     | User          | usr_01h5x3kj7p4m2n6r    |
| ws_      | Workspace     | ws_01jkxyz9a8b7c6d5     |
| key_     | API Key       | key_01h5abc123def456    |
| rt_      | Refresh Token | rt_01h5token789xyz      |
| cred_    | Credential    | cred_01h5cred456abc     |
| usage_   | Usage Record  | usage_01h5usage123      |
| audit_   | Audit Log     | audit_01h5audit789      |
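A prefixed ULID generator might look like the sketch below: a standard 26-character ULID body (48-bit millisecond timestamp plus 80 random bits, Crockford base32), lowercased to match the examples above.

```typescript
import { randomBytes } from "node:crypto";

// Crockford base32 alphabet (no I, L, O, U); ascending, so encoded
// timestamps sort lexicographically in time order.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// 48-bit millisecond timestamp -> 10 fixed-width base32 chars.
function encodeTime(time: number): string {
  let out = "";
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD[time % 32] + out;
    time = Math.floor(time / 32);
  }
  return out;
}

// 80 random bits -> 16 base32 chars (5 bits per char).
function encodeRandom(bytes: Buffer): string {
  let acc = 0, bits = 0, out = "";
  for (const b of bytes) {
    acc = (acc << 8) | b;
    bits += 8;
    while (bits >= 5) {
      bits -= 5;
      out += CROCKFORD[(acc >> bits) & 31];
      acc &= (1 << bits) - 1; // keep only the undrained low bits
    }
  }
  return out;
}

export function newId(prefix: string, time = Date.now()): string {
  return (prefix + encodeTime(time) + encodeRandom(randomBytes(10))).toLowerCase();
}
```

Because the timestamp comes first and is fixed-width, IDs with the same prefix sort by creation time.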

Database schema

Cloudflare D1 (SQLite at the edge). 9 tables total.

| Table                     | Purpose                                                  |
|---------------------------|----------------------------------------------------------|
| users                     | Accounts, plan info, Stripe customer ID                  |
| workspaces                | One per project. Tracks storage, chunk count, last sync  |
| api_keys                  | Hashed keys with prefix for lookup                       |
| refresh_tokens            | SHA-256 hashed tokens with expiry                        |
| workspace_credentials     | Scoped S3 credentials (24h expiry)                       |
| usage_daily               | Daily aggregated usage per user                          |
| audit_log                 | User action log (90-day retention)                       |
| email_verification_tokens | Email verification                                       |
| password_reset_tokens     | Password reset                                           |

All timestamps are Unix epoch integers. Migrations live in infra/migrations/.

Worker bindings

```toml
# wrangler.toml
[[d1_databases]]
binding = "DB"

[[r2_buckets]]
binding = "STORAGE"

[[kv_namespaces]]
binding = "KV"

# Secrets (set via wrangler secret put)
# JWT_SECRET
# STRIPE_SECRET_KEY
# STRIPE_WEBHOOK_SECRET
```