Architecture — VelaOS Docs

How the four pieces of VelaOS talk to each other — and what happens when the network drops.

The four components

VelaOS system topology

┌───────────────────────────────────────────────────────────────┐
│                        ADMIN'S BROWSER                        │
│                 console.velaos.ch (Next.js 16)                │
└──────────────────────────────┬────────────────────────────────┘
                               │ HTTPS + httpOnly JWT cookie
                               │
┌──────────────────────────────▼────────────────────────────────┐
│                    VelaOS CLOUD API                           │
│              api.velaos.ch (Go + Echo v5)                     │
│                                                               │
│  ┌──────────┐  ┌───────────┐  ┌──────────┐  ┌──────────────┐  │
│  │ Supabase │  │ EMQX MQTT │  │ Upstash  │  │ Supabase     │  │
│  │ Postgres │  │  (8883)   │  │  Redis   │  │ Storage      │  │
│  └──────────┘  └─────┬─────┘  └──────────┘  └──────────────┘  │
└─────────────────────┬┴────────────────────────────────────────┘
                      │  TLS + mTLS cert per device
                      │  QoS 0 for heartbeats
                      │  QoS 1 for commands + reports
                      │
┌─────────────────────▼──────────────────────────────────────────┐
│                    VelaOS AGENT                                │
│            (Kotlin + Jetpack Compose, Device Owner)            │
│                                                                │
│    Publishes:                     Subscribes:                  │
│    /heartbeat   (60s)             /commands                    │
│    /response    (per ack)         /policy                      │
│    /policy_report  (5m)                                        │
│    /apps_report    (5m)                                        │
└────────────────────────────────────────────────────────────────┘
                      │
┌─────────────────────▼──────────────────────────────────────────┐
│                 VelaOS Base Image                              │
│   AlmaLinux 10.1 bootc · kernel 6.12 LTS · Greenboot + Podman  │
│   UEFI x86-64-v2 (primary) · aarch64 (secondary)               │
└────────────────────────────────────────────────────────────────┘

Data flows

Enrollment (one-time per device)

Device boots, contacts api.velaos.ch/api/v1/enrollment/code over HTTPS
API returns a 6-character VelaOS Code + MQTT broker address + per-device TLS cert
Device shows code on screen; admin types it into the console
Cloud moves the device from pending to approved and pushes an approved command
Device subscribes to its MQTT topics and syncs current policy
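
The pending-to-approved transition above can be sketched as a small in-memory registry. This is only an illustration: the Registry type, its method names, and the single-use handling of the code are assumptions, not the actual API implementation, which would presumably persist state in Postgres and expire codes via Redis.

```go
package main

import (
	"errors"
	"fmt"
)

// DeviceState covers the two enrollment states named in the steps above.
type DeviceState string

const (
	StatePending  DeviceState = "pending"
	StateApproved DeviceState = "approved"
)

// ErrUnknownCode is returned when a code was never issued or was already used.
var ErrUnknownCode = errors.New("unknown or already used enrollment code")

// Registry is a hypothetical in-memory stand-in for the cloud's enrollment store.
type Registry struct {
	codeToDevice map[string]string
	state        map[string]DeviceState
}

func NewRegistry() *Registry {
	return &Registry{
		codeToDevice: map[string]string{},
		state:        map[string]DeviceState{},
	}
}

// Register records a freshly booted device as pending under its VelaOS Code.
func (r *Registry) Register(deviceID, code string) {
	r.codeToDevice[code] = deviceID
	r.state[deviceID] = StatePending
}

// Approve models the admin typing the code into the console. It consumes the
// code (assumed single-use) and moves the device to approved.
func (r *Registry) Approve(code string) (string, error) {
	id, ok := r.codeToDevice[code]
	if !ok {
		return "", ErrUnknownCode
	}
	delete(r.codeToDevice, code)
	r.state[id] = StateApproved
	return id, nil
}

// State reports the current enrollment state of a device.
func (r *Registry) State(deviceID string) DeviceState { return r.state[deviceID] }

func main() {
	r := NewRegistry()
	r.Register("dev-123", "K7Q2MX") // example values, not real identifiers
	id, err := r.Approve("K7Q2MX")
	fmt.Println(id, r.State(id), err)
}
```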

Heartbeat (every 60 seconds, cloud-tunable)

Every device publishes to vela/{device_id}/heartbeat:

{
  "ts": 1744545600,
  "cpu_temp": 52.3,
  "cpu_usage": 18,
  "ram_used": 2100000000,
  "ram_total": 8000000000,
  "storage_used": 4100000000,
  "storage_total": 31000000000,
  "wifi_signal_dbm": -62,
  "thermal_zones": [{"name": "cpu", "temp_c": 52.3}],
  "cpu_governor": "schedutil",
  "fan_rpm": 1800,
  "power_sources": [{"source": "USB", "voltage_v": 5.1, "current_a": 1.9}],
  "i2c_devices": [...],
  "bluetooth_devices": [...],
  "usb_devices": [...]
}

The interval is configurable via policy.agent.heartbeat_interval_seconds.
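
On the cloud side, a consumer of this payload might decode it roughly as follows. The struct covers only a subset of the fields shown above; the type names, the Percent helper, and the rule that ts must be present are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ThermalZone mirrors one entry of the "thermal_zones" array.
type ThermalZone struct {
	Name  string  `json:"name"`
	TempC float64 `json:"temp_c"`
}

// Heartbeat holds a subset of the heartbeat fields. Fields that are not
// listed here are ignored by encoding/json during decoding.
type Heartbeat struct {
	TS            int64         `json:"ts"`
	CPUTemp       float64       `json:"cpu_temp"`
	CPUUsage      float64       `json:"cpu_usage"`
	RAMUsed       uint64        `json:"ram_used"`
	RAMTotal      uint64        `json:"ram_total"`
	StorageUsed   uint64        `json:"storage_used"`
	StorageTotal  uint64        `json:"storage_total"`
	WifiSignalDBm int           `json:"wifi_signal_dbm"`
	ThermalZones  []ThermalZone `json:"thermal_zones"`
}

// ParseHeartbeat decodes a payload and rejects it when the timestamp is
// missing (an assumed validation rule, not documented behavior).
func ParseHeartbeat(payload []byte) (Heartbeat, error) {
	var hb Heartbeat
	if err := json.Unmarshal(payload, &hb); err != nil {
		return Heartbeat{}, fmt.Errorf("decode heartbeat: %w", err)
	}
	if hb.TS <= 0 {
		return Heartbeat{}, fmt.Errorf("heartbeat is missing ts")
	}
	return hb, nil
}

// Percent returns used/total as a percentage, or 0 when total is 0.
func Percent(used, total uint64) float64 {
	if total == 0 {
		return 0
	}
	return float64(used) / float64(total) * 100
}

func main() {
	payload := []byte(`{"ts": 1744545600, "cpu_temp": 52.3, "cpu_usage": 18,
		"ram_used": 2100000000, "ram_total": 8000000000,
		"thermal_zones": [{"name": "cpu", "temp_c": 52.3}]}`)
	hb, err := ParseHeartbeat(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("RAM %.1f%% used, %d thermal zone(s)\n",
		Percent(hb.RAMUsed, hb.RAMTotal), len(hb.ThermalZones))
}
```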

Command dispatch (real-time)

Admin clicks "Reboot" in console
Console POSTs /api/v1/devices/:id/reboot
API validates RBAC (device.reboot permission), writes audit log
API publishes to vela/{device_id}/commands with QoS 1
Agent receives, executes, publishes ack to /response
Console gets ack via WebSocket within ~100ms
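
The API-side part of this flow (permission check, audit entry, QoS 1 publish) could look roughly like the sketch below. The Publisher interface, the Command payload shape, and the Dispatcher type are hypothetical; only the device.reboot permission, the topic pattern, and the QoS level come from the steps above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Publisher abstracts the MQTT client. Hypothetical interface, not a real
// library API; a production version would wrap an actual MQTT client.
type Publisher interface {
	Publish(topic string, qos byte, payload []byte) error
}

// Command is an assumed wire format for a command message.
type Command struct {
	ID   string `json:"id"`
	Type string `json:"type"`
}

// ErrForbidden is returned when the caller lacks the required permission.
var ErrForbidden = errors.New("missing device.reboot permission")

// Dispatcher ties together the RBAC check, the audit log, and the publish.
type Dispatcher struct {
	Pub   Publisher
	Audit []string // stand-in for the audit log table
}

// CommandTopic builds the per-device command topic.
func CommandTopic(deviceID string) string {
	return "vela/" + deviceID + "/commands"
}

// Reboot validates the permission, records an audit entry, then publishes
// the command with QoS 1 (at-least-once delivery).
func (d *Dispatcher) Reboot(perms map[string]bool, actor, deviceID, cmdID string) error {
	if !perms["device.reboot"] {
		return ErrForbidden
	}
	d.Audit = append(d.Audit, fmt.Sprintf("%s requested reboot of %s", actor, deviceID))
	payload, err := json.Marshal(Command{ID: cmdID, Type: "reboot"})
	if err != nil {
		return err
	}
	return d.Pub.Publish(CommandTopic(deviceID), 1, payload)
}

// Sent records one published message.
type Sent struct {
	Topic   string
	QoS     byte
	Payload []byte
}

// FakePublisher records messages in memory so the flow can run without a broker.
type FakePublisher struct{ Sent []Sent }

func (f *FakePublisher) Publish(topic string, qos byte, payload []byte) error {
	f.Sent = append(f.Sent, Sent{Topic: topic, QoS: qos, Payload: payload})
	return nil
}

func main() {
	pub := &FakePublisher{}
	d := &Dispatcher{Pub: pub}
	err := d.Reboot(map[string]bool{"device.reboot": true}, "alice", "dev-123", "cmd-1")
	fmt.Println(err, pub.Sent[0].Topic, string(pub.Sent[0].Payload))
}
```

Because QoS 1 is at-least-once, an agent may see the same command twice; carrying an ID in the payload, as assumed here, is one way to let it deduplicate.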

Policy application

Admin saves a policy in the editor
API computes the effective policy for every device (tenant default → group → device override)
API publishes policy_update commands with the merged JSON
Agent calls PolicyEngine.applyPolicy(json) — runs 75+ DPM API calls
Agent publishes a compliance report within 5 seconds
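
The layering in step 2 (tenant default, then group, then device override) amounts to a merge in which later layers win. The following is a minimal sketch that assumes policies are nested JSON objects and that nested objects are merged key by key; the real merge rules are not specified here and may differ, for example for arrays.

```go
package main

import "fmt"

// Merge returns a new map containing base overlaid with override. Nested
// objects are merged recursively; any other value in override replaces the
// value in base. Neither input map is modified.
func Merge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base)+len(override))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		if ov, ok := v.(map[string]any); ok {
			if bv, ok := out[k].(map[string]any); ok {
				out[k] = Merge(bv, ov)
				continue
			}
		}
		out[k] = v
	}
	return out
}

// Effective folds the layers in precedence order, lowest first:
// tenant default, then group, then device override.
func Effective(layers ...map[string]any) map[string]any {
	out := map[string]any{}
	for _, layer := range layers {
		out = Merge(out, layer)
	}
	return out
}

func main() {
	// Example values only; "camera_enabled" is a made-up key.
	tenant := map[string]any{
		"agent":          map[string]any{"heartbeat_interval_seconds": 60},
		"camera_enabled": true,
	}
	group := map[string]any{
		"agent": map[string]any{"heartbeat_interval_seconds": 30},
	}
	device := map[string]any{"camera_enabled": false}

	fmt.Println(Effective(tenant, group, device))
}
```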

Protocol choices

MQTT over TLS (port 8883) — chosen over gRPC and HTTP long-polling because it offers a persistent connection, tiny per-message overhead, a built-in last-will message (so we know promptly when a device disconnects), and QoS 1 for at-least-once command delivery.
Per-device mTLS cert — the EMQX broker validates every connection against a tenant-specific CA. A stolen device can't impersonate another.
HTTPS for enrollment + file downloads — enrollment needs to work before MQTT is set up. File downloads use presigned URLs (1-hour expiry).
WebSocket for console live updates — the console opens one WS to the API, which fans out device events it observed on MQTT.
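
The fan-out described in the last item can be pictured as a hub that copies each device event to every connected console session. The sketch below uses plain Go channels in place of real WebSocket connections; the Hub and Event types and the choice to drop events for slow subscribers are assumptions, not the documented behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is an assumed shape for a device event observed on MQTT.
type Event struct {
	DeviceID string
	Kind     string // e.g. "heartbeat" or "response"
}

// Hub fans events out to subscribers. Each subscriber stands in for one
// console WebSocket connection.
type Hub struct {
	mu   sync.Mutex
	subs map[int]chan Event
	next int
}

func NewHub() *Hub { return &Hub{subs: make(map[int]chan Event)} }

// Subscribe registers a subscriber with the given buffer size and returns
// its ID and receive channel.
func (h *Hub) Subscribe(buffer int) (int, <-chan Event) {
	h.mu.Lock()
	defer h.mu.Unlock()
	id := h.next
	h.next++
	ch := make(chan Event, buffer)
	h.subs[id] = ch
	return id, ch
}

// Unsubscribe removes a subscriber and closes its channel.
func (h *Hub) Unsubscribe(id int) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if ch, ok := h.subs[id]; ok {
		delete(h.subs, id)
		close(ch)
	}
}

// Publish sends the event to every subscriber without blocking. Subscribers
// whose buffers are full miss the event. It returns the number delivered.
func (h *Hub) Publish(e Event) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for _, ch := range h.subs {
		select {
		case ch <- e:
			delivered++
		default: // slow consumer: drop rather than stall the MQTT reader
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	_, console := hub.Subscribe(8)
	hub.Publish(Event{DeviceID: "dev-123", Kind: "response"})
	fmt.Println(<-console)
}
```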

Failure domains

What survives what failure

Agent process crashes — Android restarts it within 5s (foreground service)
Network drops — agent queues heartbeats, retries every 30s with backoff, applies last-known policy
Cloud API down — devices stay operational on cached policy; can't receive new commands
MQTT broker down — same as above; commands resume once broker returns
Postgres down — API returns 503; devices unaffected
Device lost entirely — 30 min offline triggers an alert; admin can remote-wipe the device or write it off
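
Two of the timings above lend themselves to a short sketch: the retry delay after a network drop and the 30-minute offline alert. The code reads "retries every 30s with backoff" as exponential backoff starting at 30 seconds, and the 10-minute cap is invented for illustration; both are assumptions, as are the function names.

```go
package main

import (
	"fmt"
	"time"
)

const (
	BaseDelay         = 30 * time.Second // first retry delay, from the list above
	MaxDelay          = 10 * time.Minute // assumed cap, not stated in the docs
	OfflineAlertAfter = 30 * time.Minute // offline alert threshold, from the list above
)

// RetryDelay returns the wait before retry number `attempt` (0-based),
// doubling from BaseDelay and never exceeding MaxDelay.
func RetryDelay(attempt int) time.Duration {
	d := BaseDelay
	for i := 0; i < attempt && d < MaxDelay; i++ {
		d *= 2
	}
	if d > MaxDelay {
		d = MaxDelay
	}
	return d
}

// ShouldAlert reports whether a device whose last heartbeat carried the given
// Unix timestamp has been silent for at least OfflineAlertAfter.
func ShouldAlert(lastHeartbeatUnix, nowUnix int64) bool {
	silent := time.Duration(nowUnix-lastHeartbeatUnix) * time.Second
	return silent >= OfflineAlertAfter
}

func main() {
	for attempt := 0; attempt <= 5; attempt++ {
		fmt.Printf("attempt %d: wait %s\n", attempt, RetryDelay(attempt))
	}
	fmt.Println("alert:", ShouldAlert(1744545600, 1744545600+1800))
}
```

A production agent would usually add random jitter to these delays so that a fleet recovering from the same outage does not reconnect in lockstep.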

Where data lives

Postgres (Supabase, eu-central-2) — tenants, devices, groups, policies, audit log, compliance results
Redis (Upstash) — session cache, rate limit counters, enrollment code TTLs
Supabase Storage — APK files, OTA images, diagnostic bundles, screenshots, wallpapers
Agent SecurePrefs (on-device) — enrollment config, last applied policy, agent tunables (encrypted with AES-256-GCM)

Next steps

Security model — Device Owner privileges, RBAC, certificate chain
Device lifecycle — every state a device can be in
MQTT topics reference — complete topic taxonomy