Security and privacy
Six articles in, you have a stack. This one is the audit pass: where the data actually lives, who can reach it, and the small set of mistakes that account for most of the trouble in setups of this shape.
What this is
Security writing tends to fail in two directions: the compliance-flavored checklist that treats every threat as equally urgent and leaves you exhausted before you've improved anything, and the lone tip — "use a strong password" — that sounds correct and changes nothing. This article walks the stack you've built across articles 2–6 the way an honest auditor would: what data sits where, what trust boundaries you've actually drawn, and which mistakes do most of the damage.
Scope: a personal or small-team deployment. The threat model assumes a single home box behind a residential connection, optionally reachable through a tunnel. Regulated-industry deployments have the same shape and stricter controls; the pro callout in §8 names the questions that change.
A threat model worth defending against
A threat model is the document — even if it lives in your head — that names who you're worried about and what they'd have to do to win. Skipping this step is how you end up with a four-factor login on a service nobody can reach and an unsecured Ollama port on the same machine.
Four adversaries are worth thinking about. The rest of the article is organized around blunting them.
- The internet scanner. An automated script working its way through IPv4 space, looking for default-configured services. They don't know who you are, and they don't need to. If port 11434 is open and answers without a password, they'll find it within hours. SentinelLabs and Censys counted 175,000 publicly-exposed Ollama hosts across 130 countries in late 2025; Cisco Talos found a sample where roughly 20% had no authentication of any kind. The defense is keeping the listener off the public network in the first place.
- The targeted attacker. Rarer, more capable: someone who knows what you run because they read about it on your blog, and is prepared to spend time. The relevant CVEs become directly applicable — Ollama's CVE-2024-37032, a path-traversal-to-RCE that affected versions before 0.1.34, is the canonical example of "default port, no auth, full takeover." The defense is patching promptly and not assuming local code is trusted code.
- The vendor. Cloudflare in the data path, OpenRouter handling cloud inference, whoever ships your Docker images. None are malicious; all can see things you might not want them to, all can be compromised, all can change their terms. The defense is knowing what you've handed each one, and minimising what you don't need to.
- You. Specifically, the version of you that copy-pastes a config off Stack Overflow at midnight, leaves a port open during a debugging session, or stores an API key in ~/.bash_history. By a wide margin the most likely cause of a small-stack breach, and the only adversary you can fully retrain.
Two adversaries deliberately left off: nation-state actors and physical access. If either is in your threat model, this article isn't the one — and the controls you'd need don't look much like a Docker stack anyway.
Where your data actually lives
Inventory before you defend. Write down every place a piece of conversation or query touches disk, and which trust zone that place is in. Mine looks like this.
- The host. Disk, RAM, journalctl. Compromise here is total: anything you've ever typed into the chat is here in plaintext.
- The Cloudflare edge. Sees plaintext while terminating TLS. Doesn't store request bodies by default, but can be asked to. Holds the auth identity of who reached you and when.
- Cloud inference providers. Each has its own data policy. OpenRouter's default is zero retention for prompts and completions; upstream providers vary by model and require their own check.
The chat database is the highest-value artifact in the stack. By default, Open WebUI stores it as a single SQLite file at /app/backend/data/webui.db inside the container, which maps to whatever host volume you mounted in article 2. It contains every conversation, every uploaded document's full text, every API key you pasted into a model connection, and the bcrypt hashes for any account that can log in. If a backup of that file leaves the house, the conversation history left with it, not "some config."
| What | Where | What it reveals |
|---|---|---|
| Chat DB | open-webui/data/webui.db | Every message, uploaded file text, stored API keys, password hashes. |
| Secret key | WEBUI_SECRET_KEY (env) | Signs JWTs and encrypts OAuth tokens at rest. Set explicitly; otherwise a new one generates per restart and everyone gets logged out. |
| Model weights | ollama/.ollama, llama/models | Nothing sensitive in themselves, but they identify what you run. |
| SearXNG cache | searxng/redis | Recent search queries. Configurable; default retention is short. |
| nginx logs | docker logs nginx | URLs visited, IPs (Cloudflare's, unless you forward the real one), timestamps. |
| Tunnel token | .env file | Anyone with this string can register themselves as the tunnel and intercept inbound traffic. |
| OpenRouter key | webui.db (chat DB) | Billing access to your account. Rotate if the DB ever leaves the host. |
WEBUI_SECRET_KEY is the env var Open WebUI uses to sign JWTs and encrypt OAuth tokens at rest. Set it persistently in your Compose file; the default of "generate a fresh one every restart" logs everyone out on each redeploy.
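One way to make the key persistent without ever overwriting it by accident is a small idempotent helper. This is a minimal sketch, not anything Open WebUI ships: `ensure_secret_key` and `random_hex_32` are hypothetical names, and it assumes the Compose file reads WEBUI_SECRET_KEY from the env file you pass in.

```shell
# Sketch: add WEBUI_SECRET_KEY to an env file once, and leave it alone afterwards.
# ensure_secret_key and random_hex_32 are hypothetical helper names.

random_hex_32() {
    # 32 random bytes as 64 hex characters. Prefer openssl, fall back to /dev/urandom.
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex 32
    else
        head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
        echo
    fi
}

ensure_secret_key() {
    env_file=$1
    touch "$env_file"
    chmod 600 "$env_file"
    # An existing key stays as it is: replacing it invalidates every session.
    if grep -q '^WEBUI_SECRET_KEY=' "$env_file"; then
        return 0
    fi
    # Make sure we append on a fresh line if the file lacks a trailing newline.
    if [ -n "$(tail -c 1 "$env_file")" ]; then
        echo >> "$env_file"
    fi
    printf 'WEBUI_SECRET_KEY=%s\n' "$(random_hex_32)" >> "$env_file"
}

# Usage: ensure_secret_key ./.env
```

Running it twice leaves a single key in place, which is the property that matters: a redeploy script can call it unconditionally.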
The host
The container hardening in the next section assumes the host underneath is reasonable. On Ubuntu 24, "reasonable" is a small list — none of it new, all of it skippable, none of it actually skippable.
SSH off the public internet. If sshd is reachable from anywhere except your LAN, password auth is off and key auth is the only path. The PasswordAuthentication no setting in /etc/ssh/sshd_config is the one that matters; the rest is variations on a theme. If you don't need SSH from outside the house, bind it to the LAN address and leave it there — the tunnel from article 6 doesn't need it.
# /etc/ssh/sshd_config: the lines that matter
PasswordAuthentication no
PermitRootLogin no
# bind to LAN only if you don't need WAN SSH
ListenAddress 192.168.x.x
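Editing the file is not the same as changing what sshd does, because drop-in files under /etc/ssh/sshd_config.d/ can override it. A hedged sketch for checking the effective settings: `sshd -T` prints the resolved configuration with lowercase keywords, and the hypothetical helper `check_sshd_effective` reads that output on stdin.

```shell
# Sketch: verify the two settings above against sshd's effective configuration.
# check_sshd_effective is a hypothetical helper that reads `sshd -T` output on stdin.

check_sshd_effective() {
    awk '
        $1 == "passwordauthentication" { pw = $2 }
        $1 == "permitrootlogin"        { root = $2 }
        END {
            bad = 0
            if (pw != "no")   { print "FAIL passwordauthentication=" pw; bad = 1 }
            if (root != "no") { print "FAIL permitrootlogin=" root; bad = 1 }
            if (!bad) print "OK"
            exit bad
        }
    '
}

# Usage (needs root): sudo sshd -T | check_sshd_effective
```

The function exits non-zero on any failure, so it can gate a provisioning script.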
Firewall on, default deny. UFW is fine for this. The interesting move is that with Cloudflare Tunnel in front, you don't need any inbound ports open at all on the WAN side — cloudflared dials out. Inside the LAN, allow the few things you actually use (SSH from your laptop's subnet, maybe a Samba share) and deny the rest.
$ sudo ufw default deny incoming
$ sudo ufw default allow outgoing
$ sudo ufw allow from 192.168.0.0/16 to any port 22
$ sudo ufw enable
Unattended security upgrades. The default Ubuntu package is configured for security-only updates and is the right floor. You can argue with the policy at the margins; the version where you never run it loses by a wide margin. Verify it's actually running — it's been known to silently fail when the unattended-upgrades log fills the disk.
$ sudo apt install unattended-upgrades
$ sudo dpkg-reconfigure --priority=low unattended-upgrades
$ sudo unattended-upgrade --dry-run --debug | tail -20
A user that isn't root and isn't in docker. Membership in the docker group is functionally equivalent to root — the daemon will happily mount the host filesystem into a container for you. That's a useful default for the operator account, but it means any process that compromises a shell as that user has won. If you run other services on this box (a media server, a browser session), give them their own user.
The thing that doesn't show up on most checklists but matters here: watch your disk. Open WebUI's database and nginx logs will grow without bound if you let them. A full disk takes down the tunnel, the chat interface, and unattended upgrades in one go, and recovering from it at 11pm is no fun.
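A disk check can be as small as the sketch below. The helper names and the 85% threshold in the usage line are assumptions; it relies only on `df -P`, whose fifth column is the used percentage.

```shell
# Sketch: warn when a filesystem crosses a usage threshold.
# disk_used_pct and disk_over_threshold are hypothetical helper names.

disk_used_pct() {
    # df -P prints POSIX-format columns; column 5 looks like "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

disk_over_threshold() {
    path=$1
    limit=$2
    used=$(disk_used_pct "$path")
    # Returns 0 (and prints a warning) when usage is at or above the limit.
    if [ "$used" -ge "$limit" ]; then
        echo "WARN $path is ${used}% full (limit ${limit}%)"
        return 0
    fi
    return 1
}

# Usage, for example from cron: disk_over_threshold / 85 && echo "time to prune logs"
```

Pointing it at the mount that holds the Open WebUI volume and the Docker log directory covers the two growers named above.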
The stack
On a single-tenant box, container security is less about isolating workloads from each other and more about reducing the blast radius when one image has a bad day.
No host ports on internal services. Article 6 already establishes this for nginx: the only public-facing piece is cloudflared, which dials out. Ollama, llama-swap, SearXNG, and Open WebUI itself need no ports: entry in the Compose file. They reach each other over the Docker network by service name. The 11434 exposures in the Cisco study were not exotic attacks; they were defaults that someone forgot to override.
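A quick way to audit this is to list every service that still publishes a port. The sketch below is deliberately crude: `services_with_ports` is a hypothetical helper, and it assumes a Compose file with two-space indentation and service names directly under a top-level services: key. A real YAML parser is the better tool if your file is unusual.

```shell
# Sketch: print the name of each Compose service that declares a ports: key.
# Assumes two-space indentation; services_with_ports is a hypothetical helper.

services_with_ports() {
    awk '
        /^services:/ { in_services = 1; next }      # enter the services block
        /^[^ #]/     { in_services = 0 }            # any other top-level key leaves it
        in_services && /^  [A-Za-z0-9_.-]+:/ {      # a service name at indent 2
            svc = $1
            sub(/:.*/, "", svc)
        }
        in_services && /^ +ports:/ { print svc }    # a ports: key inside that service
    ' "$1"
}

# Usage: services_with_ports docker-compose.yml
```

An empty result is the goal for this stack. Anything it prints deserves a reason you can say out loud.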
The default to never accept
If Ollama runs on the host (not in Docker), set OLLAMA_HOST=127.0.0.1:11434 explicitly. It binds to localhost by default, but the moment you set 0.0.0.0 to test something from another machine and forget to revert it, the box is in the next scanner sweep. There is no built-in auth.
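To catch the forgotten 0.0.0.0 from the inside, you can filter the listening sockets for wildcard binds. A minimal sketch: `wildcard_listeners` is a hypothetical helper that reads the output of `ss -tlnH` (TCP, listening, numeric, no header), where the fourth field is the local address and port.

```shell
# Sketch: show TCP listeners bound to every interface instead of loopback or one LAN address.
# wildcard_listeners is a hypothetical helper that reads `ss -tlnH` output on stdin.

wildcard_listeners() {
    # Field 4 is Local Address:Port. Match 0.0.0.0, [::] and * binds only.
    awk '$4 ~ /^(0\.0\.0\.0|\[::\]|\*):[0-9]+$/ { print $4 }'
}

# Usage: ss -tlnH | wildcard_listeners
```

A wildcard bind is not automatically an exposure, since the firewall may still block it, but every line in the output should be one you expected. 0.0.0.0:11434 is not one of those.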
Secrets in .env, not in docker-compose.yml. Compose references like ${OPENROUTER_API_KEY} pull from a .env file that should be chmod 600 and excluded from version control. If you ever commit a key by accident, rotating it is a five-minute job; if you ever commit the chat database, the only honest fix is to consider those messages public.
# .gitignore — the floor
.env
.env.*
!.env.example
data/
*.db
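The permission half of that rule is easy to check mechanically. A small sketch, with hypothetical helper names, that compares the mode string from `ls -ld` against owner-only read/write:

```shell
# Sketch: fail loudly if a secrets file is readable by anyone but its owner.
# perm_string and check_private are hypothetical helper names.

perm_string() {
    # First ten characters of `ls -ld`, for example "-rw-------".
    ls -ld "$1" 2>/dev/null | cut -c1-10
}

check_private() {
    perms=$(perm_string "$1")
    if [ "$perms" != "-rw-------" ]; then
        echo "FAIL $1 has permissions '$perms', expected -rw-------"
        return 1
    fi
    echo "OK $1"
}

# Usage: check_private .env
# To confirm git ignores the file as well: git check-ignore -q .env && echo ignored
```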
Pinned images, considered updates. Most services in this series ship with the :latest tag, on the assumption that staying current beats running last quarter's CVE list. That's defensible for a single-user box, and stops being defensible the moment something downstream depends on a specific behavior. The realistic middle: pin where breakage hurts (Open WebUI, the briefing system, anything custom); track latest on the edge and infrastructure pieces (cloudflared, nginx) where the security win is staying current.
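To see which images are floating, a rough filter over the Compose file is enough. This sketch treats an image as unpinned when it has no tag or the tag is latest, and as pinned when it carries a digest or any other tag. `unpinned_images` is a hypothetical helper, and it assumes one `image:` key per line.

```shell
# Sketch: list image references that have no tag or use :latest.
# unpinned_images is a hypothetical helper; it assumes "image: ref" on a single line.

unpinned_images() {
    awk '
        $1 == "image:" {
            ref = $2
            gsub(/"/, "", ref)                 # strip optional double quotes
            if (ref ~ /@sha256:/) next         # digest-pinned, nothing to report
            n = split(ref, parts, "/")
            last = parts[n]                    # name[:tag] without registry or org
            if (index(last, ":") == 0 || last ~ /:latest$/) print ref
        }
    ' "$1"
}

# Usage: unpinned_images docker-compose.yml
```

The output is not a list of problems. It is the list of images whose update cadence you have delegated to the publisher, which should match the edge and infrastructure pieces you chose on purpose.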
The supply-chain wrinkle. Pulling updates fast means patching CVEs fast. It also means pulling poisoned releases fast, and the last two years have a clear pattern of those. The XZ Utils backdoor was a multi-year social-engineering operation against a single overworked maintainer, caught in March 2024 after it had reached Debian unstable and testing but before it shipped in a stable release. The Shai-Hulud worm self-propagated through npm starting September 2025, eventually compromising over 500 packages by stealing maintainer tokens and using them to publish to whichever further packages those tokens could reach. In both cases the supply chain itself was the attack, and "auto-update" was the propagation mechanism. These are the named incidents in a much busier background: the "Mini Shai-Hulud" wave in May 2026 swept up another 314 npm packages by the same token-republish pattern, and smaller compromises land between the named ones often enough that they no longer reach the front page.
For a personal stack, "stop updating" loses more than it gains. The practical move is to know what you depend on, and at what cadence you'd rather take which risk.
- Pull from publishers you can name. cloudflare/cloudflared, nginx, searxng/searxng have real reputations to defend. A random someuser/cool-tool on Docker Hub is code with full container privileges running in your house, and that channel is where almost every small-stack supply-chain compromise has landed.
- Delay non-critical updates a beat. Continuous auto-pull turns a maintainer-account compromise into a compromise on your box in minutes. A nightly or twice-weekly docker compose pull stretches those minutes to 12–72 hours, usually enough for a high-profile incident to surface in the news before your machine redeploys. Cron a pull job; don't trigger one on every push.
- Read release notes for anything that matters. Five minutes per non-trivial bump on Open WebUI, the briefing system, cloudflared. You won't spot a deliberate backdoor in CHANGELOG.md, but you'll catch the obvious red flags (unfamiliar new dependencies, a sudden broader permission ask) that are worth waiting a few days on.
The cron pull pattern from article 2 (nightly docker compose pull && docker compose up -d) sits at the right point on this curve for a personal stack. The category of tool that auto-redeploys on every upstream push — popular in homelab circles a few years ago — optimizes for the wrong tradeoff in the current threat environment. Blast radius of a compromised upstream is now the dominant cost, and closing the window from minutes to next-day is most of the win.
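For reference, the nightly pull can live in a short wrapper so cron has one thing to call and the log has timestamps. This is a sketch under assumptions: the function name, the DOCKER override (handy for a dry run with DOCKER=echo), and the paths in the comments are all hypothetical.

```shell
# Sketch: pull new images and restart the stack, once per invocation.
# pull_and_restart is a hypothetical helper; DOCKER can be overridden for a dry run.

pull_and_restart() {
    stack_dir=$1
    docker_bin=${DOCKER:-docker}
    (
        # Run in a subshell so the caller's working directory is untouched.
        cd "$stack_dir" || exit 1
        echo "$(date +%FT%T) pulling images in $stack_dir"
        # Only restart if the pull succeeded.
        "$docker_bin" compose pull && "$docker_bin" compose up -d
    )
}

# Usage (assumed path): pull_and_restart /srv/stack
# Example crontab line for a script that calls it at 03:30 every night:
# 30 3 * * * /usr/local/bin/stack-pull.sh >> /var/log/stack-pull.log 2>&1
```

The schedule is the security control here. Moving the crontab line to twice a week widens the window in which an upstream compromise can surface before your box redeploys.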
Image signing with Sigstore/cosign and SLSA build provenance are the next layer of defense — the registry can prove the image you pulled was built from the source you expected, by the org you expected. For an individual stack it's premature; for a team running anything customer-facing on this same pull-and-restart pattern, it's where the next level of seriousness lives.
The Docker socket is the crown jewel. Anything with /var/run/docker.sock mounted into it can spawn privileged containers and read every other container's volumes. Container management UIs and most auto-update daemons want this. If a container has the socket, that container is part of your trusted compute base, and a CVE in it is equivalent to a host compromise. Treat the list of socket-mounted containers as a list to keep short and review.
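Producing that list takes one pipeline. In this sketch `socket_mounted` is a hypothetical filter; the `docker inspect` template in the usage comment prints each running container's name followed by its mount sources, and the filter keeps the containers whose mounts include docker.sock.

```shell
# Sketch: name the containers that have the Docker socket mounted.
# socket_mounted is a hypothetical helper. stdin: one line per container,
# "name mount-source mount-source ...".

socket_mounted() {
    awk '/\/docker\.sock/ { print $1 }'
}

# Usage:
# docker ps -q | xargs -r docker inspect \
#     --format '{{.Name}} {{range .Mounts}}{{.Source}} {{end}}' | socket_mounted
```

It only sees running containers, so run it after a full docker compose up rather than in the middle of a redeploy.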
Power user: a defense-in-depth pattern for Open WebUI
Three layers, each one independently sufficient for the most common attack class:
- Cloudflare Access in front: an unauthenticated request never reaches your box. This is the layer that defends against scanner-driven exploitation of a future Open WebUI CVE.
- Open WebUI's own auth behind it: defends against the lateral attacker — someone you let through Access (a family member) who shouldn't be able to read your conversations.
- Disable signup after creating the admin (ENABLE_SIGNUP=false). Open WebUI disables it automatically when WEBUI_ADMIN_PASSWORD is set and no users exist, but the explicit env var is worth setting anyway, so a future you who edits the config won't wonder.
An honest gap: Open WebUI stores API keys for downstream providers in its database in a form that's encrypted with WEBUI_SECRET_KEY. That key is plaintext in your Compose env. If both the database and the env file leave the host together — a single careless backup — the keys leave with them. Backup encryption (next section) is the answer; rotating downstream keys after any suspected DB exposure is the discipline.
The edge
With a tunnel in front of the stack, your security model includes a vendor whose product you mostly don't see. What Cloudflare actually sees and does, in this configuration:
Plaintext at the edge. TLS terminates at Cloudflare, which is what makes the dashboard work and what makes Access policies enforceable. It also means every request body — every prompt, every uploaded document — exists in plaintext on a Cloudflare server, briefly, before being re-encrypted to your tunnel. Cloudflare does not, by their stated policy, retain request bodies. They can be configured to (Logpush of HTTP request fields is a paid feature, off by default). That gap between "by default" and "by policy" is the one you accept when you put a CDN in front of your stack. If you can't accept it, article 6 named the alternatives.
What Access logs. Successful and failed auth events, the email that authenticated, the IP they came from, the hostname they reached. This is the layer that gives you "someone tried to get in from Romania at 3am" rather than "something happened." Keep it; it's the only forensic record you've got at the edge.
Session length is a real choice. The default of 24 hours is convenient. A shorter session — 8 hours, an hour — means a stolen device gets you fewer hours of access before re-auth. The actual question is how often you're willing to re-auth on your own devices, because that's the cost. I run 24h and accept the trade; "30 days" is the answer I'd argue against.
Geographic and device posture rules. Access can require a request to come from a specific country or set of countries, from a Cloudflare WARP-enrolled device, or from a known device certificate. For a single-user setup this is over-engineered; for a small team where every member is in one or two known countries, blocking everywhere else costs you nothing and removes most automated traffic in one rule.
Backups, and the part nobody backs up
The risk that gets the most security attention is the breach. The risk that actually happens is the disk failure, the rm -rf in the wrong directory, the upgrade that ate the database. The audit pass isn't done until you have a backup you've actually restored from.
For this stack, three categories of data, each with a different policy.
- Back up: the Open WebUI data volume (chat history, settings, API keys), your docker-compose.yml and nginx/conf.d/, the .env file, any llama-swap config, the SearXNG settings.yml. Together this is a few hundred megabytes and rebuilds the whole stack.
- Don't bother: model weights. They're 4–60GB each, deterministically re-downloadable, and storing them in a backup is paying for cold storage of public files. Keep a list of which models you'd want to pull back; that's the actual restore artifact.
- Treat as crown jewels: the chat database, encrypted before it leaves the host. For anything else, the cost of a leak is small. The chat database is everything you've ever asked an AI to read for you.
A working pattern: restic for the encryption and incrementals, an off-box destination (S3-compatible, a Backblaze B2 bucket, an attached drive that isn't always attached). The point of off-box is ransomware: anything that can write to your backup destination from the same box can encrypt it. If your stack runs as a docker-group user, a backup that the same user can also overwrite is one compromise away from being useless.
# crude but useful: a daily backup of just the data that matters
# run as a user that can write the repo but can't delete versions
restic -r b2:om-backups:stack backup \
    ./open-webui/data \
    ./searxng/settings.yml \
    ./nginx/conf.d \
    ./docker-compose.yml \
    ./.env \
    --tag stack --tag "$(date +%F)"

# retention deletes snapshots, so it needs a credential that can delete;
# run it separately from the append-only backup job
restic -r b2:om-backups:stack forget --keep-daily 14 --keep-monthly 6 --prune
One step that's easy to skip and costs nothing: restore once, on a different host, before you trust the backup. The number of backups that look correct and aren't is a steady fraction of all backups ever taken; the only way to find out which kind you have is to do the dry run.
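A restore test can be partly automated. The sketch below assumes you have already restored the latest snapshot into a scratch directory; `verify_restore` and `is_sqlite_file` are hypothetical helpers. It checks only that a webui.db exists in the restored tree and begins with the SQLite file header, a floor for the test and no substitute for opening the database.

```shell
# Sketch: sanity-check a restored tree before trusting the backup.
# is_sqlite_file and verify_restore are hypothetical helper names.

is_sqlite_file() {
    # Every SQLite 3 database file begins with the ASCII bytes "SQLite format 3".
    [ "$(head -c 15 "$1" 2>/dev/null)" = "SQLite format 3" ]
}

verify_restore() {
    restore_dir=$1
    # restic restores full paths under the target, so search instead of guessing.
    db=$(find "$restore_dir" -type f -name webui.db | head -n 1)
    if [ -z "$db" ]; then
        echo "FAIL no webui.db under $restore_dir"
        return 1
    fi
    if ! is_sqlite_file "$db"; then
        echo "FAIL $db does not look like a SQLite database"
        return 1
    fi
    echo "OK $db"
}

# Usage (scratch path is an assumption):
# restic -r b2:om-backups:stack restore latest --target /tmp/restore-test
# verify_restore /tmp/restore-test
```

If the sqlite3 CLI is installed on the test host, running PRAGMA integrity_check against the restored file is a stronger follow-up, and starting a throwaway Open WebUI container on it is stronger still.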
The mistakes that bite
A checklist organized by what actually goes wrong in setups of this shape. Skim it once after a build; revisit it after any change that touches the network or the env file.
- Ollama listening on 0.0.0.0. Most common single misconfiguration in this category. curl http://your-wan-ip:11434/api/tags from a phone on cellular tells you in one second whether you've done this.
- WEBUI_SECRET_KEY unset. Sessions should survive restarts; if yours don't, the key is being regenerated each boot. Set it in .env, generate it with openssl rand -hex 32, never rotate it casually.
- Signup left enabled. Once your admin exists, set ENABLE_SIGNUP=false. If Open WebUI is behind Access, this is belt-and-braces; if it isn't, it's the only thing between a bored attacker and a working account.
- OpenRouter or model keys in chat messages. Pasted into a "how do I" question, then forgotten. They're in the chat DB now, in plaintext to anyone who reads it. Rotate any key that has ever been in a message; the chat search is the first place an attacker with DB access will look.
- A .env with the wrong permissions. chmod 600 .env, owned by your user. Default 644 means every user on the box (and every container that maps that path) can read your API keys.
- Cloudflare Access "Bypass" policy left after debugging. A bypass rule disables auth for matching requests. Useful for a public webhook; catastrophic if you forgot why you added it. Audit the application's policies after any change.
- OpenRouter logging on for the 1% discount. The default is off; the discount toggle changes that. The cost of the discount is your prompts and completions being retained for training-data purposes. Their ZDR policy is the thing to keep on if you'd ever want to put non-trivial work through the API.
- Backups on the same disk. An rsync to /backup on the same drive is theater, not backup. Off-box, encrypted, append-only at the destination if you can manage it.
- Letting model weights and config drift. If you rebuild from a backup six months from now and your Open WebUI version has bumped, the schema migration runs. Usually fine; occasionally not. Pin the Open WebUI version in Compose, and keep a note of which version each backup was taken against.
- Trusting a Docker image you've never read about. A random image: someuser/cool-tool:latest pulled from Docker Hub is code with full container privileges. The supply-chain attacks of the last few years have generally landed through exactly this channel. Stick to official images, well-known maintainers, or fork-and-build for anything else.
- Redeploy-on-every-push update loops. Auto-pull tools that ship every upstream push the moment it lands close the patch window to seconds, and the supply-chain window to the same seconds. The nightly cron pull from article 2 plus a quick scan of release notes on any non-trivial bump is the cadence anything load-bearing wants.
Where this fits
Security posture is the thing you can't bolt on after the fact. The stack you've built has good bones for it: outbound-only at the edge, no host ports on internal services, identity at the gate.
Article 8 — image generation — adds ComfyUI to the stack. It's another service that takes user input and produces files, and another candidate for an exposed hostname.
For a small team or organization
The threat model shifts in one direction when other people use the box: every adversary in §1 gains a new attack surface (a coworker's laptop, a contractor's account, a former employee's session) and the "you" category gets bigger because there are more of you. Four questions to settle before standardising this for a team:
- Identity provider. Email PIN is fine for one or two people. For five or fifty, you want Access wired to whatever directory the org already runs — Google Workspace, Microsoft Entra ID, Okta. De-provisioning a leaver should be a single change in the directory, not five rules in Cloudflare.
- Data residency and retention. If your sector has rules about where prompts can travel and how long they can live, the audit answer is concrete: chat data is at rest in the host country and passes in plaintext through the Cloudflare edge POPs your traffic transits; prompts to OpenRouter go to whichever upstream provider the model belongs to. Those answers map to actual jurisdictions and actual policies; both should be in the vendor file.
- Compliance attestation. Cloudflare publishes a compliance hub with current SOC 2, ISO 27001, GDPR, and sector-specific status. OpenRouter is a thin layer over upstreams; the attestation that matters is each upstream's. Don't conflate "the model provider is HIPAA-eligible" with "your traffic to them is covered" — the latter requires a BAA you've actually signed.
- Supply-chain posture. The right cadence isn't "patch immediately" or "never patch" — it's a written policy that says which dependencies update on which schedule, who owns the call, and what triggers an emergency push. At minimum: a 24–48 hour delay on non-critical updates so a high-profile compromise has time to surface; signed images verified at pull time where the upstream supports it; and a short list of the maintainers and orgs you actually trust upstream of the registry. The next round of attacks won't look like the last; the policy is what keeps the response consistent.
A modest investment most teams skip: write the operational policy down in one page. Who has admin. Who can add a user. What gets backed up and where. What you'd do if a laptop with an active Access session were stolen. The doc itself is the artifact; the process of writing it is what catches the gaps.