Stack Series · 2 of 9 · Practical · May 2026

Open WebUI: your AI interface

The chat window is the part you actually live in. A Docker Compose builder for Open WebUI, the first-launch gotchas worth knowing, and a three-way fork at the end: stay on cloud through OpenRouter, add Ollama for ease, or go to llama.cpp for control.

Curtis Smith · OptiMoss.ai · part of the Stack Series

§ 0

What this is

Open WebUI is a self-hosted chat interface — a web app you run on your own machine that talks to language models. It does not include a model itself; it provides the surface around one: conversation history, multiple models in tabs, file uploads, tool use, image generation hooks, RAG against your own documents, plugin support, multi-user accounts if you need them.

The stack has three layers. The interface is what you click. The backend is what runs the model. The model is the file itself. This article handles the first layer. Articles 3 and 4 handle the second. The third — picking which model — gets a treatment of its own.

Open WebUI runs as a Docker container, ships as a single image, and stores its data in a named volume. One command to start it, one command to update it, no Python toolchain on the host. If you can run Docker, you can run this.

§ 1

Before you start

A machine you control. Linux is the path of least resistance — I run Ubuntu 24.04 on a Dell Precision 7750. Open WebUI also runs fine on macOS and Windows under Docker Desktop. A small home server, a desktop workstation, or even a capable laptop will all do.
Docker and Docker Compose. If you don't have them, the Docker docs are the right starting point. Compose is bundled with current Docker installs as docker compose.
A few gigabytes of disk. The image is around 1.5 GB. Conversation history, uploaded files, and any RAG knowledge bases live in a Docker volume that grows with use.
No GPU needed for this article. Open WebUI itself doesn't run models — it talks to a backend. The hardware question comes up in article 3 when you choose where the model actually lives.
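A quick way to confirm the Docker prerequisite is to ask both tools for their versions. The sketch below assumes a POSIX shell; the function name check_docker is hypothetical and chosen only for this example.

```shell
# Report whether Docker and the Compose plugin are available on this host.
check_docker() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker: not installed"
    return 1
  fi
  # Print the engine version, then the Compose plugin version.
  docker --version
  docker compose version || { echo "compose plugin: missing"; return 1; }
}
```

Running check_docker should print two version lines on a machine that is ready; anything else points back to the Docker docs.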

If you don't have a machine to run this on yet, you can still follow along. The Compose file you generate below will work the moment you do.

§ 2

The Compose file

The shape of the install is small: one service, one volume, four environment variables of any consequence. The builder below produces a Docker Compose service block — not a full docker-compose.yml. Subsequent articles will add services (Ollama, SearXNG, ComfyUI) to the same file, so think of this as the first entry in a growing composition.

Open WebUI · Compose service builder

Project directory

Where the docker-compose.yml lives. I use ~/openwebui because the whole stack accretes there over time. /opt/openwebui is also fine on a server.

Network exposure

Where the service listens. Localhost-only is the safe default — paired with a Cloudflare Tunnel (article 6) you get remote access without opening a port on your router. LAN-accessible is fine for a single-machine home setup behind a trusted network.

Localhost only (recommended): Binds to 127.0.0.1:3000. Reachable from the host machine only.
LAN-accessible: Binds to 0.0.0.0:3000. Other devices on your network can reach it directly.

Host port

Port the interface answers on. 3000 is the default and matches Open WebUI's documentation. Change it only if something else on the host already uses 3000.

Auto-updates

Open WebUI ships frequently — multiple releases a month is normal. The trade-off is real: an auto-pull can drop a settings reset or a breaking change on a Saturday morning, and a hostile push to an upstream image becomes your problem in the time it takes the updater to fire. A nightly host-side cron pull is the middle ground the series settles on — slow enough that a high-profile compromise has time to surface before you redeploy, fast enough that ordinary patches don't drift. Article 7 §4 covers the supply-chain reasoning.

A note on divergence: the official Open WebUI docs still recommend Watchtower for this job. Watchtower was archived in December 2025 and no longer ships fixes for its own CVEs, and it wants the Docker socket, a privilege you'd rather not hand a stale dependency. A short crontab entry gets you the same nightly cadence without either problem, which is why the builder here defaults to it.

Nightly cron pull: One line in your crontab; the snippet is appended to the output below.
Manual updates only: You'll run docker compose pull yourself when you want a new version.
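As a rough illustration of the nightly cron option, the sketch below builds one crontab entry and prints it. The 04:00 schedule, the ~/openwebui project directory, and the variable names are assumptions, not the builder's actual output.

```shell
# Assumed project directory; adjust if your compose file lives elsewhere.
PROJECT_DIR="$HOME/openwebui"

# 04:00 every night: pull a newer image if one exists, then let
# `up -d` recreate the container only when the image has changed.
CRON_ENTRY="0 4 * * * cd $PROJECT_DIR && docker compose pull --quiet && docker compose up -d"

# Print the entry so it can be pasted into the editor opened by `crontab -e`.
printf '%s\n' "$CRON_ENTRY"
```

The entry belongs in the crontab of a user who is allowed to run Docker; cron runs with a minimal PATH, so an absolute path to docker may be needed on some hosts.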

docker-compose.yml

Save this as docker-compose.yml inside your project directory. If you chose the localhost-only bind, you'll reach the interface at http://localhost:3000 from the same machine. For LAN access, swap in the machine's IP.
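For orientation, a minimal file of this shape might look like the sketch below, written out with a shell heredoc. It assumes the localhost-only bind, host port 3000, and the project's published image, container port, and data path. The environment variables the builder sets are not listed in this article, so the sketch leaves them out rather than guess. The explicit volume name is also an assumption.

```shell
# Assumed project directory, matching the mkdir step in this section.
PROJECT_DIR="${PROJECT_DIR:-$HOME/openwebui}"
COMPOSE_FILE="$PROJECT_DIR/docker-compose.yml"
mkdir -p "$PROJECT_DIR"

if [ -e "$COMPOSE_FILE" ]; then
  # Never clobber a file that is already there.
  echo "$COMPOSE_FILE already exists; leaving it untouched"
else
  # The quoted 'EOF' stops the shell from expanding anything inside the YAML.
  cat > "$COMPOSE_FILE" <<'EOF'
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      # Localhost-only bind; use "3000:8080" for LAN access instead.
      - "127.0.0.1:3000:8080"
    volumes:
      - open-webui:/app/backend/data

volumes:
  open-webui:
    # Pin the name so Compose does not prefix it with the project name.
    name: open-webui
EOF
  echo "wrote $COMPOSE_FILE"
fi
```

Running docker compose config in the project directory afterwards is a cheap way to confirm the YAML parses before starting anything.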

$ mkdir -p ~/openwebui && cd ~/openwebui
# save the generated compose file as docker-compose.yml, then:
$ docker compose up -d

The first start pulls the image (a minute or two on a reasonable connection) and initializes the database. Subsequent starts are immediate.

§ 3

First launch, and one thing to know

Open http://localhost:3000 in a browser. You'll see a sign-up screen.

The first account is the admin

Open WebUI has no separate setup wizard for the owner: whoever creates the first account on a fresh install becomes the administrator, with full control over settings, models, and other users. If this is a shared machine, sign up before anyone else gets to the URL. Subsequent sign-ups are regular users by default, and you can disable open registration entirely from the admin settings.

Once you're in, the interface is a chat window. It will tell you there are no models configured yet — that's expected. Before connecting a model, two settings are worth visiting:

Admin Panel → Settings → General: Disable "Enable New Sign Ups" if this is a single-user setup, or set it to require admin approval for a small-team setup.
Admin Panel → Settings → Interface: The defaults are fine. Worth knowing this is where you'd later turn on titles, follow-up suggestions, or tool calling in the chat.

Power user: where your data lives

Everything Open WebUI persists (accounts, conversation history, uploaded files, RAG document collections, model configurations, API keys) lives in the open-webui named Docker volume. On Linux, a volume with exactly that name resolves to /var/lib/docker/volumes/open-webui/_data/; note that Compose prefixes the project name to a volume (for example openwebui_open-webui) unless the file sets an explicit name. The application database is a single SQLite file at webui.db inside that directory.

For backups, snapshotting the volume directory while the container is stopped is the simplest reliable approach. Hot-copying the SQLite file usually works but is technically racy; sqlite3 webui.db ".backup '/path/out.db'" is the correct invocation if you don't want to stop the service.
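The stopped-container approach can be sketched as a small shell function. The function name, the default ~/openwebui and ~/backups paths, and the use of a throwaway alpine container to run tar are all assumptions for illustration. It also assumes the volume is named exactly open-webui.

```shell
# Cold backup: stop the service, archive the volume, start it again.
# Usage: backup_openwebui [project_dir] [output_dir]
backup_openwebui() {
  project_dir="${1:-$HOME/openwebui}"
  out_dir="${2:-$HOME/backups}"
  stamp="$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$out_dir"

  # Stop the container so SQLite is not mid-write during the copy.
  (cd "$project_dir" && docker compose stop) || return 1

  # Mount the volume read-only in a temporary container and tar its contents.
  docker run --rm \
    -v open-webui:/data:ro \
    -v "$out_dir":/backup \
    alpine tar czf "/backup/open-webui-$stamp.tar.gz" -C /data .
  status=$?

  # Bring the service back up whether or not the archive step succeeded.
  (cd "$project_dir" && docker compose start)
  return "$status"
}
```

Calling backup_openwebui with no arguments leaves a timestamped archive in ~/backups; restoring is the reverse, extracting the archive into an empty volume before starting the service.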

Article 7 covers a real backup routine. For now, knowing where the data is means you know what to copy if you ever migrate machines.

§ 4

Connecting your first model

Open WebUI talks to two kinds of backends: native Ollama (article 3) and anything that speaks the OpenAI-compatible API. The second category is broad — OpenRouter, OpenAI itself, Anthropic via a proxy, llama.cpp's server, vLLM, LM Studio, anything else following the same wire format.

The fastest path to a working chat is an OpenRouter key. OpenRouter is an aggregator: one account, one billing relationship, and access to most current models behind a single endpoint. It's also the path I recommend to readers who want to evaluate the interface before committing to local inference.

In the admin panel:

Go to Admin Panel → Settings → Connections.
Under OpenAI API, click the + to add a connection.
Set the URL to https://openrouter.ai/api/v1 and paste in an OpenRouter key.
Save. Open WebUI will fetch the model list — a few hundred entries, including current frontier models alongside DeepSeek V4, Gemini 3 Flash, and a long tail of open-weights options.
Back in the main chat, the model picker now lists everything OpenRouter exposes. Filter or favorite the handful you'll actually use; the full list is overwhelming and you don't need it.
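If the model list does not appear, it can help to make the same kind of request by hand. OpenAI-compatible backends expose a models listing under the base URL, so a curl call against the URL from the steps above is a reasonable reachability check. The function names and the OPENROUTER_API_KEY variable name are this sketch's own choices, and the listing may succeed even when the key is wrong, so it tests the endpoint more than the key.

```shell
# Base URL from the connection step; override it to test another backend.
OPENAI_BASE_URL="${OPENAI_BASE_URL:-https://openrouter.ai/api/v1}"

# Fetch the raw JSON model list.
list_models() {
  # -f: fail on HTTP errors, -sS: stay quiet but still print errors.
  curl -fsS "$OPENAI_BASE_URL/models" \
    -H "Authorization: Bearer ${OPENROUTER_API_KEY:-}"
}

# Rough count: tally the "id" fields in the JSON response.
count_models() {
  list_models | grep -o '"id"' | wc -l
}
```

With the key exported in the shell, count_models should print a number in the same range as the picker shows; a curl error instead suggests a network or URL problem rather than an Open WebUI one.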

At this point you have a working private AI chat interface that you control. Conversations stay on your machine. The model lives on someone else's, but the interface, the history, the settings — all yours.

For a team or organization

Open WebUI's account model — admin plus user roles, optional admin approval on signup, per-user API key control — is enough for a small group running off a single deployment. There are no license fees; the project's license is permissive (with some branding clauses worth a quick read for organizational use), so the cost is the hardware and whatever inference you route through.

What it doesn't provide is identity-provider integration in the box. SAML or OIDC sign-on is on the roadmap and partially supported via OAuth providers in current builds, but a team that needs SSO from day one should treat that as a question to verify, not a given. For a small group, the password-account model is workable; for a larger one, the right answer might be putting Cloudflare Access in front of it (article 6) and treating Open WebUI's own auth as a second layer.

§ 5

The fork: where do you go from here?

Open WebUI is running. The model is wherever you point it. Three reasonable next steps, depending on what you want the stack to be.

Path A · Stay on cloud

You already finished

If your OpenRouter connection works and the model quality is what you want, you have a working stack. Conversations are local; inference is remote but routed through endpoints you can audit. Reasonable place to live.

Use OpenRouter's Zero Data Retention filter if data handling matters — it restricts your traffic to providers contractually obligated not to retain it.

Cost: pay-per-token · Setup time remaining: zero · Privacy: cloud-policy bound

Path B · Add Ollama

Local inference, easy mode

Run a model on your own hardware. Ollama handles model downloads, quantization, and GPU allocation, and exposes an API that Open WebUI connects to automatically. It is the fastest route to "everything stays on your machine."

Suits anyone with a GPU that has at least 8 GB of VRAM, or a recent Apple Silicon Mac with 16 GB+ of unified memory. Smaller models also run on CPU, slowly.

Cost: hardware you may already own · Setup time: ~30 min · Privacy: nothing leaves the box

Path C · llama.cpp directly

Local inference, more control

llama.cpp is the inference engine Ollama wraps. Talking to it directly costs more setup effort and gets you finer control over context length, batch size, KV cache quantization, and the long tail of flags that matter when you're trying to fit a larger model on tight VRAM.

The right choice if you've outgrown Ollama's defaults — or if you're running multiple models with on-demand swapping (I use llama-swap for this).

Cost: same hardware, more time · Setup time: ~1–2 hours · Privacy: nothing leaves the box

Path A is genuinely done. Path B is the next article; path C follows. Most readers should follow path B first and only move to C when something in Ollama actively gets in the way — that's how I got there.

§ 6

Where this fits

The Compose file you just created is the first plank of a longer build. The articles that follow add services to the same file — Ollama for local models, SearXNG for web-aware conversations, Cloudflare Tunnel for remote access, ComfyUI for image generation. Each one is additive: the same project directory, the same docker compose up -d, one more block in the YAML.

1. Why self-host?
2. Open WebUI: your AI interface (this article)
3. Ollama: local models, easy setup
4. llama.cpp: higher performance, more control
5. SearXNG: web-aware conversations
6. Remote access: Cloudflare Tunnel
7. Security and privacy
8. Image generation
9. Open WebUI in practice