Stack Series · 5 of 9 · Practical · May 2026

SearXNG: web-aware conversations

Give your chat interface the web without giving a third party your search history. A self-hosted metasearch engine that aggregates results from elsewhere, returns clean JSON to Open WebUI, and doesn't keep an account on you — with an honest section on the upstream rate limits that show up once agentic chat starts hammering the public engines.

Curtis Smith · OptiMoss.ai · part of the Stack Series

§ 0

What this is

SearXNG is a metasearch engine. It doesn't have an index of its own; it forwards your query to other engines — DuckDuckGo, Bing, Brave, Reddit, Hacker News, Hugging Face, whatever you've turned on — collects the results, de-duplicates them, and hands back one merged list. It runs as a small Python web app, ships as a Docker image, and exposes a JSON API that other software can call.

The point of running your own copy is that the upstream engines see queries from your server, not from an account that has any history with them. There's no SearXNG-side log either, by default. The cost is that you are now responsible for one more service, and that the upstream engines occasionally fight back — which the article gets to in section 5.

In an OptiMoss stack, SearXNG sits between Open WebUI and the open web. When you turn on web search in the chat, Open WebUI fans the model's queries out through SearXNG, fetches the top few results, and hands them to the model as context. The model writes an answer informed by what's on the web right now, not just what was in its training data a year ago.

§ 1

Why this fits in the stack

Open WebUI's web search feature supports a long list of backends: Brave, Tavily, Serper, Google PSE, Bing API, DuckDuckGo, and SearXNG among them. Most of the others want either an API key with a per-query quota or a contract with the provider. Two reasons SearXNG is the default I recommend:

No rate limit you have to budget around. Agentic chats can easily fire off ten or twenty queries in a single turn while a model is working through a multi-step research task. With a paid API, that's metered. With SearXNG, it's HTTP calls to a container on the same machine, and those queries reach the upstream engines without your account attached.
Nothing about you flows out. No API key tying queries back to your identity, no account history accumulating with a vendor, no terms of service to read. The queries still go to upstream engines, but as unauthenticated traffic from your server's IP address rather than authenticated traffic from your account.

The trade is that your IP is the one talking to those upstream engines, and a few of them notice when the same address asks a lot of questions quickly. Section 5 covers what happens then.

§ 2

Installing

One Compose service. Add it to the same docker-compose.yml the rest of the stack is in, alongside open-webui and whichever inference engine you chose.

SearXNG · Compose service builder

Host port

Where SearXNG answers on the host. Open WebUI reaches it over the Docker network and doesn't need a host port at all, but exposing one (bound to localhost) is useful for hitting the search UI directly in a browser to test queries. I use 8003 to keep it clear of common defaults.

Network exposure

SearXNG should not be reachable from outside the host — its limiter is off in the config below, on the assumption that only Open WebUI (and you) are talking to it.

Localhost only (recommended). Bind to 127.0.0.1; LAN-side devices cannot reach it.
LAN-accessible. For when you want to use the search UI from other devices on a trusted network. Turn the limiter back on if you do this.

Hardening

Drop all Linux capabilities except the three SearXNG actually needs. Good hygiene for a service that talks to the open web.

Drop unneeded capabilities (recommended).
Leave defaults. Simpler config; slightly broader container surface.

Append to docker-compose.yml
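A minimal sketch of the service for the recommended choices above (localhost-only on host port 8003, capabilities dropped). The image name, the /etc/searxng mount path, and the three capabilities kept (CHOWN, SETGID, SETUID) are assumptions based on the upstream Docker setup rather than anything stated in this article; check them against the current SearXNG Docker documentation before relying on them.

```yaml
# Goes under the existing top-level "services:" key, next to open-webui.
services:
  searxng:
    image: docker.io/searxng/searxng:latest   # assumed image name and tag
    container_name: searxng
    restart: unless-stopped
    ports:
      - "127.0.0.1:8003:8080"                 # host port 8003, localhost only
    volumes:
      - ./searxng-config:/etc/searxng         # assumed config path in the image
    cap_drop:
      - ALL
    cap_add:                                  # assumed set of three capabilities
      - CHOWN
      - SETGID
      - SETUID
```

For the LAN-accessible option, the port line would become "8003:8080". For the leave-defaults option, the cap_drop and cap_add keys would simply be omitted.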

Before the first start, the service needs a settings.yml in ./searxng-config/. The next section handles that. Then:

$ mkdir -p ~/openwebui/searxng-config
# save settings.yml from the next section into that directory
$ docker compose up -d searxng

§ 3

The settings file

SearXNG is configured through a single YAML file. The image ships sensible defaults; the file below overlays the three or four things that actually matter for using it as a backend rather than a public search site.

# searxng-config/settings.yml
use_default_settings: true

server:
  secret_key: "REPLACE_WITH_A_RANDOM_HEX_STRING"
  limiter: false        # no rate limiting — internal use only
  image_proxy: false

search:
  safe_search: 0
  default_lang: "en"
  formats:
    - html
    - json              # required for the Open WebUI API call

outgoing:
  request_timeout: 6.0

Three knobs to know:

secret_key signs cookies and a few internal tokens. Generate one with openssl rand -hex 32 and paste it in over the placeholder. SearXNG refuses to start if the key is still its shipped default (ultrasecretkey); it will accept the REPLACE_WITH_A_RANDOM_HEX_STRING placeholder above without complaint, so replacing it is on you.
limiter: false turns off the rate limiter SearXNG ships with, which is intended to deter scraping if you put the UI on the public internet. For an internal backend on localhost, the limiter just gets in the way of bursty agentic queries. Leave it off only if you trust everything that can reach the port.
formats: [html, json] enables JSON output, which isn't on by default. Open WebUI uses the JSON endpoint, so this line is what makes the whole thing work. Without it you'll see a 403 Forbidden from SearXNG and a cryptic error from Open WebUI.
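One way to confirm the JSON format is live before wiring anything to it is to query the endpoint directly from the host. The sketch below uses only the Python standard library. It assumes the host port 8003 from the previous section and the /search endpoint with a format=json parameter; the function names are illustrative and not part of SearXNG.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Return the JSON search URL for a SearXNG instance at base_url."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def top_results(payload: dict, limit: int = 3) -> list[tuple[str, str]]:
    """Pick (title, url) pairs from a decoded SearXNG JSON response."""
    results = payload.get("results", [])
    return [(r.get("title", ""), r.get("url", "")) for r in results[:limit]]


def check_json_endpoint(base_url: str = "http://localhost:8003") -> None:
    """Query the instance once and print the first few results."""
    url = build_search_url(base_url, "searxng")
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        # An HTTP error at this point usually means "json" is missing
        # from search.formats in settings.yml.
        print(f"HTTP {err.code} from {url}")
        return
    except urllib.error.URLError as err:
        print(f"Could not reach {base_url}: {err.reason}")
        return
    for title, link in top_results(payload):
        print(f"{title}\n  {link}")
```

Calling check_json_endpoint() on the host after docker compose up should print a few titles and URLs. An HTTP error instead points back at the formats list.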

Picking which upstream engines to query

The engines: block lets you enable or disable individual upstream sources. The default mix is broad — most major engines, several specialty sources, a few image and video options — and works fine on day one. Two changes I made and have kept:

Disabled Google and Bing's text/image/news/video variants. Both detect bot traffic aggressively and a SearXNG instance that leans on them ends up half-broken. The other engines fill the gap.
Enabled the Reddit, Hacker News, and Hugging Face engines. The first two surface community discussion that the major engines bury; the third turns SearXNG into a model browser the chat can call from inside a conversation.

The format is one block per engine:

engines:
  - name: google
    engine: google
    disabled: true
  - name: reddit
    engine: reddit
    disabled: false
  - name: huggingface
    engine: huggingface
    disabled: false

The engine reference lists everything available. Pick the half-dozen that suit your queries; the rest can stay off.
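After editing the engines: block and restarting the container, it can help to confirm which engines the running instance actually has switched on. The sketch below assumes the instance exposes a /config endpoint returning JSON with an "engines" list whose entries carry "name" and "enabled" fields; treat both the endpoint and the field names as assumptions to verify against your version.

```python
import json
import urllib.request


def enabled_engines(config: dict) -> list[str]:
    """Return the sorted names of engines marked enabled in a /config payload."""
    return sorted(
        engine["name"]
        for engine in config.get("engines", [])
        if engine.get("enabled")
    )


def fetch_config(base_url: str = "http://localhost:8003") -> dict:
    """Download the instance's /config document (assumed endpoint)."""
    url = f"{base_url.rstrip('/')}/config"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

With the container running, print(enabled_engines(fetch_config())) should list reddit and huggingface and omit google if the block above took effect.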

§ 4

Wiring to Open WebUI

The chat interface needs to know two things: that web search is on, and where the search server lives.

In Open WebUI: Admin Panel → Settings → Web Search.
Toggle Enable Web Search.
Set Web Search Engine to searxng.
Set the SearXNG Query URL to http://searxng:8080/search?q=<query>. That is the Docker service name searxng on the container's internal port 8080, not the host port from the Compose file. Open WebUI adds the JSON format parameter to the request itself, so the URL does not need one.
Save. The toggle next to the chat input is now active.

Send a question that needs current information — "what shipped in Open WebUI this week" is a good test — with web search turned on. The chat picks up search results before the model writes its answer, and the citations show up in the response.

One per-model setting matters a lot here, and the default gets it wrong. In Workspace → Models → (your model) → Advanced Params, set Function Calling to Native. The default is prompt-based — Open WebUI injects an XML tool-use protocol into the system prompt and parses the output. It works on simple cases and falls down on the harder ones: missed calls, malformed arguments, no support for interleaving (the model can't think, search, read a result, then search again). Native uses the provider's actual tool API. On any model from the last year — Gemma 4, Qwen 3.6, GPT-5.5, Claude Opus 4.7 — web search starts behaving the way the marketing pages imply it always did. The setting is per-model; do it for every model you use with tools. The closer in this series, article 9, covers a handful of other Open WebUI defaults worth flipping.

If Open WebUI runs outside Docker

Use the host port instead: http://localhost:8003/search?q=<query> if Open WebUI is on the same machine; http://<host-ip>:8003/search?q=<query> from another machine on the LAN, and only if you bound the port LAN-side in the Compose file. Loosening the localhost binding has a real cost — see the warning below.

§ 5

The honest tradeoff: upstream rate limits

"No rate limits" is true for SearXNG's API. It isn't true for the upstream engines it queries. The major engines have bot detection, and if your IP starts producing queries that look automated — which agentic chat traffic genuinely is — some of them will throttle you or stop responding for a while.

In practice this shows up two ways. Either the chat returns an answer that's noticeably sparser than usual (one or two of the configured engines came back empty), or Open WebUI surfaces an error from SearXNG saying every engine timed out. The first is barely noticeable; the second is mostly cosmetic and goes away in a few minutes. The fix is structural rather than reactive: configure several engines so the loss of any one isn't fatal, and disable the ones that fight back hardest. Google and Bing are the obvious ones to drop from the default mix.
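To decide which engines to drop, it helps to see which ones are failing on a given query. The sketch below reads a decoded JSON search response and reports the engines SearXNG flagged as unresponsive, along with how many results each working engine contributed. It assumes the response carries an "unresponsive_engines" list of [engine, reason] pairs and that each result has an "engines" list or an "engine" field; the sample payload is made up for illustration.

```python
from collections import Counter


def unresponsive_engines(payload: dict) -> dict[str, str]:
    """Map each engine that failed to the reason SearXNG reported."""
    failures = {}
    for entry in payload.get("unresponsive_engines", []):
        # Each entry is assumed to be an [engine_name, reason] pair.
        if len(entry) >= 2:
            failures[str(entry[0])] = str(entry[1])
    return failures


def results_per_engine(payload: dict) -> Counter:
    """Count how many results each engine contributed."""
    counts = Counter()
    for result in payload.get("results", []):
        # A merged result may list several engines; fall back to the single field.
        engines = result.get("engines") or [result.get("engine", "unknown")]
        counts.update(engines)
    return counts


# Hypothetical response, trimmed to the fields used above.
sample = {
    "results": [
        {"engine": "brave", "engines": ["brave", "duckduckgo"]},
        {"engine": "reddit"},
    ],
    "unresponsive_engines": [["google", "timeout"]],
}
print(unresponsive_engines(sample))      # {'google': 'timeout'}
print(dict(results_per_engine(sample)))  # {'brave': 1, 'duckduckgo': 1, 'reddit': 1}
```

An engine that appears in the first dictionary across many queries is a candidate for disabled: true in settings.yml.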

Don't expose this to the open internet

With the limiter off and no authentication, a public SearXNG instance is an open proxy that lets anyone burn your IP's reputation with upstream engines until those engines block you. Bind to localhost. If you need remote access for your own use, route it through Cloudflare Tunnel + Cloudflare Access (article 6), not by opening the port.

§ 6

What web-aware chat actually feels like

The straightforward case: ask the model something time-sensitive, click the web-search toggle, get an answer that includes a handful of citations from real sources. That's the part that justifies the install.

The case I didn't expect to love: dropping a URL into a chat and asking the model to summarize the article behind it. Open WebUI fetches the page, extracts the text, and passes it through. Combine that with web search and a competent local model and you get a research session that holds together (read this, find related sources, summarize, compare) with only the searches and page fetches leaving the machine you're sitting at.

The case that surprised me by being practical: scheduled chats. Open WebUI's Automations feature runs a saved prompt on a cron and posts the answer to the chat — a model can hit SearXNG every morning for the news beats I follow and have a briefing waiting by the time I sit down. No paid search API, no per-query meter, no extra service. Article 9 walks through the setup and a few other Open WebUI moves the documentation tends to bury.

For a team or organization

SearXNG behind Open WebUI is a defensible posture for "let employees ask AI questions that involve the open web" without sending every query to a vendor whose terms of service govern how those queries are retained. Two things to know before signing off on it:

Upstream traffic is still observable. Queries leave your network as HTTPS requests to engines like DuckDuckGo and Brave. They don't carry user identifiers, and the encryption hides the query text in transit, but a network observer can still see which engines your egress IP is contacting and how often, and each engine sees the full query. For sensitive work, route the SearXNG container through whatever egress posture the rest of your traffic uses.
One IP, all queries. Everyone's chat traffic exits through the same public address. This is a feature for privacy (no per-user accounts upstream) and a liability for scale (more users means more chance of triggering bot detection). For small teams the math works; for hundreds of seats, plan for a fallback or a paid search API as a second option.
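The scaling concern can be put in rough numbers. The sketch below is a back-of-envelope estimate, not a measurement: it assumes every query SearXNG receives is forwarded to every enabled engine, and the user count and turns per hour in the example are hypothetical inputs chosen only to show the arithmetic.

```python
def upstream_requests_per_hour(
    users: int,
    turns_per_user_per_hour: int,
    queries_per_turn: int,
    enabled_engines: int,
) -> dict[str, int]:
    """Estimate hourly request volume leaving the shared egress IP."""
    # Every chat turn fans out into several SearXNG queries.
    searxng_queries = users * turns_per_user_per_hour * queries_per_turn
    return {
        # Each engine sees one request per SearXNG query (assumption).
        "per_engine": searxng_queries,
        "total": searxng_queries * enabled_engines,
    }


# Hypothetical team: 10 users, 2 agentic turns an hour each,
# 15 queries per turn, 6 engines enabled.
print(upstream_requests_per_hour(10, 2, 15, 6))
# {'per_engine': 300, 'total': 1800}
```

The per-engine figure is the one bot detection reacts to, and it grows linearly with seats, which is why a fallback matters at larger team sizes.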

§ 7

Where this fits

The stack now has a model, a chat interface, and a way to reach the open web on demand. What remains is the operational layer — getting at the stack from outside the house, the security posture that goes with that, and a couple of additions (image generation, the briefing system) that show what the foundation enables. Article 6 starts with remote access through Cloudflare Tunnel.

1 Why self-host?
2 Open WebUI: your AI interface
3 Ollama: local models, easy setup
4 llama.cpp: higher performance, more control
5 SearXNG: web-aware conversations (this article)
6 Remote access: Cloudflare Tunnel
7 Security and privacy
8 Image generation
9 Open WebUI in practice