Ollama + Open WebUI¶
Status: Deployed ✅
Date: 2026-04-12
URL: ai.exzentcg.com
What is Ollama?¶
Ollama is a self-hosted AI model server that runs open-source LLMs locally. Open WebUI provides a ChatGPT-style web interface on top of it. Together they give the team a private AI assistant — data never leaves the server.
Deployment Details¶
| Property | Value |
|---|---|
| CT ID | 104 |
| Hostname | ollama |
| IP | 192.168.0.54 |
| Ollama port | 11434 (API) |
| Open WebUI port | 8080 (web interface) |
| Docker images | ollama/ollama:latest + ghcr.io/open-webui/open-webui:main |
| RAM | 8192 MB |
| Swap | 4096 MB |
| CPU | 6 cores |
| Disk | 50 GB |
| DNS | 1.1.1.1 |
| Public URL | https://ai.exzentcg.com |
| Auth | Cloudflare Access (owner-only, 24h session) |
| Model | Qwen 3.5 0.8B (~1GB) |
| Snapshot | initial-deploy |
Hardware Constraints¶
The Proxmox host is an HP EliteDesk 800 G5 Mini with:

- Intel i7-9700 (8 cores, UHD 630 iGPU)
- 16GB RAM total, of which only ~12GB is available after other containers
- 256GB NVMe

With 8GB allocated to this container, only small models (around 1B parameters) run comfortably; models up to about 4B fit but leave little headroom. For 7B+ models, add a second 16GB DDR4 SODIMM (~$25); the G5 has one empty slot.
| Model | Size | RAM needed | Status |
|---|---|---|---|
| Qwen 3.5 0.8B | 1GB | ~2GB | ✅ Installed |
| TinyLlama 1.1B | 637MB | ~2GB | Available to pull |
| Phi-3 mini 3.8B | 2.3GB | ~4GB | Fits but tight |
| Qwen 3.5 4B | 3.4GB | ~5GB | Borderline |
| Anything 7B+ | 4GB+ | 8GB+ | Needs RAM upgrade |
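
Before pulling a larger model, it can help to compare the "RAM needed" figure from the table against what the container currently has free. The following is a minimal sketch, not part of the original deployment: `model_fits` is a hypothetical helper name, and it assumes it runs inside CT 104, where `/proc/meminfo` reports a `MemAvailable` line.

```shell
# model_fits NEEDED_MB [AVAILABLE_MB]
# Prints "fits" or "too large". AVAILABLE_MB defaults to MemAvailable
# from /proc/meminfo (Linux only), converted from kB to MB.
model_fits() {
  needed_mb=$1
  available_mb=${2:-$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)}
  if [ "$needed_mb" -le "$available_mb" ]; then
    echo "fits"
  else
    echo "too large"
    return 1
  fi
}

# Example (run inside the container): the table lists ~5GB for Qwen 3.5 4B.
#   model_fits 5120
```

Passing the second argument explicitly makes it possible to test a planned allocation, for example `model_fits 8192 8192` for a 7B model against the current 8192 MB limit. A result of "fits" only means the number is within the limit; it says nothing about the memory Open WebUI and the rest of the system also need.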
Deployment Log¶
Step 1 — Create LXC¶
pct create 104 local:vztmpl/debian-12-standard_12.12-1_amd64.tar.zst \
--hostname ollama \
--cores 6 \
--memory 8192 \
--swap 4096 \
--rootfs local-lvm:50 \
--net0 name=eth0,bridge=vmbr0,ip=192.168.0.54/24,gw=192.168.0.1,firewall=1 \
--nameserver 1.1.1.1 \
--features nesting=1 \
--onboot 1 \
--start 1
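
To confirm the container was created with the intended limits, the values can be read back from `pct config 104` on the Proxmox host. The sketch below assumes that command prints one `key: value` pair per line; `config_value` is a hypothetical helper, not a Proxmox command.

```shell
# config_value KEY
# Reads "key: value" lines on stdin and prints the value for KEY.
config_value() {
  awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}

# On the Proxmox host (not executed here):
#   pct status 104
#   pct config 104 | config_value memory   # expected to match --memory above
#   pct config 104 | config_value cores    # expected to match --cores above
```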
Step 2 — Install Docker + Deploy¶
apt update && apt upgrade -y && apt install curl -y
curl -fsSL https://get.docker.com | sh
mkdir -p /opt/ollama && cd /opt/ollama
/opt/ollama/docker-compose.yml:
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "192.168.0.54:11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0
      - OLLAMA_NUM_PARALLEL=2
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_KEEP_ALIVE=5m
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "192.168.0.54:8080:8080"
    volumes:
      - webui_data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
volumes:
  ollama_data:
  webui_data:
docker compose up -d
docker exec ollama ollama pull qwen3.5:0.8b
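
One way to confirm the pull succeeded without opening the web UI is to query Ollama's `/api/tags` endpoint, which lists locally available models. The sketch below is an illustration rather than part of the original log: `has_model` is a hypothetical helper that does a plain text match on the JSON response instead of parsing it properly.

```shell
# has_model NAME
# Reads an /api/tags JSON response on stdin and prints "present" if a
# model with exactly this name is listed, otherwise "missing".
has_model() {
  if tr -d '[:space:]' | grep -F "\"name\":\"$1\"" >/dev/null; then
    echo "present"
  else
    echo "missing"
    return 1
  fi
}

# From inside CT 104 or another host the firewall allows (not executed here):
#   docker compose ps
#   curl -fsS http://192.168.0.54:11434/api/tags | has_model qwen3.5:0.8b
```

The match is on the full name including the tag, so `has_model qwen3.5` would report "missing" even when `qwen3.5:0.8b` is installed.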
Step 3 — Firewall¶
CT 104 firewall — /etc/pve/firewall/104.fw:
[OPTIONS]
enable: 1
[RULES]
IN ACCEPT -source +edge_gw -p tcp -dport 8080 # edge-gateway proxy to Open WebUI
IN ACCEPT -source +admin_desktop -p tcp -dport 22 # SSH
IN ACCEPT -source +admin_desktop -p tcp -dport 8080 # Direct LAN access
IN ACCEPT -source +admin_desktop -p tcp -dport 11434 # Ollama API direct access
IN ACCEPT -source 192.168.0.52 -p tcp -dport 11434 # n8n can call Ollama API
IN ACCEPT -source 192.168.0.53 -p tcp -dport 11434 # Homepage widget access
IN ACCEPT -source 192.168.0.53 -p tcp -dport 8080 # Homepage status check
IN DROP
OUT ACCEPT -dest +router_gw -p udp -dport 53
OUT ACCEPT -dest +router_gw -p tcp -dport 53
OUT ACCEPT -p udp -dport 53
OUT DROP -dest +lan_subnet
OUT ACCEPT -p tcp -dport 443
Edge-gateway 101.fw — added: OUT ACCEPT -dest 192.168.0.54 -p tcp -dport 8080 (proxy to Open WebUI)
n8n 102.fw — added: OUT ACCEPT -dest 192.168.0.54 -p tcp -dport 11434 (n8n → Ollama API)
Homepage 103.fw — added: OUT ACCEPT -dest 192.168.0.54 -p tcp -dport 8080 (status check)
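
The pattern above is that every new client needs two rules: an inbound rule in 104.fw and a matching outbound rule in the client's own .fw file. As a minimal sketch of that pairing, the hypothetical helpers below print both lines in the same syntax used in this section; they only generate text and do not modify any firewall file.

```shell
# Address of CT 104, taken from the deployment table.
OLLAMA_IP=192.168.0.54

# inbound_rule CLIENT_IP PORT: line to add to /etc/pve/firewall/104.fw
inbound_rule() {
  printf 'IN ACCEPT -source %s -p tcp -dport %s\n' "$1" "$2"
}

# outbound_rule PORT: line to add to the client container's .fw file
outbound_rule() {
  printf 'OUT ACCEPT -dest %s -p tcp -dport %s\n' "$OLLAMA_IP" "$1"
}

# Example: reproduce the n8n pair shown above.
#   inbound_rule 192.168.0.52 11434
#   outbound_rule 11434
```

When adding the inbound line by hand, place it above `IN DROP` in 104.fw; this assumes rules are evaluated top to bottom, so a rule placed after the drop would never match.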
Step 4 — NPM Proxy Host¶
| Field | Value |
|---|---|
| Domain Names | ai.exzentcg.com |
| Scheme | http |
| Forward Hostname / IP | 192.168.0.54 |
| Forward Port | 8080 |
| Block Common Exploits | ✅ |
| Websockets Support | ✅ |
| SSL | None |
Step 5 — Cloudflare Tunnel Route¶
| Field | Value |
|---|---|
| Subdomain | ai |
| Domain | exzentcg.com |
| Service | http://localhost:80 |
| HTTP Host Header | ai.exzentcg.com |
Step 6 — Cloudflare Access¶
| Field | Value |
|---|---|
| Application | ollama |
| Subdomain | ai |
| Domain | exzentcg.com |
| Session | 24 hours |
| Policy | owner-only |
Step 7 — Verification¶
- https://ai.exzentcg.com → Cloudflare Access → Open WebUI ✅
- Homepage dashboard shows Ollama with green status dot ✅
Step 8 — Snapshot¶
pct snapshot 104 initial-deploy --description "Ollama + Open WebUI with Qwen 3.5 0.8B"
Usage¶
Web UI¶
Visit https://ai.exzentcg.com, create an account (first user becomes admin), select Qwen 3.5 0.8B from the model dropdown, and chat.
Pull more models¶
pct enter 104
docker exec ollama ollama pull <model_name>
# Examples:
# docker exec ollama ollama pull tinyllama
# docker exec ollama ollama pull phi3:mini
# docker exec ollama ollama pull qwen3.5:4b
n8n integration¶
In n8n, use the HTTP Request node to call Ollama's API:
POST http://192.168.0.54:11434/api/generate
Content-Type: application/json
{
  "model": "qwen3.5:0.8b",
  "prompt": "Your prompt here",
  "stream": false
}
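
The same request can be tried from a shell before wiring it into n8n, which helps separate firewall problems from workflow problems. This is a sketch under assumptions: the helper names are hypothetical, the JSON escaping covers only backslashes and double quotes in single-line prompts, and the call must come from a host that 104.fw allows on port 11434 (the admin desktop or n8n).

```shell
# json_escape TEXT
# Escapes backslashes and double quotes for use inside a JSON string.
# Single-line text only; newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_generate_payload MODEL PROMPT
# Prints the request body for POST /api/generate with streaming disabled.
build_generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# ollama_generate MODEL PROMPT
# Sends the payload to the Ollama API on CT 104 and prints the raw JSON reply.
ollama_generate() {
  build_generate_payload "$1" "$2" |
    curl -fsS -H 'Content-Type: application/json' \
      -d @- http://192.168.0.54:11434/api/generate
}

# Example (not executed here):
#   ollama_generate qwen3.5:0.8b "Your prompt here"
```

With `"stream": false` the reply is a single JSON object whose `response` field holds the generated text, which is the field to map in the n8n node.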
Troubleshooting¶
| Problem | Cause | Fix |
|---|---|---|
| Open WebUI loads but no models listed | Ollama container not running or not connected, or no model pulled yet | Check docker ps for the ollama container, then run docker exec ollama ollama list; if the list is empty, pull a model |
| Very slow responses | Model too large for available RAM, swapping | Use a smaller model or add RAM |
| 502 Bad Gateway | Containers down | Run pct enter 104 on the Proxmox host, then inside the container: cd /opt/ollama && docker compose up -d |
| Homepage status grey | CT 103 can't reach CT 104:8080 | Check both 103.fw outbound and 104.fw inbound rules |
| n8n can't reach Ollama API | Firewall blocking | Check 102.fw outbound and 104.fw inbound for port 11434 |