
Local & Self-Hosted Endpoints

Any OpenAI-compatible server works out of the box using the /custom-endpoints command category. Popular options:

| Server | Notes |
| --- | --- |
| Ollama | Easiest local LLM setup; enable OpenAI-compat mode |
| KoboldCPP | GGUF models; OpenAI-compat mode built in |
| LM Studio | GUI-based; exposes a local /v1 server |
| vLLM | High-throughput GPU serving |
| LiteLLM | Unified proxy over many backends |
| ChatMock | Local OpenAI-compat bridge for Codex CLI |

Configure via /custom-endpoints in Discord, pointing at your local endpoint URL (e.g. http://192.168.1.10:11434/v1).
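Before registering a URL, it can help to confirm that the server really answers on an OpenAI-style route. The following is a minimal sketch, not part of TomoriBot: it assumes the server implements the conventional `GET <base>/models` listing that OpenAI-compatible servers usually provide, and the function names are hypothetical.

```python
import json
import urllib.request


def normalize_base_url(base_url: str) -> str:
    """Strip trailing slashes so path joins stay predictable."""
    return base_url.rstrip("/")


def models_request(base_url: str, api_key: str = "") -> urllib.request.Request:
    """Build a GET request for the OpenAI-style model listing route."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Many local servers ignore the key; it is only sent when provided.
        headers["Authorization"] = f"Bearer {api_key}"
    url = f"{normalize_base_url(base_url)}/models"
    return urllib.request.Request(url, headers=headers)


def list_model_ids(base_url: str, api_key: str = "", timeout: float = 5.0) -> list[str]:
    """Return the model ids the server reports; raises on network or HTTP errors."""
    with urllib.request.urlopen(models_request(base_url, api_key), timeout=timeout) as resp:
        payload = json.load(resp)
    # Assumes the usual {"data": [{"id": ...}, ...]} response layout.
    return [entry["id"] for entry in payload.get("data", [])]
```

Calling `list_model_ids("http://192.168.1.10:11434/v1")` from the machine that runs the bot should return a non-empty list if the endpoint is reachable and OpenAI-compatible.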

TomoriBot ships a ready-to-use Anima v1 ComfyUI workflow for txt2img and img2img. Use /help custom-endpoint to learn how to create your own TomoriBot-compatible ComfyUI workflow for images and videos.

Three reference FastAPI wrapper servers are included, each exposing a /synthesize endpoint that TomoriBot calls for native Discord voice messages. All three support voice cloning.

| Engine | Folder | Model | Strength |
| --- | --- | --- | --- |
| Chatterbox | servers/tts/chatterbox/ | Chatterbox Turbo | English, lightweight, expressive bracket tags |
| Qwen3-TTS | servers/tts/qwen3tts/ | Qwen3-TTS 1.7B Base | Large but accurate multilingual reference-audio cloning (RECOMMENDED) |
| IrodoriTTS | servers/tts/irodoritts/ | Irodori-TTS 500M v2 | Japanese-focused reference-audio cloning, styles with emojis |

Each folder contains a server.py and requirements.txt. Start the server, then register it in Discord with /provider custom-endpoints add (capability: speech). Upload a short reference audio clip via /speech voice-add and assign it to a persona with /speech voice-assign. The clip can be in any audio format (TomoriBot automatically converts it to mono WAV), but it is strongly recommended to use a 10-20 second clip with no background music.
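The clip-length recommendation above can be checked locally before uploading. This is a rough sketch using only the Python standard library; it reads WAV input only (TomoriBot itself accepts other formats), and the helper names and default thresholds are illustrative assumptions taken from the 10-20 second guidance.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read channel count, sample rate, and duration from WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "seconds": frames / rate,
        }


def clip_warnings(info: dict, min_s: float = 10.0, max_s: float = 20.0) -> list[str]:
    """Return human-readable notes when a clip falls outside the suggested range."""
    notes = []
    if info["seconds"] < min_s:
        notes.append(f"clip is {info['seconds']:.1f}s; at least {min_s:.0f}s is suggested")
    if info["seconds"] > max_s:
        notes.append(f"clip is {info['seconds']:.1f}s; at most {max_s:.0f}s is suggested")
    if info["channels"] != 1:
        # Not an error: the bot converts to mono itself, this is informational.
        notes.append("clip is not mono; it will be downmixed on upload")
    return notes
```

An empty list from `clip_warnings(describe_wav(open("voice.wav", "rb").read()))` means the clip matches the suggested length and is already mono; `voice.wav` is a placeholder file name. Background music cannot be detected this way and still needs a listen.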

ElevenLabs is also supported as a cloud TTS/STT option via /speech elevenlabs.

A reference WhisperX server is included for transcribing audio attachments sent to TomoriBot.

  • Server script: servers/stt/whisperx_server.py
  • Exposes the standard OpenAI /v1/audio/transcriptions endpoint shape
  • Compatible alternatives: whisper.cpp HTTP mode, KoboldCPP STT

Register via /custom-endpoints add (capability: transcription). Use /help transcription in Discord for a step-by-step setup guide.
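To illustrate what "the standard OpenAI /v1/audio/transcriptions endpoint shape" means in practice, here is a minimal standard-library sketch of the request: a multipart/form-data POST with a `file` part and a `model` field. The helper names are hypothetical, and the `whisper-1` default is OpenAI's model name used as a placeholder; the reference server may expect or ignore a different value.

```python
import urllib.request
import uuid


def build_multipart(fields: dict, filename: str, audio: bytes,
                    boundary: str = "") -> tuple:
    """Encode text fields plus one `file` part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode(),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        audio,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def transcription_request(base_url: str, filename: str, audio: bytes,
                          model: str = "whisper-1") -> urllib.request.Request:
    """Build the POST request; base_url is expected to end in /v1."""
    body, content_type = build_multipart({"model": model}, filename, audio)
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request; OpenAI-style servers normally answer with JSON containing a `text` field.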

TomoriBot can route her built-in web tools through your own self-hosted infrastructure to avoid public API limits and improve parsing quality.

| Sidecar | Tool | Purpose | Guide |
| --- | --- | --- | --- |
| SearXNG | web_search | Privacy-respecting metasearch engine proxy to avoid rate limits | Setup Guide |
| Crawl4AI | fetch_url | Browser-rendered markdown extraction for JS-heavy sites | Setup Guide |
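For a quick check that a SearXNG sidecar can serve machine-readable results, its search route can be queried directly. The sketch below assumes the instance has the `json` output format enabled in its settings (SearXNG rejects `format=json` otherwise) and that it listens on the base URL you pass in; how TomoriBot itself calls SearXNG is not shown here, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def searxng_search_url(base_url: str, query: str, page: int = 1) -> str:
    """Build a SearXNG search URL that requests JSON output."""
    params = urllib.parse.urlencode({"q": query, "format": "json", "pageno": page})
    return f"{base_url.rstrip('/')}/search?{params}"


def search(base_url: str, query: str, timeout: float = 10.0) -> list:
    """Return the raw result entries; raises on network or HTTP errors."""
    with urllib.request.urlopen(searxng_search_url(base_url, query), timeout=timeout) as resp:
        return json.load(resp).get("results", [])
```

Each entry returned by `search(...)` is a dictionary that typically carries `title`, `url`, and `content` keys.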

Instead of starting sidecar services manually, use bun launch, which starts the requested sidecars, waits for them to be ready, then launches the bot in watch mode automatically:

```sh
# Bot only, identical to bun run dev
bun launch
# With SearXNG and Crawl4AI Docker sidecars
bun launch --searxng --crawl4ai
# With a local TTS server (venv must be set up first; see docs/integrations/voice/tts/)
bun launch --qwen3tts
# See all available flags
bun launch --help
```

Available flags: --searxng, --crawl4ai, --qwen3tts, --chatterbox, --irodoritts, --whisperx

Docker sidecars (--searxng, --crawl4ai) are created on first run and reused on subsequent runs. Python TTS/STT sidecars require their venv to be set up once beforehand; see the individual setup guides in docs/integrations/voice/.
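If you start sidecars by hand instead, the "wait until ready" step can be approximated by polling the service's TCP port. This is only an illustrative sketch of that idea, not how bun launch is implemented; the function name, default timeout, and polling interval are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll until a TCP connection succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An open port only shows that the process is listening; a service may still be loading a model, so an HTTP health check against the actual endpoint is a stricter test.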

Ctrl+C stops the bot and any Python sidecar processes. Docker containers are intentionally left running — stop them manually with docker stop searxng / docker stop crawl4ai when done.