Transcription Integration
TomoriBot treats speech-to-text as the transcription custom endpoint capability. When an active transcription endpoint exists, audio attachments are transcribed in the background and the transcripts are added to the conversation context.
Visible transcript posting is separate. /speech transcripts only controls whether voice-message transcripts are posted in chat; it does not enable or disable background STT.
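The separation between background STT and visible posting can be sketched as a small decision function. This is only an illustration of the rules described above, not TomoriBot's actual code: the names TranscriptionState, has_active_endpoint, post_transcripts, and plan_audio_handling are hypothetical, and the sketch assumes a transcript can only be posted if one was produced.

```python
from dataclasses import dataclass


@dataclass
class TranscriptionState:
    # Hypothetical fields, named for illustration only.
    has_active_endpoint: bool  # an active transcription endpoint is selected
    post_transcripts: bool     # the /speech transcripts setting


def plan_audio_handling(state: TranscriptionState) -> dict:
    """Return what happens to an incoming voice message under the rules above."""
    # Background STT depends only on the endpoint, never on /speech transcripts.
    transcribe = state.has_active_endpoint
    return {
        "transcribe_in_background": transcribe,
        "add_to_context": transcribe,
        # Assumption: posting needs a transcript, so it also needs an endpoint.
        "post_in_chat": transcribe and state.post_transcripts,
    }
```

With an active endpoint and transcript posting turned off, the sketch still transcribes and adds the result to context; it only skips the visible chat post.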
Quick Flow
- Start the WhisperX reference server from servers/stt/.
- Register it with /provider custom-endpoint add using capability transcription and api style openai-compatible-transcription.
- Select it with /model transcription.
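For orientation, the openai-compatible-transcription api style suggests a request shaped like OpenAI's audio transcription API: a multipart POST with a file field and a model field. The sketch below builds such a request with the Python standard library. Everything specific in it is an assumption rather than something documented here: the /v1/audio/transcriptions path, the localhost URL and port, the model name, and the helper name build_transcription_request. Check the reference server in servers/stt/ for the route it actually serves.

```python
import urllib.request
import uuid


def build_transcription_request(
    base_url: str, audio: bytes, filename: str, model: str
) -> urllib.request.Request:
    """Build (but do not send) a multipart transcription request."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Text field: which model the server should use (assumed field name).
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n".encode(),
        # File field: the raw audio bytes.
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode(),
        audio,
        # Closing boundary ends the multipart body.
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        # Assumed route, following the OpenAI audio transcription API.
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


if __name__ == "__main__":
    # Hypothetical local server address and model name.
    req = build_transcription_request(
        "http://localhost:8000", b"\x00\x01", "voice.ogg", "whisper-1"
    )
    print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. In the OpenAI API the JSON response carries the transcript in a text field; whether the reference server matches that is likewise an assumption.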
ElevenLabs users should use /speech elevenlabs; it registers transcription alongside speech.