
Provider Pipeline

Streams an LLM response from a provider API to one or more Discord messages, handling all text processing, delivery timing, and stop signals between the provider’s HTTP stream and Discord’s message API.

The pipeline is entered from tool-loop stage 01 — streamOnce via LLMProvider.streamToDiscord(). That facade method constructs a StreamAdapter and a StreamConfig, then hands both to StreamOrchestrator.streamToDiscord(), which drives the remaining stages.

  1. README.md: this file (pipeline overview and entry-point wiring)
  2. 01-context-assembly.md: adapter translates StructuredContextItem[] into provider-native format
  3. 02-raw-chunk-generation.md: HTTP stream opens; RawStreamChunk objects are yielded
  4. 03-chunk-normalization.md: processChunk converts RawStreamChunk → ProcessedChunk
  5. 04-orchestrator-state-machine.md: executeStream drives the for-await loop and routes chunk types
  6. 05-buffer-management.md: StreamBufferFlusher accumulates text and flushes at boundaries
  7. 06-segment-normalization.md: StreamSegmentProcessor cleans text and resolves Discord-specific concerns
  8. 07-discord-delivery.md: StreamMessageDelivery + StreamUiUpdater send messages to Discord

    tool-loop pipeline ─► LLMProvider.streamToDiscord()
            │   (provider facade: builds StreamConfig + StreamAdapter)
            ▼
    [Stage 1] startStream: Context assembly
            │   contextItems → provider-native contents + system instruction
            │   tools, function history, stop strings, config options
            ▼
    [Stage 2] startStream: Raw chunk generation
            │   HTTP stream open → yield RawStreamChunk per token delivery
            │   speaker guard holdback, text deduplication (Google)
            ▼
    [Stage 3] processChunk: Chunk normalization
            │   RawStreamChunk → ProcessedChunk { type, content, functionCall,
            │                                     error, thoughts, metadata }
            ▼
    [Stage 4] executeStream: Orchestrator state machine
            │   for-await loop ──► checks stop/abort/timeout per chunk
            │
       ┌────┴─────────────────┬──────────────────────────────┐
       │ type="text"          │ type="function_call"         │ type="done"
       ▼                      ▼                              ▼
    [Stage 5]                 flush buffer → return          record terminal
    processTextChunk          { status: "function_call" }    metadata; continue
    Buffer management
       │
       ▼
    [Stage 6]
    sendBufferSegment
    Segment normalization
    (clean, mentions, guard)
       │
       ▼
    [Stage 7]
    sendSegment →
    sendSinglePayload
    Discord delivery
    (webhook / channel,
     typing simulation)
       │
       ▼
    StreamResult
    { status, accumulatedText,
      thoughtLog, detailsContent }
       │
       ▼
    tool-loop pipeline

| File | Stage | Code symbol | Owns |
|---|---|---|---|
| 01-context-assembly.md | 1 | BaseStreamAdapter.startStream (setup) | Provider-native request construction |
| 02-raw-chunk-generation.md | 2 | BaseStreamAdapter.startStream (generator) | HTTP streaming + provider-specific pre-processing |
| 03-chunk-normalization.md | 3 | BaseStreamAdapter.processChunk | RawStreamChunk → ProcessedChunk conversion |
| 04-orchestrator-state-machine.md | 4 | StreamOrchestrator.executeStream | Chunk routing, stop signals, timeout, StreamResult assembly |
| 05-buffer-management.md | 5 | StreamBufferFlusher.processTextChunk | Text accumulation, semantic block detection, boundary flush |
| 06-segment-normalization.md | 6 | StreamSegmentProcessor.sendBufferSegment | LLM output cleaning, mention resolution, speaker guard, prefill |
| 07-discord-delivery.md | 7 | StreamMessageDelivery.sendSegment + StreamUiUpdater.sendSinglePayload | Discord API calls, typing simulation, webhook routing |

LLMProvider.streamToDiscord() is defined on each provider class (e.g., GoogleProvider, OpenrouterProvider). The method constructs a provider-specific StreamAdapter but immediately delegates to the universal StreamOrchestrator. Stages 1–3 are therefore provider-owned (each adapter handles its own API format); stages 4–7 are orchestrator-owned and provider-agnostic.
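
The ownership split can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: only the names StreamAdapter, RawStreamChunk, ProcessedChunk, startStream, and processChunk come from this document, while the simplified type shapes, the `runStream` function, the `"completed"` status, and the fake adapter are hypothetical.

```typescript
// Hypothetical, simplified shapes; the real types carry more fields.
type RawStreamChunk = { text?: string; call?: string; finished?: boolean };

type ProcessedChunk =
  | { type: "text"; content: string }
  | { type: "function_call"; functionCall: string }
  | { type: "done" };

// Stages 1-3: each provider supplies its own adapter.
interface StreamAdapter {
  startStream(): AsyncIterable<RawStreamChunk>;
  processChunk(raw: RawStreamChunk): ProcessedChunk;
}

// "completed" is an assumed status name used only in this sketch.
type SketchResult = {
  status: "completed" | "function_call";
  accumulatedText: string;
};

// Stages 4-7 collapsed into one provider-agnostic loop that only sees ProcessedChunk.
async function runStream(adapter: StreamAdapter): Promise<SketchResult> {
  let accumulatedText = "";
  for await (const raw of adapter.startStream()) {
    const chunk = adapter.processChunk(raw);
    if (chunk.type === "text") {
      accumulatedText += chunk.content; // real pipeline: buffer, normalize, deliver
    } else if (chunk.type === "function_call") {
      return { status: "function_call", accumulatedText };
    }
    // type "done": record terminal metadata and keep draining the stream
  }
  return { status: "completed", accumulatedText };
}

// A fake adapter standing in for a real provider API client.
const fakeAdapter: StreamAdapter = {
  async *startStream() {
    yield { text: "Hello, " };
    yield { text: "world" };
    yield { finished: true };
  },
  processChunk(raw) {
    if (raw.call !== undefined) return { type: "function_call", functionCall: raw.call };
    if (raw.text !== undefined) return { type: "text", content: raw.text };
    return { type: "done" };
  },
};
```

Because the loop depends only on the adapter interface, swapping `fakeAdapter` for another provider's adapter would leave `runStream` unchanged, which is the point of keeping stages 4–7 provider-agnostic.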

The stop registry (src/utils/discord/stream/stopRequests.ts) is a per-channel map checked at every iteration of the stage 4 orchestrator loop. Two stop modes exist:

  • User stop (status: "stopped_by_user") — /stop command; pending buffer is flushed before returning.
  • Follow-up interrupt (status: "follow_up_interrupt") — a new user message arrived; buffer is discarded and the pipeline exits immediately to allow the chat pipeline to re-run.
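
The two stop modes can be illustrated with a small per-channel registry. This is a hedged sketch, not the contents of stopRequests.ts: the function names `requestStop`, `consumeStop`, and `checkStop`, the clear-on-read behavior, and the `StopOutcome` shape are all assumptions made for illustration. Only the two status strings come from the text above.

```typescript
// Hypothetical sketch of a per-channel stop registry.
type StopMode = "stopped_by_user" | "follow_up_interrupt";

const stopRequests = new Map<string, StopMode>();

function requestStop(channelId: string, mode: StopMode): void {
  stopRequests.set(channelId, mode);
}

// Assumption: reading a request also clears it, so a stale entry
// cannot stop the next stream in the same channel.
function consumeStop(channelId: string): StopMode | undefined {
  const mode = stopRequests.get(channelId);
  stopRequests.delete(channelId);
  return mode;
}

type StopOutcome = { status: StopMode; delivered: string[] };

// Called once per loop iteration with the text buffered so far.
function checkStop(channelId: string, pendingBuffer: string): StopOutcome | undefined {
  const mode = consumeStop(channelId);
  if (mode === undefined) return undefined; // no stop requested; keep streaming
  if (mode === "stopped_by_user") {
    // User stop: flush what is already buffered before returning.
    return { status: mode, delivered: pendingBuffer ? [pendingBuffer] : [] };
  }
  // Follow-up interrupt: discard the buffer and exit immediately.
  return { status: mode, delivered: [] };
}
```

In this sketch, the only difference between the modes is what happens to `pendingBuffer`: a user stop delivers it, a follow-up interrupt drops it.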

The delivery mode is controlled by HumanizerDegree (from TomoriState.config):

| Degree | Mode | Behavior |
|---|---|---|
| NONE (0) | Aggregated | Text is queued until a tool/final boundary, then sent in one batch |
| LOW/MEDIUM (1–2) | Streaming | Each segment is sent as it flushes; typing simulation runs between messages |
| HEAVY (3) | Streaming + humanize | Like degree 1–2 but humanizeString() applies additional noise |
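
The degree table can be read as a small mapping from degree to delivery behavior. The sketch below assumes the numeric values shown in the table. The `as const` object, the `DeliveryMode` shape, and the `deliveryModeFor` function are hypothetical names introduced for illustration, not the project's declarations.

```typescript
// Numeric values follow the degree table; this declaration is an assumption.
const HumanizerDegree = { NONE: 0, LOW: 1, MEDIUM: 2, HEAVY: 3 } as const;
type HumanizerDegree = (typeof HumanizerDegree)[keyof typeof HumanizerDegree];

// Hypothetical summary of how a degree affects delivery.
type DeliveryMode = {
  mode: "aggregated" | "streaming";
  humanize: boolean; // whether humanizeString() noise would be applied
};

function deliveryModeFor(degree: HumanizerDegree): DeliveryMode {
  if (degree === HumanizerDegree.NONE) {
    // Queue text until a tool/final boundary, then send in one batch.
    return { mode: "aggregated", humanize: false };
  }
  // LOW, MEDIUM, and HEAVY all stream; only HEAVY adds humanizing noise.
  return { mode: "streaming", humanize: degree === HumanizerDegree.HEAVY };
}
```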

When StreamContext.webhook and StreamContext.personaUsername are set (alter persona mode), stage 7 routes all Discord sends through sendWebhookMessageWithIdentity() so the message appears with the persona’s name and avatar. The first message in an alter response that also has a replyToMessage context gets a separate reply-notice via sendWebhookReplyNotice() before the main content send.
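
The routing decision described above can be sketched as a pure function that plans the sends for one outgoing message. The `SketchContext` interface, the `planSends` function, and the step labels are hypothetical. Placing `replyToMessage` on the same context object is also an assumption, as is the plain channel send for the non-alter case. The real stage 7 calls sendWebhookReplyNotice() and sendWebhookMessageWithIdentity() rather than returning labels.

```typescript
// Hypothetical, trimmed-down context; the real StreamContext has more fields.
interface SketchContext {
  webhook?: { id: string };
  personaUsername?: string;
  replyToMessage?: { id: string };
}

type SendStep = "webhook_reply_notice" | "webhook_message" | "channel_message";

// Returns the ordered send steps for one outgoing message.
function planSends(ctx: SketchContext, isFirstMessage: boolean): SendStep[] {
  const alterMode = ctx.webhook !== undefined && ctx.personaUsername !== undefined;
  if (!alterMode) return ["channel_message"];

  const steps: SendStep[] = [];
  if (isFirstMessage && ctx.replyToMessage !== undefined) {
    steps.push("webhook_reply_notice"); // stands in for sendWebhookReplyNotice()
  }
  steps.push("webhook_message"); // stands in for sendWebhookMessageWithIdentity()
  return steps;
}
```

Keeping the decision separate from the Discord calls makes the ordering rule (reply notice first, and only on the first message) easy to check in isolation.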

<think>…</think> blocks in the streamed text are silently captured into state.thinkBlockBuffer (stage 5) rather than sent to Discord. At stream end, stage 4 assembles these into StreamResult.thoughtLog for the thought-log embed that stage 04 of the tool-loop pipeline emits to a dedicated channel.
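
Capturing think blocks from a stream is harder than it looks because a tag can be split across chunks. The following is one possible approach, assuming a simple hold-back of any partial tag at the end of the buffer. The `ThinkState` shape, the `feed` and `partialTagLength` functions, and the use of a string array for `thinkBlockBuffer` are illustrative assumptions, not the stage 5 implementation.

```typescript
const OPEN = "<think>";
const CLOSE = "</think>";

// Hypothetical state; only the name thinkBlockBuffer comes from this document.
interface ThinkState {
  insideThink: boolean;
  pending: string; // unprocessed tail that may hold a partial tag
  visibleText: string; // text that would continue on to Discord
  thinkBlockBuffer: string[]; // completed think blocks
  current: string; // content of the currently open think block
}

// Length of the longest suffix of `text` that is a proper prefix of `tag`.
function partialTagLength(text: string, tag: string): number {
  const max = Math.min(text.length, tag.length - 1);
  for (let n = max; n > 0; n--) {
    if (text.endsWith(tag.slice(0, n))) return n;
  }
  return 0;
}

function feed(state: ThinkState, chunk: string): void {
  state.pending += chunk;
  // Consume every complete tag currently in the buffer.
  for (;;) {
    const tag = state.insideThink ? CLOSE : OPEN;
    const at = state.pending.indexOf(tag);
    if (at === -1) break;
    const before = state.pending.slice(0, at);
    if (state.insideThink) {
      state.thinkBlockBuffer.push(state.current + before);
      state.current = "";
    } else {
      state.visibleText += before;
    }
    state.pending = state.pending.slice(at + tag.length);
    state.insideThink = !state.insideThink;
  }
  // No complete tag left: release everything except a possible partial tag.
  const tag = state.insideThink ? CLOSE : OPEN;
  const keep = partialTagLength(state.pending, tag);
  const ready = state.pending.slice(0, state.pending.length - keep);
  if (state.insideThink) state.current += ready;
  else state.visibleText += ready;
  state.pending = state.pending.slice(state.pending.length - keep);
}
```

Feeding the chunks `"Hi <thi"`, `"nk>plan</th"`, and `"ink>there"` in order leaves `visibleText` as `"Hi there"` and `thinkBlockBuffer` as `["plan"]`, even though both tags were split across chunk boundaries.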

<details>…</details> blocks are captured into state.detailsBlockBuffer (stage 5) and routed to StreamResult.detailsContent. Stage 04 of the tool-loop pipeline writes detailsContent to the short-term memory cache separately from accumulatedText.