02.11: Dialogue History
The actual recent message history, rendered as alternating user/model items. It sits at the bottom of the prompt, immediately above the LLM’s next response.
File: src/utils/text/context/dialogueHistory.ts:25-157
Mission
Iterate `simplifiedMessageHistory` (built by the chat pipeline’s `buildSimplifiedHistory`, which has already collapsed runs of consecutive same-author pure-text messages into single entries) and append one or more context items per message, with three orthogonal concerns interleaved:
- Role mapping: persona-authored → `model`; user impersonation flips the impersonated user → `model`; everyone else → `user`.
- Media descriptor emission: decides context budget only, that is, whether media is inside the media window, whether counted images fit `MEDIA_IMAGE_MESSAGE_LIMIT`, and whether duplicate images should be dropped. The builder records capability-neutral `mediaDescriptors` instead of deciding whether an image/video becomes a provider media part. The per-attempt resolver (`mediaResolver.ts`) later turns descriptors into final image/video parts, `{image_analysis_tool}` notices, plain blind-model notices, or `increase_media_context` hints.
- Context-note injection: if `context_note` is configured, inject `[System: ${note}]` at `context_note_depth` messages from the end of history.
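
The role-mapping rule can be sketched as below. This is a minimal illustration, not the code in `dialogueHistory.ts`: `AuthorInfo`, `ImpersonationFlags`, and `mapRole` are hypothetical names, and the sketch assumes persona-authored messages keep the `model` role while impersonation is active, which this page does not state.

```typescript
// Hypothetical shapes for illustration; the real types may differ.
type Role = "user" | "model";

interface AuthorInfo {
  authorId: string;
  isPersonaAuthored: boolean;
}

interface ImpersonationFlags {
  isUserImpersonation: boolean;
  impersonatedUserId?: string;
}

function mapRole(author: AuthorInfo, flags: ImpersonationFlags): Role {
  // Persona-authored messages map to the model role.
  if (author.isPersonaAuthored) {
    return "model";
  }
  // Under user impersonation, the impersonated user also maps to model.
  if (flags.isUserImpersonation && author.authorId === flags.impersonatedUserId) {
    return "model";
  }
  // Everyone else is a user.
  return "user";
}
```
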
The signature is substantial; see `dialogueHistory.ts:25-44`. Notable parameters:
- `contextItems: StructuredContextItem[]`: the in-progress list (mutated in place; this is the only contributor that doesn’t return new items)
- `simplifiedMessageHistory: SimplifiedMessageForContext[]`
- `tomoriConfig` (provides `message_fetch_limit`, `humanizer_degree`, `context_note`, `context_note_depth`)
- `tomoriState` (provides `context_note` and `context_note_depth`; media capability is intentionally not read here)
- `mediaContextWindow: number | undefined`: override; falls back to `memoryGuard.getMediaWindow()`
- `isUserImpersonation`, `impersonatedUserId`
- `messageIdMap`: compact ID ↔ Discord message ID, populated as media hints emit
- `uncensorInputOptions`, `convertMentions`
Output
`Promise<void>`: appends to `contextItems` in place. Each appended item is tagged `DIALOGUE_HISTORY` (the default in `pushDialogueHistoryContextItem`) or `CONTEXT_NOTE_INJECTION` for the injected note.
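
To illustrate the in-place append and tag defaulting described above, here is a small sketch. The item shape and the helper `pushItemSketch` are hypothetical stand-ins; only the two tag names and the "default tag" behavior come from this page, and the real `pushDialogueHistoryContextItem` signature may differ.

```typescript
// Hypothetical item shape; the real StructuredContextItem is richer.
type SketchTag = "DIALOGUE_HISTORY" | "CONTEXT_NOTE_INJECTION";

interface SketchContextItem {
  role: "user" | "model";
  text: string;
  tag: SketchTag;
}

// Appends to the shared list in place and returns nothing,
// defaulting the tag to DIALOGUE_HISTORY when none is given.
function pushItemSketch(
  contextItems: SketchContextItem[],
  role: "user" | "model",
  text: string,
  tag: SketchTag = "DIALOGUE_HISTORY",
): void {
  contextItems.push({ role, text, tag });
}

const sketchItems: SketchContextItem[] = [];
pushItemSketch(sketchItems, "user", "Alice: hello");
pushItemSketch(sketchItems, "user", "[System: stay in character]", "CONTEXT_NOTE_INJECTION");
```
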
Side effects
Per message:
- Role mapping computed from author type and impersonation flags.
- Persona user block handling happens before this stage in `buildSimplifiedHistory`: active `persona_user_blocks` with `block_type = 'block'` replace that user’s recent live dialogue turns/direct media with a single `[System: ...]` block notice for the active persona (consecutive messages from the same blocked user collapse into one notice) and suppress reply annotations quoting those messages. The blocked user is still excluded from tool-intent scanning, voice transcription, and sprite priming (`visibleRawMessages`). Memories, reminders, documents, and generic references from other users are not redacted.
- Media-window calculation: `effectiveMediaWindow = min(requested, message_fetch_limit)`; `mediaWindowCutoff = totalMessages - effectiveMediaWindow`.
- Media descriptor emission:
  - Filters to the `MEDIA_IMAGE_MESSAGE_LIMIT` (env, default 3) most recent messages that carry “counted” images (non-emoji, non-sticker).
  - Drops duplicate images that recur in a later in-window message (`duplicateImageLastIndex` lookup).
  - Adds per-message `mediaDescriptors` carrying URI, MIME type, registered media ID, media-window membership, and `extendBy` for older out-of-window media. Custom emoji images are not descriptors; they remain text via emoji normalization.
- Budget-only media notes:
  - Rendered-image-limit skips emit a capability-neutral `[System: N image(s) omitted due to rendered-image limit]` note.
  - Duplicate images are dropped with logging only.
  - Capability-specific notices are not emitted here. `resolveMediaForModel` emits `{image_analysis_tool}` guidance, plain blind-model notices, and `increase_media_context` hints per generation attempt.
  - Intentional deviation from the pre-refactor behavior: out-of-window media now produces a plain “outside the current media context window and cannot be viewed” notice even for blind models. Blind notices still include the `media_N` handle so non-vision tools that accept media references (for example img2img/inpaint/image-to-video) can target the source message. Previously that blind + out-of-window combination emitted no line, which hid the fact that media existed at all.
- Media attribution hint: when media is referenced from a reply or forward, `[System: These images (Media IDs: X, Y) were sent by Z]`.
- Text part assembly: `${authorName}: ${content}` prefix, mention conversion, humanizer transform (`model` items at HEAVY+), uncensor input transforms.
- Copied-render webhook reconstruction: webhook usernames formatted as `SourcePersona (target)` are attributed to `SourcePersona` for role mapping, self-reply ownership, and reply routing, while `authorName` preserves the full visible label. The resulting dialogue line stays reversible as `SourcePersona (target): content`, so the model can repeat the same syntax.
- Sender metadata: dialogue items carry hidden `sender` metadata (`personaName` when available, otherwise `authorName`) so strict-chat media relocation can attribute model-role images without parsing the visible `{Name}:` text prefix.
- Detached system parts: system hints that should not be merged with the message text are split into a separate `user`-role item via `pushDialogueHistoryContextItem`.
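
The media-window arithmetic and the rendered-image limit from the list above can be sketched as follows. The function names and the `SketchMessage` shape are hypothetical, and the sketch assumes that a message at index `i` is inside the window when `i >= mediaWindowCutoff`; the page gives the formulas but not that comparison.

```typescript
// Hypothetical message shape for illustration only.
interface SketchMessage {
  countedImageUris: string[]; // non-emoji, non-sticker images
}

interface MediaWindow {
  effectiveMediaWindow: number;
  mediaWindowCutoff: number;
}

function computeMediaWindow(
  requestedWindow: number,
  messageFetchLimit: number,
  totalMessages: number,
): MediaWindow {
  // The requested window can never exceed the fetch limit.
  const effectiveMediaWindow = Math.min(requestedWindow, messageFetchLimit);
  const mediaWindowCutoff = totalMessages - effectiveMediaWindow;
  return { effectiveMediaWindow, mediaWindowCutoff };
}

// Returns the indices of the most recent in-window messages that carry
// counted images, up to imageMessageLimit of them.
function selectImageMessages(
  messages: SketchMessage[],
  mediaWindowCutoff: number,
  imageMessageLimit: number,
): Set<number> {
  const selected = new Set<number>();
  // Assumption: indices at or above the cutoff are inside the window.
  const firstInWindow = Math.max(0, mediaWindowCutoff);
  for (let i = messages.length - 1; i >= firstInWindow && selected.size < imageMessageLimit; i--) {
    const message = messages[i];
    if (message !== undefined && message.countedImageUris.length > 0) {
      selected.add(i);
    }
  }
  return selected;
}
```

In-window messages with counted images that fall outside the returned set are the ones that would receive the rendered-image-limit note instead of descriptors.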
Context-note injection (once per build):
- If `context_note` is set, computes `contextNoteTargetIndex = max(0, totalMessages - context_note_depth)`.
- Injects `[System: ${context_note}]` as a `user`-role item with tag `CONTEXT_NOTE_INJECTION` at the target index (or at the end if the history is shorter than the depth).
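
The index arithmetic above can be sketched as follows. The item shape and `buildWithContextNote` are hypothetical, every history line is given the `user` role for brevity, and the end-of-history fallback is modelled as "the target index was never reached while iterating", which is an assumption about the real control flow.

```typescript
// Hypothetical item shape for illustration only.
interface NoteSketchItem {
  role: "user" | "model";
  text: string;
  tag: "DIALOGUE_HISTORY" | "CONTEXT_NOTE_INJECTION";
}

function buildWithContextNote(
  lines: string[],
  contextNote: string | undefined,
  contextNoteDepth: number,
): NoteSketchItem[] {
  const items: NoteSketchItem[] = [];
  const totalMessages = lines.length;
  const contextNoteTargetIndex = Math.max(0, totalMessages - contextNoteDepth);
  let injected = false;

  for (let index = 0; index < totalMessages; index++) {
    // Inject the note just before the message at the target index.
    if (contextNote && index === contextNoteTargetIndex) {
      items.push({ role: "user", text: `[System: ${contextNote}]`, tag: "CONTEXT_NOTE_INJECTION" });
      injected = true;
    }
    const text = lines[index];
    if (text !== undefined) {
      items.push({ role: "user", text, tag: "DIALOGUE_HISTORY" });
    }
  }

  // Fallback (assumed): if the target was never reached, append at the end.
  if (contextNote && !injected) {
    items.push({ role: "user", text: `[System: ${contextNote}]`, tag: "CONTEXT_NOTE_INJECTION" });
  }
  return items;
}
```

Either path pushes the note at most once, which matches the "once per build" wording.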
Invariants
After this stage runs:
- For each message, exactly one or two items are appended:
  - One combined item when the role is `user` and media/text both exist
  - Two separated items (`user` system parts + `role` real parts) when the role is `model` and detached system parts exist
- Counted images respect `MEDIA_IMAGE_MESSAGE_LIMIT`: older counted images get a budget note instead of descriptors.
- Duplicate images don’t appear twice; the last occurrence in the window is the one that renders.
- `mediaDescriptors` remain capability-neutral. They are not provider-ready image/video parts until `resolveMediaForModel(...)` runs for a concrete attempt model.
- Context note injects exactly once per build, either at the depth target or at the very end if history is shorter.
- `messageIdMap.register(...)` is called for every media reference the LLM might ask about after resolution (so `increase_media_context`, `image_analysis_tool`, and media-reference tools have stable IDs).
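
The "last occurrence wins" invariant for duplicate images can be illustrated with a last-index lookup. This is a sketch under assumptions: images are identified by URI, the helper names are hypothetical, and only the `duplicateImageLastIndex` idea is taken from this page.

```typescript
// Maps each image URI to the index of the last in-window message carrying it.
function buildDuplicateImageLastIndex(
  imageUrisPerMessage: string[][],
  mediaWindowCutoff: number,
): Map<string, number> {
  const lastIndex = new Map<string, number>();
  imageUrisPerMessage.forEach((uris, index) => {
    if (index < mediaWindowCutoff) {
      return; // out-of-window messages never become the rendered copy
    }
    for (const uri of uris) {
      lastIndex.set(uri, index);
    }
  });
  return lastIndex;
}

// Drops every image that recurs in a later in-window message.
function keepLastOccurrences(
  imageUrisPerMessage: string[][],
  mediaWindowCutoff: number,
): string[][] {
  const lastIndex = buildDuplicateImageLastIndex(imageUrisPerMessage, mediaWindowCutoff);
  return imageUrisPerMessage.map((uris, index) =>
    uris.filter((uri) => {
      const last = lastIndex.get(uri);
      return last === undefined || last <= index;
    }),
  );
}
```
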
Configuration
| Env var | Default | Purpose |
|---|---|---|
| `MEDIA_IMAGE_MESSAGE_LIMIT` | 3 | Max in-window messages that render counted images |
| `PERSONA_USER_BLOCK_CACHE_TTL_SECONDS` | 60 | TTL for active persona user block lookups |

| Source | Field | Effect |
|---|---|---|
| `tomoriConfig` | `message_fetch_limit` | Caps media window |
| `tomoriConfig` | `humanizer_degree` | HEAVY+ applies humanizer to `model` items |
| `tomoriConfig` | `context_note`, `context_note_depth` | Context-note injection |
| `tomoriConfig` | `uncensor_unicode_space_enabled`, `uncensor_sanitize_enabled` | Drives uncensor transforms |
| `tomoriState` | `context_note`, `context_note_depth` | Persona-level override of `tomoriConfig` values |
| Memory pressure | `memoryGuard.getMediaWindow()` | Dynamic media-window shrink under load |
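
A rough sketch of how these settings might be resolved. Both helpers are hypothetical: the page only states the env default of 3 and that `tomoriState` overrides `tomoriConfig`, so the handling of invalid env values and the use of `null`/`undefined` to mean "no override" are assumptions.

```typescript
// Reads MEDIA_IMAGE_MESSAGE_LIMIT from an env-like record, defaulting to 3.
// Assumption: unparsable or negative values fall back to the default.
function readImageMessageLimit(env: Record<string, string | undefined>): number {
  const parsed = Number.parseInt(env["MEDIA_IMAGE_MESSAGE_LIMIT"] ?? "", 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : 3;
}

// Hypothetical field shapes; only the field names come from the tables above.
interface ContextNoteConfig {
  context_note: string | null;
  context_note_depth: number;
}

interface ContextNoteState {
  context_note?: string | null;
  context_note_depth?: number | null;
}

// Persona-level state wins when it provides a value; otherwise use config.
function resolveContextNote(
  config: ContextNoteConfig,
  state: ContextNoteState,
): { note: string | null; depth: number } {
  return {
    note: state.context_note ?? config.context_note,
    depth: state.context_note_depth ?? config.context_note_depth,
  };
}
```
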
Extension points
This is the biggest contributor by complexity, with multiple plugin-relevant seams:

| Surface | Plugin-relevance |
|---|---|
| Media-window policy (`effectiveMediaWindow`, `maxExtendBy`) | Coupled to `memoryGuard` + `message_fetch_limit`. A plugin adding “always include all media” or “per-channel media budget” would extend the window calculation. |
| Media descriptor shape | New media kinds should add descriptor fields here and resolution behavior in `mediaResolver.ts`. |
| `MEDIA_IMAGE_MESSAGE_LIMIT` policy | Hardcoded env var; a plugin adding “per-persona media limit” would extend the resolution. |
| Image-attribution hint format | Hardcoded English; localization would extend. → plugin plan candidate. |
| Humanizer + uncensor integration | Shared with sample dialogues (stage 10). |
| Context-note injection depth | Tomori-state can override `tomoriConfig`; a plugin adding “per-channel context note” would extend the resolution. → plugin plan candidate. |
| `pushDialogueHistoryContextItem` (this is the only contributor that uses it) | The push utility wraps tag defaulting; if a plugin emits its own dialogue items it would use the same helper to stay consistent. |
A plugin extension for “alternate history rendering” (e.g. collapse-tool-calls, anonymize-user-content, summarize-old-messages) would most naturally take the form of a per-message pre-processor running before the role mapping + text/media emission. → plugin plan candidate.
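
One possible shape for such a per-message pre-processor is sketched below. No plugin API of this kind exists in the source described here: `HistoryPreProcessor`, `applyPreProcessors`, `anonymizeUsers`, and the message shape are all hypothetical, shown only to make the "runs before role mapping" idea concrete.

```typescript
// Hypothetical message shape; the real SimplifiedMessageForContext differs.
interface SketchHistoryMessage {
  authorName: string;
  content: string;
  isPersonaAuthored: boolean;
}

// A pre-processor receives one message and returns a (possibly changed) copy.
type HistoryPreProcessor = (message: SketchHistoryMessage) => SketchHistoryMessage;

// Runs every pre-processor, in order, over each message before any
// role mapping or text/media emission would happen.
function applyPreProcessors(
  history: SketchHistoryMessage[],
  processors: HistoryPreProcessor[],
): SketchHistoryMessage[] {
  return history.map((message) =>
    processors.reduce((current, processor) => processor(current), message),
  );
}

// Example: an "anonymize-user-content" style processor that hides user names.
const anonymizeUsers: HistoryPreProcessor = (message) =>
  message.isPersonaAuthored ? message : { ...message, authorName: "User" };
```

Returning copies instead of mutating keeps the original history intact for other consumers, which is a design choice of this sketch and not a documented requirement.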
Related docs
- History helpers (`history.ts`): covered in native-assembly README.
- Message-ID map: → no dedicated doc; `messageIdMap.ts` helper only
- Image-analysis tool: tool registry (→ tool-loop pipeline)
- `increase_media_context` tool: tool registry (same source)
- Memory-pressure media-window shrinking: → no dedicated doc; `src/utils/security/rateLimiter.ts` helper only
- Humanizer transform: → `src/utils/text/processors/formatters.ts` helper
- Uncensor transform: → `src/utils/text/uncensor.ts` helper