STM 02: Summary Upgrade
LLM-initiated replacement of the crude conversation in the STM cache with a compact, LLM-authored summary.
Files:
- `UpdateShortTermMemoryTool`: `src/tools/functionCalls/updateShortTermMemoryTool.ts`
- `updateShortTermMemorySummary`: `src/utils/cache/shortTermMemoryCache.ts:518-574`
Mission
When the LLM calls the `update_short_term_memory` tool during the tool-loop,
it provides a summary string that distills the current conversation into a
compact, model-friendly representation. `UpdateShortTermMemoryTool.execute()`
validates the input, extracts channel/server/persona metadata from the tool
context, and calls `updateShortTermMemorySummary()`, which writes the summary
string into both the user-scoped and server-scoped STM cache entries for this
channel.
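
The flow described above (validate, truncate, write) can be sketched as follows. This is a minimal illustration, not the code in `updateShortTermMemoryTool.ts`: the `ToolContextSketch` and `ToolResultSketch` shapes, the `executeSketch` and `writeSummaryStub` names, the result messages, and the rule that a whitespace-only summary is rejected are all assumptions made for the example.

```typescript
// Hypothetical, simplified shapes. The real ToolContext and ToolResult
// types in the codebase likely carry more fields than shown here.
interface ToolResultSketch {
  success: boolean;
  message: string;
}

interface ToolContextSketch {
  userId: string;
  channelId: string;
  guildId: string;
  personaId: string;
  disableShortTermMemoryUpdate?: boolean;
}

// Documented default for MAX_SUMMARY_LENGTH.
const MAX_SUMMARY_LENGTH = 1500;

// Stand-in for updateShortTermMemorySummary(): it records the call
// instead of touching a real cache, so the sketch stays self-contained.
const recordedWrites: Array<{ key: string; summary: string }> = [];

function writeSummaryStub(ctx: ToolContextSketch, summary: string): void {
  recordedWrites.push({
    key: `${ctx.userId}:${ctx.channelId}:${ctx.personaId}`,
    summary,
  });
}

function executeSketch(
  args: { summary?: unknown },
  ctx: ToolContextSketch,
): ToolResultSketch {
  // Per-turn deduplication guard.
  if (ctx.disableShortTermMemoryUpdate) {
    return { success: false, message: "blocked: summary already updated this turn" };
  }
  // Input validation (assumed rule: must be a non-blank string).
  if (typeof args.summary !== "string" || args.summary.trim().length === 0) {
    return { success: false, message: "invalid summary" };
  }
  // Silent truncation to the configured maximum.
  const summary = args.summary.slice(0, MAX_SUMMARY_LENGTH);
  writeSummaryStub(ctx, summary);
  return { success: true, message: "summary stored" };
}
```

An over-long summary still succeeds in this sketch; only the first `MAX_SUMMARY_LENGTH` characters reach the write step, which matches the silent-truncation behavior described in this document.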
On the next turn, the context-build STM stage renders the `summary` field
instead of the raw `messages` array, reducing token cost and giving the LLM
a self-authored context rather than a verbose turn-by-turn log. Upgraded
entries also get a longer TTL (`SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS`,
default 24 h) than crude conversation entries (12 h default).
The operation is silent: no embed or user-facing message is sent.
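
The read-side preference and the two TTLs can be illustrated with a small sketch. The `StmEntrySketch` shape, the `isExpired` and `renderStm` helpers, and the newline-joined rendering of raw messages are hypothetical; only the `summary`, `messages`, and `lastUpdated` field names and the 24 h / 12 h defaults come from this document.

```typescript
// Hypothetical entry shape using the field names mentioned in this doc.
interface StmEntrySketch {
  messages: string[];
  summary?: string;
  lastUpdated: number; // epoch milliseconds
}

const HOUR_MS = 60 * 60 * 1000;
const SUMMARY_TTL_HOURS = 24; // documented default for upgraded entries
const CRUDE_TTL_HOURS = 12;   // documented default for crude entries

// An entry with a summary is held to the longer TTL.
function isExpired(entry: StmEntrySketch, now: number): boolean {
  const ttlHours = entry.summary ? SUMMARY_TTL_HOURS : CRUDE_TTL_HOURS;
  return now - entry.lastUpdated > ttlHours * HOUR_MS;
}

// Render the summary when present; otherwise fall back to raw messages.
// Joining messages with newlines is an assumption for illustration.
function renderStm(entry: StmEntrySketch): string {
  return entry.summary ? entry.summary : entry.messages.join("\n");
}
```

Under these assumptions, an entry last updated 13 hours ago is still live if it carries a summary but expired if it only holds crude messages.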
Inputs
- `args.summary: string`: the LLM-authored summary (max `MAX_SUMMARY_LENGTH` chars; default 1500; silently truncated if over the limit).
- `context: ToolContext`: provides `userId`, `channel.id`, `guildId`, `tomoriState.persona_id`, `tomoriState.persona_lineage_id`.
- `context.streamContext.explicitLongTermMemoryIntent`: if `true`, this tool is not offered (see intent gate in memory README).
- `context.streamContext.disableShortTermMemoryUpdate`: if `true`, execution is blocked (per-turn deduplication guard).
Output
`Promise<ToolResult>`: `{ success: true, message: "..." }` on success; an error
result on validation failure or blocked execution.
Side effects
- STM cache `summary` field updated: both user-scoped and server-scoped entries for this `(userId, channelId, personaId)` gain or replace their `summary` string.
- `lastUpdated` refreshed: the TTL clock restarts on the updated entries.
- `streamingContext.disableShortTermMemoryUpdate = true`: set by the tool-loop after successful execution (prevents re-calling this tool in the same turn).
- No cache invalidation: STM is in-process only; no downstream DB or TomoriState cache needs invalidating.
Invariants
After a successful execute:
- The STM cache entry for `(userId, channelId, personaId)` contains a non-empty `summary` field of at most `MAX_SUMMARY_LENGTH` characters.
- Existing `messages` in the entry are preserved: the summary is written alongside them, not instead of them, at the data layer. The context-build reader chooses summary over messages at render time.
- At most one successful execution occurs per generation turn.
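
The side effects and invariants above can be sketched as a write that touches both scoped entries, refreshes `lastUpdated`, and leaves `messages` intact. The cache `Map`, the key formats, the `writeSummaryToScopes` name, and the choice to skip missing entries are all assumptions for illustration; the actual layout in `shortTermMemoryCache.ts` may differ.

```typescript
// Hypothetical in-process cache and entry shape.
interface CachedStmEntry {
  messages: string[];
  summary?: string;
  lastUpdated: number; // epoch milliseconds
}

const stmCache = new Map<string, CachedStmEntry>();

// Hypothetical key formats for the two scopes.
function userScopedKey(userId: string, channelId: string, personaId: string): string {
  return `user:${userId}:${channelId}:${personaId}`;
}

function serverScopedKey(guildId: string, channelId: string, personaId: string): string {
  return `server:${guildId}:${channelId}:${personaId}`;
}

// Writes the summary alongside existing messages and restarts the TTL clock.
// Returns how many entries were updated.
function writeSummaryToScopes(keys: string[], summary: string, now: number): number {
  let updated = 0;
  for (const key of keys) {
    const entry = stmCache.get(key);
    if (!entry) continue; // assumption: only existing entries are upgraded
    entry.summary = summary; // gain or replace
    entry.lastUpdated = now; // TTL clock restarts
    updated += 1;
  }
  return updated;
}
```

Because the write mutates only `summary` and `lastUpdated`, the raw `messages` array survives at the data layer, and the choice between the two is left to the reader at render time.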
Guards (all block execution)
| Guard | Code location |
|---|---|
| `explicitLongTermMemoryIntent` flag is `true` | `isAvailableForContext()` at `updateShortTermMemoryTool.ts:64`; also re-checked in `execute()` at `:81` |
| `disableShortTermMemoryUpdate` flag is `true` | `isAvailableForContext()` at `:69`; also re-checked in `execute()` at `:89` |
| Provider is `"novelai"` | `isAvailableFor()` at `:48`; excluded due to token-budget constraints |
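
As a rough sketch of how the three guards in the table combine, the snippet below reuses the method names `isAvailableFor` and `isAvailableForContext`, but their signatures, the `StreamContextSketch` type, and the `shouldOfferTool` helper are assumptions rather than the tool's real interface.

```typescript
// Hypothetical subset of the stream context holding the two flags.
interface StreamContextSketch {
  explicitLongTermMemoryIntent?: boolean;
  disableShortTermMemoryUpdate?: boolean;
}

// Provider-level guard: the tool is excluded for "novelai".
function isAvailableFor(provider: string): boolean {
  return provider !== "novelai";
}

// Context-level guard: either flag being true hides the tool.
function isAvailableForContext(streamContext: StreamContextSketch): boolean {
  if (streamContext.explicitLongTermMemoryIntent === true) return false;
  if (streamContext.disableShortTermMemoryUpdate === true) return false;
  return true;
}

// The tool is offered only when every guard passes.
function shouldOfferTool(provider: string, streamContext: StreamContextSketch): boolean {
  return isAvailableFor(provider) && isAvailableForContext(streamContext);
}
```

The table notes that `execute()` re-checks both flags, so a sketch of execution would repeat the `isAvailableForContext` conditions before writing anything.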
Extension points
| Surface | Plugin-relevance |
|---|---|
| `updateShortTermMemorySummary()` | Internal: the summary write is a direct cache mutation; no plugin-relevant seam. The `summary` field replaces crude conversation globally for the channel key; there is no per-plugin namespace. |
| `MAX_SUMMARY_LENGTH` | Env-var configurable (`SHORT_TERM_MEMORY_MAX_SUMMARY_LENGTH`, default 1500). Not a plugin seam. |
| Summary TTL | Env-var configurable (`SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS`, default 24). Not a plugin seam. |
Configuration
| Source | Key / Env var | Default | Purpose |
|---|---|---|---|
| Env var | `SHORT_TERM_MEMORY_MAX_SUMMARY_LENGTH` | 1500 | Max summary length before truncation |
| Env var | `SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS` | 24 | TTL for entries that have a summary |
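
One way to read these two settings with their documented defaults is sketched below. The `readPositiveInt` and `readStmSummaryConfig` helpers are hypothetical, and falling back to the default on a missing, non-numeric, or non-positive value is an assumption about the desired behavior. The environment is passed in as a plain record so the sketch does not depend on any particular runtime.

```typescript
type EnvRecord = Record<string, string | undefined>;

// Parses a positive integer from the environment, or returns the fallback.
function readPositiveInt(env: EnvRecord, key: string, fallback: number): number {
  const raw = env[key];
  const parsed = raw === undefined ? Number.NaN : Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

interface StmSummaryConfig {
  maxSummaryLength: number;
  summaryTtlHours: number;
}

// Defaults mirror the configuration table: 1500 characters and 24 hours.
function readStmSummaryConfig(env: EnvRecord): StmSummaryConfig {
  return {
    maxSummaryLength: readPositiveInt(env, "SHORT_TERM_MEMORY_MAX_SUMMARY_LENGTH", 1500),
    summaryTtlHours: readPositiveInt(env, "SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS", 24),
  };
}
```

In a Node-style runtime the record would typically be the process environment, for example `readStmSummaryConfig(process.env)`.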
Related docs
- Intent gate that may suppress this tool: → memory pipeline README (intent detection gate)
- STM passive capture that this stage supersedes for context rendering: → `stm/01-passive-capture.md`
- Tool-loop that invokes this stage: → tool-loop Stage 02 `executeToolCall`
- Read side that prefers `summary` over `messages`: → context-build STM stage