STM 02: Summary Upgrade

LLM-initiated replacement of the crude conversation in the STM cache with a compact, LLM-authored summary.

Files:

  • UpdateShortTermMemoryTool: src/tools/functionCalls/updateShortTermMemoryTool.ts
  • updateShortTermMemorySummary: src/utils/cache/shortTermMemoryCache.ts:518-574

When the LLM calls the update_short_term_memory tool during the tool-loop, it provides a summary string that distills the current conversation into a compact, model-friendly representation. UpdateShortTermMemoryTool.execute() validates the input, extracts channel/server/persona metadata from the tool context, and calls updateShortTermMemorySummary(), which writes the summary string into both the user-scoped and server-scoped STM cache entries for this channel.
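
The dual-scope write described above can be sketched roughly as follows. This is a minimal illustration, not the code in shortTermMemoryCache.ts: the StmEntry shape, the stmKey() layout, the numeric personaId, and the choice to skip scopes with no existing entry are all assumptions made for the example.

```typescript
// Hypothetical entry shape; the real type lives in shortTermMemoryCache.ts.
interface StmEntry {
  messages: string[];
  summary?: string;
  lastUpdated: number; // epoch milliseconds
}

type Scope = "user" | "server";

// Stand-in for the in-process STM cache.
const stmCache = new Map<string, StmEntry>();

// Assumed key layout, for illustration only.
function stmKey(scope: Scope, ownerId: string, channelId: string, personaId: number): string {
  return `${scope}:${ownerId}:${channelId}:${personaId}`;
}

// Writes the summary into the user-scoped and server-scoped entries for one channel.
// Returns how many entries were updated.
function updateSummarySketch(
  userId: string,
  serverId: string,
  channelId: string,
  personaId: number,
  summary: string,
  now: number = Date.now(),
): number {
  const keys = [
    stmKey("user", userId, channelId, personaId),
    stmKey("server", serverId, channelId, personaId),
  ];
  let updated = 0;
  for (const key of keys) {
    const entry = stmCache.get(key);
    if (!entry) continue; // assumption: a scope with no entry is skipped, not created
    entry.summary = summary; // messages are left untouched
    entry.lastUpdated = now; // restarts the TTL clock
    updated++;
  }
  return updated;
}
```

Both entries keep their messages array; only summary and lastUpdated change.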

On the next turn, the context-build STM stage renders the summary field instead of the raw messages array, reducing token cost and giving the LLM a self-authored context rather than a verbose turn-by-turn log. Upgraded entries also enjoy a longer TTL (SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS, default 24 h) compared to the crude conversation TTL (12 h default).
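
A rough sketch of the read side, assuming a hypothetical StmEntry shape and helper names (renderStm, isExpired) that are not taken from the codebase. The TTL constants mirror the defaults quoted above; how the real reader formats raw messages is not specified here, so the newline join is only a placeholder.

```typescript
// Hypothetical entry shape, for illustration only.
interface StmEntry {
  messages: string[];
  summary?: string;
  lastUpdated: number; // epoch milliseconds
}

const HOUR_MS = 60 * 60 * 1000;
const SUMMARY_TTL_HOURS = 24; // default for upgraded entries
const CRUDE_TTL_HOURS = 12; // default for crude conversation entries

// Entries that carry a summary live longer than crude ones.
function isExpired(entry: StmEntry, now: number): boolean {
  const ttlHours = entry.summary ? SUMMARY_TTL_HOURS : CRUDE_TTL_HOURS;
  return now - entry.lastUpdated > ttlHours * HOUR_MS;
}

// At render time the summary wins over the raw messages array.
function renderStm(entry: StmEntry): string {
  return entry.summary ? entry.summary : entry.messages.join("\n");
}
```

With these defaults, an entry last updated 18 hours ago is still rendered if it has a summary and is treated as expired if it does not.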

Silent operation — no embed or user-facing message is sent.

  • args.summary: string — the LLM-authored summary (max MAX_SUMMARY_LENGTH chars; default 1 500; truncated silently if over limit).
  • context: ToolContext — provides userId, channel.id, guildId, tomoriState.persona_id, tomoriState.persona_lineage_id.
  • context.streamContext.explicitLongTermMemoryIntent — if true, this tool is not offered (see intent gate in memory README).
  • context.streamContext.disableShortTermMemoryUpdate — if true, execution is blocked (per-turn deduplication guard).
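
The handling of args.summary can be sketched as below. This is an illustration under assumptions: the validateSummary name, the Validation type, the trimming step, and the rejection of empty strings are hypothetical; only the silent truncation at MAX_SUMMARY_LENGTH comes from the description above.

```typescript
const MAX_SUMMARY_LENGTH = 1500; // documented default

type Validation =
  | { ok: true; summary: string }
  | { ok: false; error: string };

// Validates the raw tool argument and silently truncates over-long summaries.
function validateSummary(raw: unknown, maxLength: number = MAX_SUMMARY_LENGTH): Validation {
  if (typeof raw !== "string") {
    return { ok: false, error: "summary must be a string" };
  }
  const trimmed = raw.trim(); // assumption: surrounding whitespace is dropped
  if (trimmed.length === 0) {
    return { ok: false, error: "summary must not be empty" }; // assumption
  }
  // No error is raised for over-long input; the tail is simply cut off.
  return { ok: true, summary: trimmed.slice(0, maxLength) };
}
```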

Promise<ToolResult>: { success: true, message: "..." } on success; an error result on validation failure or blocked execution.

  • STM cache summary field updated — both user-scoped and server-scoped entries for this (userId, channelId, personaId) gain or replace their summary string.
  • lastUpdated refreshed — the TTL clock restarts on the updated entries.
  • streamingContext.disableShortTermMemoryUpdate = true — set by the tool-loop after successful execution (prevents re-calling this tool in the same turn).
  • No cache invalidation — STM is in-process only; no downstream DB or TomoriState cache needs invalidating.
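
The per-turn deduplication flag can be illustrated with a small sketch. The runStmUpdateOnce helper and the trimmed-down StreamContext and ToolResult shapes are hypothetical, and the real execute() is asynchronous; the sketch is synchronous to keep it short.

```typescript
// Trimmed-down, hypothetical shapes for illustration.
interface StreamContext {
  disableShortTermMemoryUpdate: boolean;
}

interface ToolResult {
  success: boolean;
  message: string;
}

// Runs the STM update at most once per turn: a successful run flips the flag,
// and any later call in the same turn is blocked before executing.
function runStmUpdateOnce(ctx: StreamContext, execute: () => ToolResult): ToolResult {
  if (ctx.disableShortTermMemoryUpdate) {
    return { success: false, message: "short-term memory already updated this turn" };
  }
  const result = execute();
  if (result.success) {
    ctx.disableShortTermMemoryUpdate = true; // only a success consumes the turn's single update
  }
  return result;
}
```

A failed attempt leaves the flag unset, so the model can retry within the same turn.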

After a successful execute:

  • The STM cache entry for (userId, channelId, personaId) contains a non-empty summary field of at most MAX_SUMMARY_LENGTH characters.
  • Existing messages in the entry are preserved: at the data layer the summary is written alongside them, not in place of them. The context-build reader chooses summary over messages at render time.
  • At most one successful execution occurs per generation turn.

| Guard | Code location |
| --- | --- |
| explicitLongTermMemoryIntent flag is true | isAvailableForContext() at updateShortTermMemoryTool.ts:64; also re-checked in execute() at :81 |
| disableShortTermMemoryUpdate flag is true | isAvailableForContext() at :69; also re-checked in execute() at :89 |
| Provider is "novelai" | isAvailableFor() at :48; excluded due to token-budget constraints |
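
The three guards can be read as two availability filters. The sketch below uses free functions with assumed signatures; in the codebase these are methods on the tool class, and their real parameters may differ.

```typescript
// Hypothetical, trimmed-down context shape.
interface StreamContext {
  explicitLongTermMemoryIntent?: boolean;
  disableShortTermMemoryUpdate?: boolean;
}

// Provider-level filter: the tool is never offered to the "novelai" provider.
function isAvailableFor(provider: string): boolean {
  return provider !== "novelai";
}

// Turn-level filter: hidden when long-term memory intent is explicit,
// or when the STM summary was already updated in this turn.
function isAvailableForContext(ctx: StreamContext): boolean {
  return !ctx.explicitLongTermMemoryIntent && !ctx.disableShortTermMemoryUpdate;
}

// The tool is offered only when both filters pass.
function isOffered(provider: string, ctx: StreamContext): boolean {
  return isAvailableFor(provider) && isAvailableForContext(ctx);
}
```

Because execute() re-checks both flags, a call that slips past the filters is still blocked at execution time.
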

| Surface | Plugin-relevance |
| --- | --- |
| updateShortTermMemorySummary() | Internal: summary write is a direct cache mutation; no plugin-relevant seam. The summary field replaces crude conversation globally for the channel key; there is no per-plugin namespace. |
| MAX_SUMMARY_LENGTH | Env-var configurable (SHORT_TERM_MEMORY_MAX_SUMMARY_LENGTH, default 1 500). Not a plugin seam. |
| Summary TTL | Env-var configurable (SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS, default 24). Not a plugin seam. |

| Source | Key / Env var | Default | Purpose |
| --- | --- | --- | --- |
| Env var | SHORT_TERM_MEMORY_MAX_SUMMARY_LENGTH | 1500 | Max summary length before truncation |
| Env var | SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS | 24 | TTL for entries that have a summary |
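
One way the two settings could be read is sketched below. The helper names, the positive-integer parsing, and the fallback to defaults on invalid values are assumptions for illustration; the environment is passed in as a plain record so the sketch does not depend on a particular runtime.

```typescript
type Env = Record<string, string | undefined>;

// Reads a positive integer from the environment, falling back to the default
// when the variable is unset or not a valid positive number (assumed behaviour).
function readPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined) return fallback;
  const parsed = Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Collects both documented settings with their documented defaults.
function loadStmSummaryConfig(env: Env): { maxSummaryLength: number; summaryTtlHours: number } {
  return {
    maxSummaryLength: readPositiveInt(env, "SHORT_TERM_MEMORY_MAX_SUMMARY_LENGTH", 1500),
    summaryTtlHours: readPositiveInt(env, "SHORT_TERM_MEMORY_SUMMARY_TTL_HOURS", 24),
  };
}
```

In a Node-style runtime the record would typically be process.env.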