How does the auto-compact mechanism trigger when context grows near the limit, fork a summary query, and write a compact boundary marker that the next turn can resume from?


Auto-compact trigger and resume flow

Overview

Auto-compact fires inside the main turn loop when token usage approaches the context ceiling. It forks a summary request and replaces the history with a compact boundary marker followed by the summary, so the next turn resumes from the summary rather than from the full transcript.
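The trigger and resume halves of this flow can be sketched as follows. This is a minimal illustration, not the actual implementation: the Message shape, the shouldAutoCompact helper, the 0.9 threshold ratio, and messagesAfterBoundary (a stand-in for getMessagesAfterCompactBoundary) are all assumptions made for the example.

```typescript
// Hypothetical message shape; the real message types are not shown in this document.
type Message = {
  kind: "user" | "assistant" | "summary" | "compact_boundary";
  text: string;
};

// Assumed policy: compact once usage crosses a fixed fraction of the context window.
// The real threshold logic is not described here; 0.9 is an illustrative default.
function shouldAutoCompact(
  usedTokens: number,
  contextLimit: number,
  thresholdRatio: number = 0.9
): boolean {
  return usedTokens >= contextLimit * thresholdRatio;
}

// Illustrative stand-in for getMessagesAfterCompactBoundary: return only the
// messages that follow the most recent boundary marker, or everything if
// no boundary has been written yet.
function messagesAfterBoundary(messages: Message[]): Message[] {
  let lastBoundary = -1;
  messages.forEach((m, i) => {
    if (m.kind === "compact_boundary") lastBoundary = i;
  });
  return lastBoundary === -1 ? messages : messages.slice(lastBoundary + 1);
}
```

Calling the slicing helper at the top of every iteration means the loop never needs a separate "resumed" flag: once a boundary exists, everything before it simply stops being visible to later turns.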

Steps

  1. Each iteration of the main loop projects the history window and tracks token usage via autoCompactTracking. It runs snip and microcompact first, then evaluates whether a full compact is needed.
  2. When the threshold logic decides to compact (or the user invokes /compact), the flow enters compactConversation, which first counts pre-compact tokens and runs PreCompact hooks (with trigger: 'auto' on the automatic path).
  3. A dedicated summary user message is built from getCompactPrompt(customInstructions) and dispatched via streamCompactSummary, which optionally forks an agent that reuses the main conversation's prompt cache prefix to keep the summary query cheap.
  4. If the summary request itself returns PROMPT_TOO_LONG, the loop drops the oldest API-round groups with truncateHeadForPTLRetry and retries, up to MAX_PTL_RETRIES times, before giving up.
  5. On success, the pre-compact read-file cache is snapshotted, context.readFileState and the nested memory paths are cleared, and the post-compact attachments (files, agent, plan mode, skills, deferred-tools delta, MCP instructions delta) are rebuilt so the next turn has fresh context.
  6. The function returns a CompactionResult containing a boundaryMarker (a SystemMessage) and summaryMessages. Together these form the compact boundary that the next turn resumes from.
  7. Back in the loop, getMessagesAfterCompactBoundary(messages) is called at the top of each iteration, so subsequent turns read only the messages after the boundary marker and effectively resume from the summary.
  8. On the /compact command path, call in commands/compact/compact.ts first tries trySessionMemoryCompaction (the session-memory route) and falls through to compactConversation only when session memory is absent or empty.
  9. The session-memory fast path checks shouldUseSessionMemoryCompaction, waits for any in-progress extraction, and reads the stored session memory file. If content is present it returns its own CompactionResult; otherwise it returns null so the legacy path runs.
  10. Partial compaction (/compact with a pivot) uses partialCompactConversation, which slices messages around the pivot, strips stale compact boundaries in the 'up_to' direction, and runs the same PTL-retry and attachment-rebuild logic.
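The PROMPT_TOO_LONG retry in step 4 can be sketched as a bounded loop. This is an illustration under stated assumptions, not the real code: the Round type, dropOldestRound (a stand-in for truncateHeadForPTLRetry), the value 3 for MAX_PTL_RETRIES, and signalling the condition by throwing an Error with the message "PROMPT_TOO_LONG" are all hypothetical, and the sketch is synchronous where the real request is presumably asynchronous.

```typescript
// One API-round group of messages (hypothetical shape).
type Round = string[];

// Assumed value; this document does not state the real constant.
const MAX_PTL_RETRIES = 3;

// Stand-in for truncateHeadForPTLRetry: drop the oldest round, or report
// that nothing more can be dropped.
function dropOldestRound(rounds: Round[]): Round[] | null {
  return rounds.length > 1 ? rounds.slice(1) : null;
}

// Assumption: a too-long prompt surfaces as an Error with this exact message.
function isPromptTooLong(err: unknown): boolean {
  return err instanceof Error && err.message === "PROMPT_TOO_LONG";
}

// Try to summarize; on PROMPT_TOO_LONG, truncate from the head and retry,
// giving up after MAX_PTL_RETRIES retries or when nothing is left to drop.
function summarizeWithRetry(
  rounds: Round[],
  summarize: (rounds: Round[]) => string
): string {
  let current = rounds;
  for (let attempt = 0; ; attempt++) {
    try {
      return summarize(current);
    } catch (err) {
      if (!isPromptTooLong(err)) throw err; // unrelated failure: do not retry
      if (attempt >= MAX_PTL_RETRIES) throw err; // retry budget exhausted
      const truncated = dropOldestRound(current);
      if (truncated === null) throw err; // cannot shrink any further
      current = truncated;
    }
  }
}
```

Dropping from the head rather than the tail keeps the most recent rounds in the summary input, at the cost of losing the oldest context when the transcript is too large to summarize in one request.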

Decisions

No design decisions are present in the whitelist for this flow. Because the whitelist contains no decision: tokens, the reasoning behind the threshold choice, cache-prefix sharing, and the PTL-retry count cannot be cited here.