Memory Management¶
anchor provides a layered memory system that tracks conversation history, stores persistent facts, and integrates both into the context pipeline. This guide covers every component -- from the coordinator down to decay functions and garbage collection.
Architecture Overview¶
The memory system has three tiers:
- Conversation memory -- a token-budgeted window of recent turns (
SlidingWindowMemoryorSummaryBufferMemory). - Persistent facts -- long-term entries in a
MemoryEntryStore, managed throughMemoryManager. - Graph memory -- an optional entity-relationship graph (
SimpleGraphMemory) for structured knowledge.
MemoryManager sits on top, producing ContextItem objects for the pipeline.
MemoryManager
|-- conversation: SlidingWindowMemory | SummaryBufferMemory
|-- persistent_store: MemoryEntryStore (optional)
`-- get_context_items() --> ContextItem[]
MemoryManager -- The Coordinator¶
MemoryManager is the single entry point for adding messages and retrieving context.
from anchor import MemoryManager
manager = MemoryManager(conversation_tokens=4096)
manager.add_user_message("What is context engineering?")
manager.add_assistant_message("It is the practice of assembling...")
items = manager.get_context_items()
print(len(items)) # 2 context items from conversation history
Pass a custom conversation backend to switch strategies:
from anchor import MemoryManager, SummaryBufferMemory
def compact(turns):
return "; ".join(t.content[:40] for t in turns)
summary_mem = SummaryBufferMemory(max_tokens=2048, compact_fn=compact)
manager = MemoryManager(conversation_memory=summary_mem)
Tip
When conversation_memory is provided, the conversation_tokens, tokenizer, and on_evict parameters are ignored.
SlidingWindowMemory¶
A rolling window of conversation turns within a token budget. When a new turn would exceed the limit, oldest turns are evicted first.
from anchor import SlidingWindowMemory
window = SlidingWindowMemory(max_tokens=1024)
window.add_turn("user", "Hello!")
window.add_turn("assistant", "Hi there!")
print(window.total_tokens) # current token usage
Eviction Callback¶
React to evicted turns (log them, archive them):
def on_evict(evicted_turns):
for t in evicted_turns:
print(f"Evicted: {t.role}: {t.content[:50]}")
window = SlidingWindowMemory(max_tokens=512, on_evict=on_evict)
Custom Eviction Policy and Recency Scorer¶
from anchor import (
SlidingWindowMemory, ImportanceEviction, ExponentialRecencyScorer,
)
window = SlidingWindowMemory(
max_tokens=1024,
eviction_policy=ImportanceEviction(
importance_fn=lambda turn: len(turn.content) / 100.0
),
recency_scorer=ExponentialRecencyScorer(decay_rate=3.0),
)
SummaryBufferMemory¶
A two-tier memory: recent turns verbatim plus a running summary of evicted content. Exactly one of compact_fn or progressive_compact_fn must be provided.
Simple Compaction¶
from anchor import SummaryBufferMemory
def compact(turns):
return "Summary: " + "; ".join(t.content[:60] for t in turns)
mem = SummaryBufferMemory(max_tokens=1024, compact_fn=compact)
mem.add_message("user", "Tell me about Python.")
mem.add_message("assistant", "Python is a high-level language...")
print(mem.summary) # None until eviction occurs
Progressive Compaction¶
progressive_compact_fn also receives the previous summary, enabling incremental refinement:
from anchor import SummaryBufferMemory
def progressive(turns, previous_summary):
new_content = "; ".join(t.content[:40] for t in turns)
if previous_summary:
return f"{previous_summary} | {new_content}"
return new_content
mem = SummaryBufferMemory(max_tokens=512, progressive_compact_fn=progressive)
Note
If the compaction function raises an exception, the raw turn content is used as a fallback so evicted data is never lost.
SimpleGraphMemory¶
An in-memory directed graph for entity-relationship tracking without an external graph database.
from anchor import SimpleGraphMemory
graph = SimpleGraphMemory()
graph.add_entity("alice", {"type": "person", "role": "engineer"})
graph.add_entity("project-x", {"type": "project"})
graph.add_relationship("alice", "works_on", "project-x")
graph.add_entity("bob", {"type": "person"})
graph.add_relationship("bob", "works_on", "project-x")
graph.link_memory("alice", "mem-001")
graph.link_memory("project-x", "mem-002")
# BFS traversal: find related entities within 2 hops
related = graph.get_related_entities("alice", max_depth=2)
print(related) # ['project-x', 'bob']
# Collect memory IDs for entity and its neighborhood
memory_ids = graph.get_related_memory_ids("alice", max_depth=2)
print(memory_ids) # ['mem-001', 'mem-002']
Tip
add_relationship auto-creates nodes for source and target if they do not already exist. link_memory requires the entity to exist.
Eviction Policies¶
Three built-in policies implement the EvictionPolicy protocol.
FIFOEviction -- evicts oldest turns first (matches the default):
from anchor import FIFOEviction, SlidingWindowMemory
window = SlidingWindowMemory(max_tokens=1024, eviction_policy=FIFOEviction())
ImportanceEviction -- evicts lowest-scoring turns first:
from anchor import ImportanceEviction
policy = ImportanceEviction(importance_fn=lambda turn: len(turn.content) / 100.0)
PairedEviction -- evicts user+assistant pairs together so no orphaned questions remain:
from anchor import PairedEviction, SlidingWindowMemory
window = SlidingWindowMemory(max_tokens=1024, eviction_policy=PairedEviction())
Decay Functions and Recency Scorers¶
Decay functions compute retention scores for persistent memory entries. The garbage collector uses them to prune forgotten memories.
EbbinghausDecay -- forgetting curve R = e^(-t/S) where strength grows with access count:
from anchor import EbbinghausDecay
decay = EbbinghausDecay(base_strength=1.0, reinforcement_factor=0.5)
LinearDecay -- linear interpolation from 1.0 to 0.0. At half_life_hours the retention is 0.5:
Recency scorers control position-based scores within a sliding window:
from anchor import ExponentialRecencyScorer, LinearRecencyScorer
exp_scorer = ExponentialRecencyScorer(decay_rate=2.0)
print(exp_scorer.score(0, 10)) # ~0.0 (oldest)
print(exp_scorer.score(9, 10)) # 1.0 (newest)
lin_scorer = LinearRecencyScorer(min_score=0.5)
print(lin_scorer.score(0, 10)) # 0.5
print(lin_scorer.score(9, 10)) # 1.0
SimilarityConsolidator¶
Determines whether new memory entries should be added, merged, or skipped based on content hashing and cosine similarity.
import math
from anchor import SimilarityConsolidator, MemoryEntry
def embed_fn(text: str) -> list[float]:
return [math.sin(i + len(text)) for i in range(8)]
consolidator = SimilarityConsolidator(
embed_fn=embed_fn,
similarity_threshold=0.85,
)
existing = [MemoryEntry(content="User prefers dark mode")]
new_entries = [
MemoryEntry(content="User likes dark themes"), # similar -> UPDATE
MemoryEntry(content="User prefers dark mode"), # exact dup -> NONE
MemoryEntry(content="User works at Acme Corp"), # new -> ADD
]
results = consolidator.consolidate(new_entries, existing)
for action, entry in results:
print(action, entry.content if entry else "(skipped)")
Warning
The library never calls an LLM. You provide the embed_fn which can use any embedding provider (OpenAI, Cohere, local models, etc.).
MemoryGarbageCollector and GCStats¶
Prunes expired and decayed entries from a GarbageCollectableStore.
from anchor import MemoryGarbageCollector, EbbinghausDecay, InMemoryEntryStore
store = InMemoryEntryStore()
gc = MemoryGarbageCollector(store=store, decay=EbbinghausDecay())
stats = gc.collect(retention_threshold=0.1)
print(stats) # GCStats(applied: expired_pruned=0, decayed_pruned=0, ...)
# Dry run -- identify what would be pruned without deleting
stats = gc.collect(retention_threshold=0.1, dry_run=True)
print(stats.total_pruned)
The collector works in two phases:
- Expiry phase -- removes entries whose
is_expiredproperty isTrue. - Decay phase -- computes retention and removes entries below the threshold.
Both phases fire MemoryCallback hooks for observability.
Persistent Facts¶
MemoryManager manages persistent facts through a MemoryEntryStore. Content-hash deduplication prevents storing the same content twice.
from anchor import MemoryManager, MemoryType, InMemoryEntryStore
store = InMemoryEntryStore()
manager = MemoryManager(persistent_store=store)
# Add facts (returns existing entry if duplicate)
entry = manager.add_fact(
"User prefers dark mode",
tags=["preference"],
memory_type=MemoryType.SEMANTIC,
)
# Search, update, delete
results = manager.get_relevant_facts("theme preference", top_k=3)
manager.update_fact(entry.id, "User prefers dark mode with blue accent")
manager.delete_fact(entry.id)
all_facts = manager.get_all_facts()
Warning
Calling add_fact without a configured persistent_store raises StorageError. Always pass a MemoryEntryStore to the constructor.
MemoryCallback and CallbackExtractor¶
MemoryCallback is a protocol for observing memory lifecycle events. Implement only the methods you care about:
class MyCallback:
def on_eviction(self, turns, remaining_tokens):
print(f"Evicted {len(turns)} turns, {remaining_tokens} tokens left")
def on_decay_prune(self, pruned_entries, threshold):
print(f"Pruned {len(pruned_entries)} entries below {threshold}")
CallbackExtractor delegates memory extraction to a user function:
from anchor import CallbackExtractor, MemoryType
from anchor.models.memory import ConversationTurn
def my_extractor(turns):
return [
{"content": "User prefers dark mode", "tags": ["preference"]},
{"content": "User's name is Alice", "memory_type": "semantic"},
]
extractor = CallbackExtractor(extract_fn=my_extractor)
turns = [ConversationTurn(role="user", content="I prefer dark mode")]
entries = extractor.extract(turns)
print(entries[0].content) # "User prefers dark mode"
Complete Example¶
A memory-aware pipeline with summarization, persistent facts, and garbage collection:
from anchor import (
ContextPipeline, MemoryManager, SummaryBufferMemory,
MemoryGarbageCollector, EbbinghausDecay, InMemoryEntryStore,
QueryBundle,
)
store = InMemoryEntryStore()
def compact(turns):
return "Previously: " + "; ".join(t.content[:50] for t in turns)
conversation = SummaryBufferMemory(max_tokens=2048, compact_fn=compact)
manager = MemoryManager(conversation_memory=conversation, persistent_store=store)
manager.add_fact("User prefers Python over JavaScript", tags=["preference"])
manager.add_user_message("How do I sort a list in Python?")
manager.add_assistant_message("Use sorted() or list.sort().")
pipeline = (
ContextPipeline(max_tokens=4096)
.with_memory(manager)
.add_system_prompt("You are a helpful coding assistant.")
)
result = pipeline.build(QueryBundle(query_str="Show me an example"))
print(f"Context items: {len(result.window.items)}")
gc = MemoryGarbageCollector(store=store, decay=EbbinghausDecay())
stats = gc.collect(retention_threshold=0.1)
print(stats)
Next Steps¶
- Memory API Reference -- constructor signatures, parameter tables, and method details for every memory class.
- Pipeline Guide -- how memory integrates with the context pipeline.