How Adaptive Recall Works
Adaptive Recall is a patent-pending memory system built on cognitive science research and machine learning. The components described below work together to deliver retrieval that improves with use.
Adaptive Multi-Strategy Retrieval

When you query Adaptive Recall, your request is not handled by a single search method. Four retrieval strategies execute in parallel, each approaching the query from a different angle, and their results are merged, deduplicated, and reranked into a single response.

The four strategies are vector similarity (semantic meaning), temporal recency (what was stored or accessed recently), full-text keyword search (exact term matching through FTS), and knowledge graph traversal (finding memories through entity connections rather than text similarity).

Strategy activation is itself adaptive. Vector and temporal search are always active. Keyword search activates once your memory store reaches 50 entries, because full-text indexing only becomes meaningful at that scale. Graph search activates at 30 extracted entities, once the knowledge graph has enough structure to contribute useful results.

An ML model can learn to predict which strategies will perform best for each query, based on historical data. The model trains automatically once enough query logs have accumulated, and its predictions pass through the same evidence gate that governs all parameter changes in the system.

A few details of the pipeline are worth noting:

- Final reranking blends scores: 60% rerank quality, 40% original composite score
- Strategy performance is tracked per query, feeding back into the ML training pipeline
- The system logs which strategy found the top result, building a dataset of strategy effectiveness
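As a concrete sketch, the merge, dedupe, and rerank steps described above might look like the following. The function names and input shapes are illustrative assumptions; only the 60/40 blend and the top-strategy logging come from the description.

```python
def merge_and_rerank(strategy_results, rerank):
    """strategy_results: {strategy_name: [(memory_id, composite_score), ...]}
    rerank: callable returning a rerank-quality score for a memory id
    (a hypothetical stand-in for the real reranker).
    Returns memory ids ordered by blended score, plus the strategy that
    found the top result (which the system logs for the ML pipeline)."""
    best = {}  # memory_id -> (best composite score seen, contributing strategy)
    for strategy, hits in strategy_results.items():
        for mem_id, score in hits:          # deduplicate across strategies
            if mem_id not in best or score > best[mem_id][0]:
                best[mem_id] = (score, strategy)
    blended = {
        mem_id: 0.6 * rerank(mem_id) + 0.4 * composite   # 60/40 blend
        for mem_id, (composite, _) in best.items()
    }
    ranked = sorted(blended, key=blended.get, reverse=True)
    return ranked, best[ranked[0]][1]
```

The dictionary keyed by memory id is what performs deduplication: a memory surfaced by several strategies is scored once, under its best composite score.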
Cognitive Scoring

After the retrieval strategies return their results, every memory passes through a scoring layer based on ACT-R, a cognitive architecture backed by over 30 years of research into how human memory works. This is not a marketing label: ACT-R's activation equations model the actual mechanisms that govern recall in the human brain.

Each memory has a base-level activation computed from two factors: how recently it was accessed (recency) and how often it has been accessed (frequency). A memory retrieved yesterday scores differently from one retrieved a month ago, even if the cosine similarity is identical. This mirrors human memory: things you interacted with recently come to mind more easily.

Spreading activation flows through entity connections in the knowledge graph. If your query mentions "authentication" and a memory about "JWT tokens" is connected to "authentication" in the entity graph, that memory receives an activation boost even when its text does not mention your exact query terms.

Confidence weighting layers on top of activation. Every memory carries a confidence score that evolves through the consolidation process: corroborating evidence pushes it up, detected contradictions push it down. The scoring formula weights higher-confidence memories more heavily, because they have been validated by the system.

The complete scoring formula combines cosine similarity, ACT-R base-level activation, spreading activation through entities, confidence weighting, and temporal recency. This is why the same query can return different rankings over time: the system learns from usage patterns and adjusts accordingly.
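The base-level activation described above corresponds to ACT-R's standard base-level learning equation, B = ln(Σ_j t_j^(-d)), where t_j is the time since the j-th access and d is a decay parameter (conventionally 0.5). A minimal sketch follows; the blend weights in `combined_score` are placeholders, not the product's actual values.

```python
import math

def base_level_activation(access_ages, decay=0.5):
    """ACT-R base-level learning: B = ln(sum of t ** -decay) over past
    accesses, where each t is the time since that access. Both recency
    (small t) and frequency (more terms) raise activation. The 0.5
    default is the conventional ACT-R value; per the text, the system
    tunes this parameter itself."""
    return math.log(sum(t ** -decay for t in access_ages))

def combined_score(cosine, activation, spreading, confidence, recency,
                   weights=(0.5, 0.2, 0.1, 0.1, 0.1)):
    """Illustrative linear blend of the five factors named in the text.
    These weights are assumptions for demonstration only."""
    return sum(w * f for w, f in
               zip(weights, (cosine, activation, spreading, confidence, recency)))
```

For example, a memory accessed once yesterday (`[1.0]` days) has higher base-level activation than one accessed once a month ago (`[30.0]`), and adding more past accesses raises it further.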
Knowledge Graph

Every memory you store is automatically analyzed for entities and relationships. The system builds a knowledge graph that grows with your data, connecting concepts, people, technologies, projects, and places across your entire memory store.

Entity extraction uses a two-stage pipeline designed for both accuracy and efficiency. First, an ML model predicts which memories contain extractable entities; this fast classification step filters out memories with no entity content. Then only the filtered memories are sent to an LLM for full extraction of entity names, types, and relationships. This avoids wasting LLM calls on memories that contain no entities while maintaining high extraction quality.

Extracted entities are typed as person, technology, concept, project, or place. Relationships between entities include uses, related_to, part_of, and co_occurs. These relationship types capture the most common ways entities connect in technical and professional knowledge.

The knowledge graph is not just a visualization feature; it is an active retrieval strategy. When you query for information, the system identifies entities in your query, traverses their connections in the graph, and retrieves memories that are linked through entity relationships. This surfaces relevant information that pure text similarity would miss.

Graph data is queryable through the API. You can explore entities, see their connections, and traverse relationship hops to understand how your knowledge is structured.
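A graph-traversal retrieval step like the one described can be sketched as a bounded breadth-first search from the query's entities. The data-structure shapes here (an adjacency map and an entity-to-memory index) are assumptions for illustration, not the system's actual schema.

```python
from collections import deque

def graph_retrieve(query_entities, edges, memory_links, max_hops=2):
    """Collect memories reachable from the query's entities within
    max_hops relationship hops.
    edges: entity -> list of connected entities (adjacency map)
    memory_links: entity -> list of memory ids that mention it"""
    seen = set(query_entities)
    frontier = deque((e, 0) for e in query_entities)
    memories = set()
    while frontier:
        entity, hops = frontier.popleft()
        memories.update(memory_links.get(entity, ()))   # memories at this entity
        if hops < max_hops:
            for neighbour in edges.get(entity, ()):
                if neighbour not in seen:               # visit each entity once
                    seen.add(neighbour)
                    frontier.append((neighbour, hops + 1))
    return memories
```

With the "authentication" / "JWT tokens" example from the scoring section, a query mentioning only "authentication" still reaches memories linked to "JWT tokens" one hop away.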
Memory Lifecycle

Memories in Adaptive Recall are not static records. They progress through lifecycle stages that model how real knowledge behaves: new information needs validation, frequently accessed knowledge becomes reliable, and unused information gradually fades.

Every memory starts in the "fresh" stage. Once it has been accessed at least once through a query, it transitions to "active." If an active memory goes unaccessed for seven days, it moves to "fading." After fourteen days without access, it is archived. Archived memories remain searchable if you explicitly include them, but they do not appear in standard results.

High-confidence memories are protected from fading. When a memory's confidence score reaches 8.0 or above (on a 1-10 scale), it is considered validated knowledge and remains active regardless of access patterns. This prevents well-established facts from disappearing just because you have not queried them recently.

Background consolidation runs periodically and performs three operations. Deduplication identifies memories with overlapping content (above a configurable similarity threshold) and merges or removes duplicates. Contradiction detection finds memories with similar topics but different information and uses LLM analysis to determine whether they agree, contradict, or discuss different aspects of the same topic. Confidence evaluation adjusts confidence scores based on corroborating or contradicting evidence from other memories.

The consolidation mode itself adapts. When the system detects high curiosity (many queries returning low-quality results), consolidation shifts toward synthesis mode, focusing on building connections. When curiosity is low, it shifts toward cleanup mode, focusing on deduplication and archival.
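The stage rules above reduce to a small decision function. This sketch assumes the stage is derived from access metadata at read time; the real system may store and transition stages differently.

```python
from datetime import datetime, timedelta

def lifecycle_stage(last_access, access_count, confidence, now):
    """Stage rules from the text: fresh until first access; active once
    accessed; fading after 7 idle days; archived after 14; a confidence
    of 8.0 or above (1-10 scale) pins the memory to active."""
    if access_count == 0:
        return "fresh"
    if confidence >= 8.0:
        return "active"                      # validated knowledge never fades
    idle = now - last_access
    if idle >= timedelta(days=14):
        return "archived"
    if idle >= timedelta(days=7):
        return "fading"
    return "active"
```

Note the ordering: the confidence check comes before the idle-time checks, which is what lets a validated memory skip both fading and archival.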
Self-Improving System

Adaptive Recall does not require manual tuning. The system trains machine learning models on your usage data and tunes its own parameters, validating every change against real performance data before it takes effect.

The evidence gate is the core mechanism. When the ML pipeline proposes a parameter change (for example, adjusting the ACT-R decay rate), the system does not apply it immediately. Instead, it simulates the proposed change against your actual query history: each past query is replayed with the proposed parameter, and the results are compared to what the current parameters produced. A sign test at the p < 0.05 significance level determines whether the change produces statistically better results, and only then is the parameter updated.

This prevents the system from drifting on noise. Random fluctuations in query patterns cannot trigger parameter changes because they will not pass the significance threshold; real improvements in retrieval quality will.

The ML training pipeline auto-triggers when data thresholds are met. The entity extraction model trains at 50 extracted entities. Strategy weight optimization and activation parameter tuning trigger at 200 query log entries. The query expansion model trains at 100 entries. Training runs in the background and does not affect query latency.

Self-verification is a separate process that monitors retrieval quality over time. The system periodically replays a sample of past queries and measures how much the current results overlap with the original results. If overlap drops below 70%, it raises an alert. This catches quality degradation that might otherwise go unnoticed, whether from parameter drift, data corruption, or changes in usage patterns.

Curiosity scoring tracks knowledge gaps. The system analyzes recent query results and measures how well each query was served. Queries that consistently return low-similarity results indicate areas where the memory store lacks relevant information. These gaps are surfaced through the status endpoint, giving you visibility into what your system does not know.
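The evidence gate's sign test can be sketched directly. Replaying the query log yields a per-query quality delta (proposed minus current); the gate accepts the proposed parameter only if the number of wins is unlikely under a coin-flip null, using a one-sided binomial sign test with ties dropped. The function names and the delta representation are illustrative.

```python
from math import comb

def sign_test_p(wins, losses):
    """One-sided sign test: probability of at least `wins` successes in
    wins + losses fair coin flips (ties already excluded)."""
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

def evidence_gate(deltas, alpha=0.05):
    """deltas: per-query quality difference from replaying the log with
    the proposed parameter. Accept the change only if improvements are
    statistically significant at the alpha level."""
    wins = sum(1 for d in deltas if d > 0)
    losses = sum(1 for d in deltas if d < 0)      # zeros are ties, dropped
    return wins + losses > 0 and sign_test_p(wins, losses) < alpha
```

Ten straight wins gives p = 1/1024 and passes the gate, while a 2-2 split gives p = 11/16 and is rejected, which is exactly how random fluctuation is filtered out.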
API and Integration

Adaptive Recall supports two protocols: MCP (Model Context Protocol) for direct integration with Claude Code and other MCP-compatible CLI tools, and standard HTTP REST for any application that can make HTTP requests. Both protocols expose the same eight tools with identical behavior.

The eight tools cover everything you need for memory management:

- store creates a new memory with automatic embedding generation and entity extraction
- recall searches memories with multi-strategy retrieval and cognitive scoring
- update modifies fields on an existing memory, re-embedding if the content changes
- forget removes a memory by ID or by finding the best match for a query
- graph explores the knowledge graph, traversing entity relationships
- status returns system health, memory counts, confidence distribution, and knowledge gaps
- snapshot returns a formatted overview of stored memories, organized by type
- feedback sends feedback directly to the Adaptive Recall developers

Authentication uses Bearer tokens. Each account receives an API key on signup, and all requests require the key in the Authorization header. Rate limits are per-account and vary by plan tier.

Memories are typed into categories: general_knowledge for facts and observations, user_knowledge for information about people and preferences, callable_scripts for tool and script references, work_project for project tracking, and cross_reference for pointers to external information. Types affect lifecycle behavior and retrieval scoring.

The OpenAPI specification is available at /openapi.json for clients that support automatic API discovery.
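A REST call with Bearer authentication might be constructed as follows. The host, the /recall path, and the JSON body shape are placeholders (the text documents only /openapi.json and the tool names); only the Authorization header format comes from the description above.

```python
import json
import urllib.request

API_KEY = "ar_example_key"              # placeholder; real keys are issued on signup
BASE_URL = "https://api.example.com"    # placeholder host, not documented here

def build_recall_request(query, limit=5):
    """Construct (without sending) an HTTP call to the recall tool.
    Endpoint path and body fields are assumptions for illustration."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/recall",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",   # required on every request
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Consult /openapi.json on your deployment for the authoritative paths and request schemas.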