How to Schedule Background Memory Consolidation

Background consolidation runs on a recurring schedule to keep your memory store lean without requiring manual intervention. Setting it up requires choosing the right frequency for your ingestion rate, configuring resource limits so consolidation does not compete with live retrieval, handling concurrency so reads and writes do not conflict, and adding monitoring so you know when consolidation runs successfully and when it needs attention.

Before You Start

You should already have a working consolidation pipeline that can cluster, merge, and resolve contradictions in your memory store. If you have not built one yet, start with the consolidation pipeline guide. Scheduling adds automation on top of an existing pipeline, so the underlying merge logic needs to be reliable before you run it unattended.

If you are using Adaptive Recall, the reflect tool can be called on a schedule through any job runner. The tool handles clustering, merging, and contradiction resolution internally, so scheduling it is as simple as setting up a periodic API call.

Step-by-Step Implementation

Step 1: Choose the consolidation frequency.
The right frequency depends on how quickly your memory store accumulates new entries and how volatile your domain is. Low-volume systems that add fewer than 50 memories per day can consolidate weekly. Medium-volume systems adding 50 to 500 memories per day should consolidate every two to three days. High-volume systems adding more than 500 memories per day benefit from nightly consolidation. The goal is to run consolidation frequently enough that redundancy never builds up to the point where it noticeably affects retrieval quality, but not so frequently that you waste compute on stores with little new content to consolidate.
Step 2: Configure resource limits.
Consolidation can be compute-intensive because it involves pairwise comparisons within clusters, LLM calls for contradiction detection, and embedding regeneration for merged memories. Set limits to keep each run bounded. Process memories in batches of 500 to 1,000 rather than loading the entire store at once. Set a maximum runtime so a single consolidation run cannot consume resources indefinitely. Limit the number of LLM calls per run if contradiction detection uses an LLM, because this is typically the most expensive operation in the pipeline.
CONSOLIDATION_CONFIG = {
    'batch_size': 500,
    'max_runtime_seconds': 3600,
    'max_llm_calls': 200,
    'max_merges_per_run': 100,
    'similarity_threshold': 0.75,
    'entity_overlap_min': 2
}
Step 3: Handle concurrency with live retrieval.
Consolidation modifies memories in the store while retrieval operations may be reading those same memories. Without concurrency handling, a retrieval call could read a memory that is being merged, returning partial or inconsistent data. The simplest approach is snapshot isolation: at the start of a consolidation run, take a snapshot of the memory IDs to process and work on that fixed set. Merges are applied atomically at the end of each cluster's processing, so the active store transitions directly from the old state to the new state with no intermediate inconsistency.
def run_consolidation_batch(memory_store, config):
    # Snapshot: get IDs at the start and work on this fixed set.
    memory_ids = memory_store.list_ids(
        limit=config['batch_size'],
        order_by='last_consolidated_at',
        ascending=True
    )
    memories = [memory_store.get(mid) for mid in memory_ids]
    clusters = cluster_memories(memories)

    merges_done = 0
    for cluster in clusters:
        if merges_done >= config['max_merges_per_run']:
            break
        merged = process_cluster(cluster, config)
        if merged:
            # Atomic swap: insert the merged memory, delete the sources.
            memory_store.atomic_merge(
                insert=merged,
                delete=[m['id'] for m in cluster]
            )
            merges_done += 1
    return merges_done
Step 4: Build the scheduler.
Use whatever scheduling infrastructure your application already has. A cron job is the simplest option for self-hosted systems. Cloud platforms offer managed schedulers like AWS EventBridge, Google Cloud Scheduler, or Azure Timer Triggers. Task queue systems like Celery or Temporal can schedule recurring tasks with built-in retry logic. The scheduler should trigger the consolidation function at the configured frequency and handle failures gracefully by retrying with exponential backoff.
# cron example: run consolidation every night at 2 AM UTC
# 0 2 * * * /usr/bin/python3 /app/consolidate.py

# Python scheduler example
from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()
scheduler.add_job(
    run_consolidation_batch,
    trigger='cron',
    hour=2,
    minute=0,
    args=[memory_store, CONSOLIDATION_CONFIG],
    max_instances=1,          # prevent overlapping runs
    misfire_grace_time=3600
)
scheduler.start()
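For the graceful failure handling mentioned above, a retry wrapper with exponential backoff can sit between the scheduler and the consolidation function. A minimal sketch, where the job callable, attempt count, and delay values are all illustrative:

```python
import time

def run_with_backoff(job, max_attempts=4, base_delay=60):
    """Retry a job with exponential backoff: 60s, 120s, 240s between attempts."""
    for attempt in range(max_attempts):
        try:
            return job()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the failure to alerting
            time.sleep(base_delay * (2 ** attempt))
```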
Prevent overlapping runs: If a consolidation run takes longer than the interval between runs, you get overlapping processes that can conflict with each other. Always configure your scheduler with max_instances=1 or equivalent locking to ensure only one consolidation process runs at a time.
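If your scheduler has no max_instances equivalent, an advisory file lock is one way to guarantee a single running instance: a second process that finds the lock held exits immediately. A minimal Unix-only sketch using fcntl.flock (the lock path is arbitrary):

```python
import fcntl

def acquire_run_lock(path='/tmp/consolidation.lock'):
    """Return an open lock file if no other consolidation run holds the
    lock, or None if one does. Keep the returned handle open for the
    duration of the run; closing it releases the lock."""
    f = open(path, 'w')
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None
```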
Step 5: Add monitoring and alerting.
After each consolidation run, log key metrics: how many memories were processed, how many clusters were found, how many merges were performed, how many contradictions were resolved, how long the run took, and how many LLM calls were consumed. Set up alerts for failures (the run did not complete), unusually high merge counts (which may indicate a data quality issue upstream), and retrieval quality degradation (if benchmark queries return worse results after consolidation).
import time

# logger and alert() are assumed to be provided by your application.
def log_consolidation_results(start_time, merges, contradictions,
                              memories_processed, llm_calls):
    duration = time.time() - start_time
    metrics = {
        'timestamp': time.time(),
        'duration_seconds': duration,
        'memories_processed': memories_processed,
        'merges_performed': merges,
        'contradictions_resolved': contradictions,
        'llm_calls_used': llm_calls
    }
    logger.info('consolidation_complete', extra=metrics)
    if merges > memories_processed * 0.5:
        alert('High merge ratio detected, check data quality')

Incremental vs Full Consolidation

Full consolidation processes every memory in the store, which is thorough but expensive. Incremental consolidation only processes memories that were created or modified since the last run. For most applications, incremental consolidation is sufficient and much faster. Track a last_consolidated_at timestamp on each memory and process only memories where this timestamp is older than the last run. Run full consolidation periodically (monthly or quarterly) to catch cross-cluster relationships that incremental runs might miss.
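The timestamp-based selection described above can be sketched in memory, assuming each memory record carries a last_consolidated_at field (None meaning never consolidated). The helper name and record shape are illustrative:

```python
def select_incremental(memories, last_run_ts, batch_size=500):
    """Pick memories whose last_consolidated_at predates the previous run.
    Never-consolidated entries (None) are always due and sort first."""
    due = [m for m in memories
           if m['last_consolidated_at'] is None
           or m['last_consolidated_at'] < last_run_ts]
    # Oldest first, never-consolidated entries first of all.
    due.sort(key=lambda m: m['last_consolidated_at'] or 0)
    return due[:batch_size]
```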

Adaptive Recall's Reflect Tool

In Adaptive Recall, you schedule consolidation by calling the reflect tool on a recurring basis. The tool handles clustering, merging, contradiction resolution, and metadata updates internally. Pass a scope parameter to control whether the run is incremental or full. The status tool reports consolidation metrics including the last run time, memories processed, and merges performed, so you can monitor results without building custom logging.
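Wiring the reflect tool into any of the schedulers above then reduces to a small wrapper. The sketch below is illustrative only: the client object and the exact scope values are assumptions, so check the Adaptive Recall API reference for the real names and signatures.

```python
def scheduled_reflect(client, full_run=False):
    """One scheduled consolidation pass via the reflect tool.
    The scope values 'incremental' and 'full' are assumed, not confirmed."""
    scope = 'full' if full_run else 'incremental'
    return client.reflect(scope=scope)
```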
