# How to Combine Recency and Relevance in Ranking
## Before You Start
You need two inputs for each candidate: a relevance score (from vector similarity, cross-encoder, or similar) and a timestamp (creation date, last update, or last access time). If you are using Adaptive Recall, base-level activation already captures recency along with access frequency, providing a richer signal than simple timestamp-based recency. This guide covers both approaches so you can choose the right one for your system.
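For concreteness, a candidate might carry both inputs in a record like the one below. The field names are illustrative, not a required schema:

```python
import time

# Illustrative candidate record; field names are assumptions, not a fixed schema.
candidate = {
    "text": "User prefers dark mode in the editor",
    "relevance": 0.82,                      # similarity score from the retriever
    "created_at": time.time() - 7 * 86400,  # unix timestamp, about 7 days old
}
```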
## Step-by-Step Implementation
Start with whatever relevance scoring you already have. Vector cosine similarity typically returns values between 0 and 1. Cross-encoder scores might be on a different scale and need normalization. The key requirement is that all candidates have a relevance score on the same scale, so you can blend it meaningfully with the recency score.
```python
def normalize_relevance(similarity_score):
    # Cosine similarity is already in the 0-1 range for normalized vectors;
    # clamp to guard against scores from other sources.
    return max(0.0, min(1.0, similarity_score))
```

The simplest recency score is a function of the time elapsed since the memory was created or last updated. Newer memories score higher, older memories score lower. The challenge is choosing the right function shape: how fast should the score decrease, and should it ever reach zero? Three common approaches are linear decay, exponential decay, and power-law decay.
```python
import math
import time

def recency_linear(timestamp, half_life_days=30):
    age_days = (time.time() - timestamp) / 86400
    # Drops steadily, reaching zero at twice the half-life.
    return max(0.0, 1.0 - (age_days / (2 * half_life_days)))

def recency_exponential(timestamp, half_life_days=30):
    age_days = (time.time() - timestamp) / 86400
    # math.log(2) ≈ 0.693, so the score halves every half_life_days.
    return math.exp(-math.log(2) * age_days / half_life_days)

def recency_power_law(timestamp, decay=0.5):
    age_seconds = max(time.time() - timestamp, 1.0)
    # Normalize against a 1-day reference point. Scores exceed 1.0 for
    # memories younger than a day, so clamp downstream.
    return (86400 / age_seconds) ** decay
```

Linear decay drops steadily to zero at twice the half-life. It is simple and predictable, but memories disappear from scoring entirely after a fixed period. Exponential decay drops quickly at first, then slowly, and never reaches zero; it suits domains where old information may still be relevant but should rank lower. Power-law decay (which ACT-R uses) also drops quickly at first but preserves a longer tail than exponential, making it the closest match to human forgetting curves. For most AI memory applications, power-law decay produces the best results.
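The tail difference is easy to check numerically. This standalone sketch compares exponential and power-law scores at 30 and 180 days, using a 30-day half-life and a power-law decay of 0.5 as above:

```python
import math

half_life_days = 30
for age_days in (30.0, 180.0):
    # Exponential: halves every half-life.
    exp_score = math.exp(-math.log(2) * age_days / half_life_days)
    # Power-law with d = 0.5, normalized to a 1-day reference.
    power_score = (1 / age_days) ** 0.5
    print(f"{age_days:>5.0f}d  exp={exp_score:.3f}  power={power_score:.3f}")
```

At 30 days exponential is higher (0.5 vs. about 0.18), but at 180 days power-law retains roughly 0.075 while exponential has fallen below 0.016: the longer tail in action.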
Combine relevance and recency using weighted addition. The weights determine how much each factor influences the final ranking. A typical starting point is 60 percent relevance and 40 percent recency, which means a highly relevant but older memory can still outrank a less relevant but newer one, but not by much.
```python
def combined_score(relevance, timestamp, relevance_weight=0.6,
                   recency_weight=0.4, decay_fn=recency_power_law):
    rel = normalize_relevance(relevance)
    rec = decay_fn(timestamp)
    # Power-law recency can exceed 1.0 for very fresh memories,
    # so clamp it to the 0-1 range before blending.
    rec_norm = min(rec, 1.0)
    return (relevance_weight * rel) + (recency_weight * rec_norm)
```

Some memories should be exempt from recency decay. Core reference documents, architecture decisions, and foundational knowledge remain valid indefinitely. Mark these with a flag (such as a high confidence score or a "protected" status) and set their recency score to 1.0 regardless of age. This prevents important, stable knowledge from being pushed out of results by newer but less authoritative content.
```python
def combined_score_with_protection(memory, relevance, decay_fn):
    rel = normalize_relevance(relevance)
    if memory.get('confidence', 0) >= 8.0:
        # High-confidence memories are protected from recency decay.
        rec_norm = 1.0
    else:
        rec = decay_fn(memory['created_at'])
        rec_norm = min(rec, 1.0)
    return (0.6 * rel) + (0.4 * rec_norm)
```

The right recency-to-relevance ratio depends on how quickly information changes in your domain. Customer support and product documentation change frequently, so increase the recency weight to 50 or 60 percent. Legal research and academic knowledge change slowly, so decrease the recency weight to 20 or 30 percent. Personal assistants fall in between. The best way to tune is to build a test set of queries where you know the ideal ranking, then adjust the weights until the results match.
| Domain | Relevance Weight | Recency Weight | Rationale |
|---|---|---|---|
| Customer support | 40-50% | 50-60% | Product info changes with each release |
| Development tools | 50-60% | 40-50% | Code patterns evolve but core concepts persist |
| Personal assistant | 55-65% | 35-45% | Mix of stable preferences and changing context |
| Research/legal | 70-80% | 20-30% | Historical information often still valid |
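One way to tune is a small grid search over weight splits against a labeled test set. Everything below is a hypothetical sketch: the test-set shape, the pre-computed `relevance` and `recency` fields, and the top-1 accuracy metric are assumptions, not a fixed API:

```python
def rank_candidates(candidates, relevance_weight):
    # Score each candidate as a weighted blend of its pre-computed
    # relevance and recency scores (both already in the 0-1 range).
    recency_weight = 1.0 - relevance_weight
    return sorted(
        candidates,
        key=lambda c: relevance_weight * c["relevance"] + recency_weight * c["recency"],
        reverse=True,
    )

def top1_accuracy(test_queries, relevance_weight):
    # Fraction of queries where the known-best candidate ranks first.
    hits = sum(
        1 for q in test_queries
        if rank_candidates(q["candidates"], relevance_weight)[0]["id"] == q["ideal_top"]
    )
    return hits / len(test_queries)

# Hypothetical labeled test set: each query lists scored candidates
# plus the id of the candidate a human judged to be the ideal top result.
test_queries = [
    {"ideal_top": "b",
     "candidates": [
         {"id": "a", "relevance": 0.90, "recency": 0.10},
         {"id": "b", "relevance": 0.72, "recency": 0.90},
     ]},
    {"ideal_top": "c",
     "candidates": [
         {"id": "c", "relevance": 0.85, "recency": 0.50},
         {"id": "d", "relevance": 0.55, "recency": 0.90},
     ]},
]

weights = [w / 10 for w in range(2, 9)]  # try relevance weights 0.2 through 0.8
best = max(weights, key=lambda w: top1_accuracy(test_queries, w))
print(f"best relevance weight: {best:.1f}")
```

With a realistic test set of a few dozen queries, a metric like NDCG over the full ranking is a better objective than top-1 accuracy, but the loop is the same.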
## Beyond Simple Recency: Base-Level Activation
Simple timestamp-based recency has a limitation: it only considers when the memory was created (or last updated), not how often it has been accessed. A memory created a month ago that has been retrieved twenty times is clearly more important than a memory created yesterday that has never been used, but simple recency scoring ranks yesterday's memory higher.
Base-level activation from ACT-R solves this by incorporating both recency and frequency. Every access event contributes to the activation score, weighted by how recently it occurred. This means a frequently accessed memory maintains high activation even as it ages, while a rarely accessed memory decays quickly. Adaptive Recall uses base-level activation instead of simple recency, which produces more accurate rankings for memory stores with significant usage history.
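A minimal sketch of the idea, using the standard ACT-R base-level learning equation B = ln(Σⱼ tⱼ⁻ᵈ) with decay d = 0.5. The function name and the example timestamps are illustrative, not Adaptive Recall's actual API:

```python
import math

def base_level_activation(access_times, now, decay=0.5):
    # ACT-R base-level learning: B = ln(sum over accesses of t_j ** -d),
    # where t_j is the time elapsed since the j-th access (seconds here).
    ages = [max(now - t, 1.0) for t in access_times]
    return math.log(sum(age ** -decay for age in ages))

now = 1_750_000_000  # fixed "current time" so the comparison is reproducible
day = 86400

# Created 30 days ago, retrieved 20 times since (roughly every 1.5 days).
frequent_old = [now - (1.5 * k + 1) * day for k in range(20)]
# Created yesterday, never retrieved.
fresh_unused = [now - day]

print(base_level_activation(frequent_old, now))  # old but frequently used: higher
print(base_level_activation(fresh_unused, now))  # fresh but unused: lower
```

Because every access adds a term to the sum, the old-but-frequently-used memory ends up with higher activation than the fresh-but-unused one, which is exactly the ordering simple timestamp recency gets wrong.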
Get recency-aware retrieval without building the scoring pipeline. Adaptive Recall combines base-level activation with vector relevance on every query.