
How to Combine Recency and Relevance in Ranking

Combining recency and relevance means blending "how well does this match the query" with "how current is this information" into a single ranking score. Pure relevance ignores that information becomes outdated. Pure recency ignores that the most recent result might not be the best answer. The right blend depends on your domain: fast-changing domains need more recency weight, stable domains need more relevance weight.

Before You Start

You need two inputs for each candidate: a relevance score (from vector similarity, cross-encoder, or similar) and a timestamp (creation date, last update, or last access time). If you are using Adaptive Recall, base-level activation already captures recency along with access frequency, providing a richer signal than simple timestamp-based recency. This guide covers both approaches so you can choose the right one for your system.

Step-by-Step Implementation

Step 1: Compute semantic relevance scores.
Start with whatever relevance scoring you already have. Vector cosine similarity typically returns values between 0 and 1. Cross-encoder scores might be on a different scale and need normalization. The key requirement is that all candidates have a relevance score on the same scale, so you can blend it meaningfully with the recency score.
def normalize_relevance(similarity_score):
    # cosine similarity is already 0-1 for normalized vectors
    return max(0.0, min(1.0, similarity_score))
Step 2: Compute recency scores.
The simplest recency score is a function of time elapsed since the memory was created or last updated. Newer memories score higher, older memories score lower. The challenge is choosing the right function shape: how fast should the score decrease, and should it ever reach zero? Three common approaches are linear decay, exponential decay, and power-law decay.
import time
import math

def recency_linear(timestamp, half_life_days=30):
    age_days = (time.time() - timestamp) / 86400
    score = max(0.0, 1.0 - (age_days / (2 * half_life_days)))
    return score

def recency_exponential(timestamp, half_life_days=30):
    age_days = (time.time() - timestamp) / 86400
    return math.exp(-0.693 * age_days / half_life_days)

def recency_power_law(timestamp, decay=0.5):
    age_seconds = max(time.time() - timestamp, 1.0)
    # normalize against a 1-day reference point
    return (86400 / age_seconds) ** decay
Step 3: Choose a decay function.
Linear decay drops steadily to zero at twice the half-life. It is simple and predictable but means memories completely disappear from scoring after a fixed period. Exponential decay drops quickly at first, then slowly, never reaching zero. It works well for domains where old information might still be relevant but should rank lower. Power-law decay (which ACT-R uses) drops quickly at first but preserves a longer tail than exponential, matching human forgetting curves most accurately. For most AI memory applications, power-law decay produces the best results.
Why power-law over exponential? Exponential decay treats "3 months old" and "6 months old" as very different, making the older memory essentially disappear. Power-law decay preserves more gradation between old and very old memories, which better reflects the fact that some old information remains valuable. This distinction matters most for knowledge bases that accumulate information over months or years.
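To make that concrete, here is a quick numeric comparison using the Step 2 functions with their default parameters (printed values are approximate):

# Compare decay shapes at 3 months vs. 6 months of age (half_life_days=30)
now = time.time()
for age_days in (90, 180):
    ts = now - age_days * 86400
    print(f"{age_days}d  exp={recency_exponential(ts):.3f}  "
          f"pow={recency_power_law(ts):.3f}")
# 90d:  exp ~0.125, pow ~0.105
# 180d: exp ~0.016 (an 8x drop), pow ~0.075 (only a 1.4x drop)

Exponential pushes the six-month-old memory toward zero, while power-law keeps it within striking distance of the three-month-old one.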
Step 4: Blend the scores.
Combine relevance and recency using weighted addition. The weights determine how much each factor influences the final ranking. A typical starting point is 60 percent relevance and 40 percent recency: relevance dominates, so a highly relevant but older memory can still outrank a less relevant but newer one, though usually by a narrow margin.
def combined_score(relevance, timestamp, relevance_weight=0.6,
                   recency_weight=0.4, decay_fn=recency_power_law):
    rel = normalize_relevance(relevance)
    rec = decay_fn(timestamp)
    # normalize power-law recency to 0-1 range
    rec_norm = min(rec, 1.0)
    return (relevance_weight * rel) + (recency_weight * rec_norm)
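A quick sanity check with two hypothetical candidates shows the intended behavior (scores are approximate):

now = time.time()
older = combined_score(0.90, now - 30 * 86400)  # strong match, a month old: ~0.61
newer = combined_score(0.75, now - 10 * 86400)  # weaker match, ten days old: ~0.58
# The older memory's relevance edge outweighs its recency penalty.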
Step 5: Handle edge cases.
Some memories should be exempt from recency decay. Core reference documents, architecture decisions, and foundational knowledge remain valid indefinitely. Mark these with a flag (like a high confidence score or a "protected" status) and set their recency score to 1.0 regardless of age. This prevents important, stable knowledge from being pushed out of results by newer but less authoritative content.
def combined_score_with_protection(memory, relevance, decay_fn):
    rel = normalize_relevance(relevance)
    if memory.get('confidence', 0) >= 8.0:
        # high-confidence memories are protected from recency decay
        rec_norm = 1.0
    else:
        rec = decay_fn(memory['created_at'])
        rec_norm = min(rec, 1.0)
    return (0.6 * rel) + (0.4 * rec_norm)
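For example, given two hypothetical memories with identical relevance and age, the protection flag alone changes the ranking (scores approximate):

now = time.time()
adr = {'confidence': 9.0, 'created_at': now - 365 * 86400}   # protected
note = {'confidence': 3.0, 'created_at': now - 365 * 86400}  # decays normally
combined_score_with_protection(adr, 0.8, recency_power_law)   # ~0.88
combined_score_with_protection(note, 0.8, recency_power_law)  # ~0.50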
Step 6: Tune weights per domain.
The right recency-to-relevance ratio depends on how quickly information changes in your domain. Customer support and product documentation change frequently, so increase recency weight to 50 or 60 percent. Legal research and academic knowledge change slowly, so decrease recency weight to 20 or 30 percent. Personal assistants fall in between. The best way to tune is to build a test set of queries where you know the ideal ranking and adjust weights until the test results match; a minimal sketch of that loop appears after the table below.
Domain             | Relevance Weight | Recency Weight | Rationale
Customer support   | 40-50%           | 50-60%         | Product info changes with each release
Development tools  | 50-60%           | 40-50%         | Code patterns evolve but core concepts persist
Personal assistant | 55-65%           | 35-45%         | Mix of stable preferences and changing context
Research/legal     | 70-80%           | 20-30%         | Historical information often still valid
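As a rough illustration of that tuning loop, here is a minimal sketch. The tune_weights name and the test-set shape are hypothetical: each entry pairs a list of (relevance, timestamp) candidates with the index of the candidate a human judged best.

def tune_weights(test_set, steps=11):
    best_w, best_hits = 0.6, -1
    for i in range(steps):
        w_rel = i / (steps - 1)              # sweep relevance weight 0.0 .. 1.0
        hits = 0
        for candidates, ideal_index in test_set:
            scores = [combined_score(rel, ts, w_rel, 1.0 - w_rel)
                      for rel, ts in candidates]
            # count queries where the top-scored candidate matches the label
            if scores.index(max(scores)) == ideal_index:
                hits += 1
        if hits > best_hits:
            best_w, best_hits = w_rel, hits
    return best_w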

Beyond Simple Recency: Base-Level Activation

Simple timestamp-based recency has a limitation: it only considers when the memory was created (or last updated), not how often it has been accessed. A memory created a month ago that has been retrieved twenty times is clearly more important than a memory created yesterday that has never been used, but simple recency scoring ranks the yesterday memory higher.

Base-level activation from ACT-R solves this by incorporating both recency and frequency. Every access event contributes to the activation score, weighted by how recently it occurred. This means a frequently accessed memory maintains high activation even as it ages, while a rarely accessed memory decays quickly. Adaptive Recall uses base-level activation instead of simple recency, which produces more accurate rankings for memory stores with significant usage history.
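In the standard ACT-R formulation, base-level activation is B = ln(sum over accesses j of t_j^(-d)), where t_j is the time elapsed since the j-th access and d is a decay parameter, typically 0.5. Here is a minimal sketch of that textbook form (not Adaptive Recall's exact implementation, whose parameters are internal):

def base_level_activation(access_times, decay=0.5):
    # access_times: Unix timestamps for every time this memory was retrieved
    now = time.time()
    ages = [max(now - t, 1.0) for t in access_times]  # seconds since each access
    return math.log(sum(age ** -decay for age in ages))

Each access adds a term, so a memory retrieved twenty times over a month accumulates far more activation than one retrieved once yesterday. Note that the result is unbounded and can be negative, so it needs its own normalization before being blended with a 0-1 relevance score.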

Get recency-aware retrieval without building the scoring pipeline. Adaptive Recall combines base-level activation with vector relevance on every query.

Get Started Free