How to Add Source Citations to AI Responses
Before You Start
Source citations only work when you have source material for the model to cite. This technique applies to RAG systems, memory-grounded systems, and any application where the model generates responses based on retrieved context. It does not apply to purely parametric generation where the model answers from its training data alone, because there is no retrievable source to point to. You need a knowledge base, document collection, or memory store with content that can be identified and linked.
You also need to decide on a citation format that works for your application. Inline citations (bracketed references within the text) work well for long-form responses where users need to verify specific claims. End-of-response source lists work well for short responses where inline references would be disruptive. Some applications use both: inline markers for specific claims plus a full source list at the bottom.
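For example, an application using both styles might render a response like this (the claims and the second source name are purely illustrative):

```
API keys expire after 90 days [Source 1] and can be rotated from the
dashboard without downtime [Source 1, Source 2].

Sources:
1. Product Docs: Authentication, Section 3
2. Product Docs: Key Rotation, Section 1
```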
Step-by-Step Implementation
Every document, chunk, or memory in your knowledge base needs a stable identifier that the model can use in citations and that users can understand when they see it. Raw IDs like "chunk_a7f3b2c1" are useless to users. Meaningful identifiers like "Product Docs: Authentication, Section 3" or "Memory: Project Setup (2026-03-15)" give users enough context to understand what the source is without clicking through. Generate these identifiers when you index content, and include them in the metadata that gets passed to the model during retrieval.
```python
# Tag sources during indexing
def index_document(doc, collection):
    chunks = chunk_document(doc)
    for i, chunk in enumerate(chunks):
        chunk.source_id = f"{doc.title}, Section {i+1}"
        chunk.source_url = f"/docs/{doc.slug}#section-{i+1}"
        collection.add(chunk)
```

When you pass retrieved documents to the model, label each chunk with its source identifier. The model needs to see the identifier alongside the content so it can reference it in its response. Format the context block clearly, with each source separated and labeled.
```python
def format_context_with_sources(retrieved_chunks):
    context = "SOURCES:\n\n"
    for i, chunk in enumerate(retrieved_chunks):
        context += f"[Source {i+1}: {chunk.source_id}]\n"
        context += f"{chunk.content}\n\n"
    return context
```

Tell the model exactly how to cite sources, what format to use, and when citations are required. Be specific: "cite sources" is too vague. Specify the bracket format, require citations for every factual claim, and instruct the model on how to handle claims that are not supported by any source.
```python
CITATION_PROMPT = """Answer the user's question using the
sources provided. Follow these citation rules:

1. After every factual claim, add a citation in brackets
   referencing the source: [Source 1], [Source 2], etc.
2. Every factual claim MUST have at least one citation.
3. If you synthesize information from multiple sources,
   cite all of them: [Source 1, Source 3]
4. If the sources do not support a claim, do not make it.
   Say what the sources do cover and note any gaps.
5. Do not fabricate source references. Only cite sources
   that appear in the SOURCES section above.

{context_with_sources}

Question: {user_query}"""
```
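The prompt leaves two placeholders, {context_with_sources} and {user_query}, to be filled at query time. A minimal sketch of the wiring is below; the answer_with_citations name and the llm_client object are illustrative placeholders, not part of any particular SDK.

```python
# Hypothetical glue code showing how the pieces fit together.
# `llm_client.generate` stands in for whatever LLM SDK you use;
# retrieval is assumed to have happened already.
def answer_with_citations(user_query, retrieved_chunks, llm_client):
    prompt = CITATION_PROMPT.format(
        context_with_sources=format_context_with_sources(retrieved_chunks),
        user_query=user_query,
    )
    return llm_client.generate(prompt)  # placeholder model call
```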
After the model generates a response, extract all citation references and verify two things: that each citation references a source that was actually provided (the model did not fabricate a source reference), and that the cited source actually supports the claim it is attached to. The first check is simple string matching. The second check requires comparing the claim to the cited passage using semantic similarity or entailment classification. Flag any citations that fail either check.

```python
import re

def validate_citations(response, provided_sources):
    # Match single and multi-source references, e.g. [Source 2],
    # [Source 1, 3], or [Source 1, Source 3]
    citation_pattern = r'\[Source (\d+(?:,\s*(?:Source\s*)?\d+)*)\]'
    issues = []

    # Check 1: every reference points at a source that was provided
    for citation_match in re.findall(citation_pattern, response):
        source_nums = [int(n) for n in re.findall(r'\d+', citation_match)]
        for num in source_nums:
            if num > len(provided_sources) or num < 1:
                issues.append({
                    "type": "fabricated_source",
                    "reference": f"Source {num}"
                })

    # Check 2: every cited source actually supports its claim
    sentences = split_into_sentences(response)
    for sentence in sentences:
        groups = re.findall(citation_pattern, sentence)
        refs = [int(n) for g in groups for n in re.findall(r'\d+', g)]
        claim = re.sub(citation_pattern, '', sentence).strip()
        for num in refs:
            if num < 1 or num > len(provided_sources):
                continue  # already reported as fabricated above
            source = provided_sources[num - 1]
            if not check_entailment(claim, source.content):
                issues.append({
                    "type": "unsupported_citation",
                    "claim": claim,
                    "cited_source": num
                })
    return issues
```
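The validator leans on two helpers that are not defined above: split_into_sentences and check_entailment. A minimal sketch of both follows, assuming the sentence-transformers CrossEncoder with a public NLI checkpoint; any entailment classifier or similarity scorer can stand in.

```python
# Sketch of the assumed helpers. The NLI checkpoint and its label order
# follow the cross-encoder/nli-deberta-v3-base model card; verify the
# label mapping if you swap models.
import re
from sentence_transformers import CrossEncoder

_nli_model = CrossEncoder("cross-encoder/nli-deberta-v3-base")
_NLI_LABELS = ["contradiction", "entailment", "neutral"]

def check_entailment(claim, source_text):
    # Treat the cited passage as the premise and the claim as the
    # hypothesis; accept only if the model predicts entailment.
    scores = _nli_model.predict([(source_text, claim)])
    return _NLI_LABELS[int(scores.argmax(axis=1)[0])] == "entailment"

def split_into_sentences(text):
    # Naive regex splitter; a library like nltk or spacy is more robust.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
```

Scoring every claim adds a model call per sentence, so you may prefer to run the entailment check asynchronously or on a sample of traffic.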
Transform the citation references in the response into interactive elements that let users access the source material. In a web interface, citations become links that open the source document at the relevant passage. In a chat interface, citations might expand to show a preview of the source when clicked. The key is making verification effortless: a user who wants to check a claim should be able to do so with a single click, not by searching through documents manually.

```python
def render_citations(response, sources):
    def replace_citation(match):
        num = int(match.group(1))
        source = sources[num - 1]
        # Inline marker becomes a link to the cited passage; the markup
        # here is illustrative, emit whatever your frontend expects
        return (f'<a href="{source.source_url}" '
                f'title="{source.source_id}">[{num}]</a>')

    # Multi-source groups like [Source 1, 3] would need similar handling
    rendered = re.sub(
        r'\[Source (\d+)\]',
        replace_citation,
        response
    )

    # Add source list at bottom
    rendered += '\n\n<h4>Sources</h4>\n<ul>\n'
    for s in sources:
        rendered += f'<li><a href="{s.source_url}">{s.source_id}</a></li>\n'
    rendered += '</ul>\n'
    return rendered
```

Citation Quality Metrics
Track three metrics to evaluate your citation pipeline. Citation coverage measures the percentage of factual claims in responses that have at least one citation. Aim for 90% or higher. Citation precision measures the percentage of citations where the cited source actually supports the attributed claim. Aim for 85% or higher. Citation fabrication rate measures how often the model references sources that were not provided. This should be below 2%, and any fabricated citations should be caught by the validation step before reaching users.
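A rough way to compute all three from the validator's output is sketched below. It treats every sentence as a factual claim, which is a crude proxy, and the log structure is an assumption rather than part of the pipeline above.

```python
# Rough metric computation over a batch of logged responses. Assumes each
# entry pairs a response string with the issues list returned by
# validate_citations; uses split_into_sentences from the validation step.
import re

def citation_metrics(logged):
    total_claims = cited_claims = total_citations = 0
    fabricated = unsupported = 0
    for response, issues in logged:
        for sentence in split_into_sentences(response):
            groups = re.findall(r'\[Source (\d+(?:,\s*(?:Source\s*)?\d+)*)\]',
                                sentence)
            nums = [n for g in groups for n in re.findall(r'\d+', g)]
            total_claims += 1
            cited_claims += bool(nums)
            total_citations += len(nums)
        fabricated += sum(1 for i in issues if i["type"] == "fabricated_source")
        unsupported += sum(1 for i in issues
                           if i["type"] == "unsupported_citation")
    return {
        "citation_coverage": cited_claims / max(total_claims, 1),
        "citation_precision": 1 - unsupported / max(total_citations, 1),
        "fabrication_rate": fabricated / max(total_citations, 1),
    }
```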
If citation coverage is low (the model generates uncited claims frequently), strengthen the citation instructions in your prompt and add examples of properly cited responses. If citation precision is low (the model cites real sources but misattributes their content), add the entailment verification step and consider switching to a more capable model for generation. If the fabrication rate is high, the model is not following the citation instructions reliably, which usually means the prompt needs to be more explicit or the model needs to be swapped for one that follows instructions more carefully.
Build AI that shows its work. Adaptive Recall provides source-attributed memories with confidence scores and entity links, giving your citation pipeline verified facts to reference.