How to Build Multi-Turn Conversation Flows
Before You Start
You need a clear understanding of the business process your flow will implement. Map it on paper first: what information needs to be collected, in what order, what decisions branch the flow, and what the possible outcomes are. Talk to the people who currently handle this process manually (customer service agents, salespeople, account managers) because they know the edge cases and common sticking points that formal process documentation misses. You also need a state persistence mechanism (Redis, database, or session storage) because multi-turn flows must survive page refreshes, network interruptions, and session timeouts.
Step-by-Step Implementation
Model your conversation flow as a finite state machine where each state represents a point in the conversation, and transitions between states are triggered by user inputs or system decisions. Each state has: a name (descriptive, like "collect_shipping_address" or "confirm_refund_amount"), the information it needs to collect (the "slot" to fill), the prompt or question to ask the user, validation rules for the user's response, and outgoing transitions (which state to go to next, based on the validated response). Draw this as a diagram before writing code. Even a simple flow like "collect name, collect email, confirm, submit" has edge cases: what if the user provides both name and email in a single message? What if they want to change their name after providing their email? What if they ask an unrelated question in the middle of the flow? Your state machine design should account for these scenarios.
REFUND_FLOW = {
    "start": {
        "prompt": "I can help with a refund. What is the order number?",
        "slot": "order_number",
        "validate": lambda v: v.strip().startswith("ORD-") and len(v) > 6,
        "error_msg": "Order numbers start with ORD- followed by digits.",
        "next": "lookup_order"
    },
    "lookup_order": {
        "action": "lookup_order_details",
        "branches": {
            "found": "confirm_item",
            # Terminal states "order_not_found" and "already_refunded"
            # are omitted here for brevity.
            "not_found": "order_not_found",
            "already_refunded": "already_refunded"
        }
    },
    "confirm_item": {
        "prompt": "I found order {order_number} for {item_name} ({amount}). "
                  "Is this the order you want to refund?",
        "slot": "confirmed",
        # Normalize "y"/"n" to "yes"/"no" before the branch lookup below.
        "validate": lambda v: v.lower() in ["yes", "no", "y", "n"],
        "branches": {
            "yes": "select_reason",
            "no": "start"
        }
    },
    "select_reason": {
        "prompt": "What is the reason for the refund? Options: "
                  "defective, wrong item, changed mind, other.",
        "slot": "reason",
        # Accepted values must match the options listed in the prompt.
        "validate": lambda v: v.lower() in ["defective", "wrong item",
                                            "changed mind", "other"],
        "next": "process_refund"
    },
    "process_refund": {
        "action": "submit_refund",
        "next": "confirmation"
    },
    "confirmation": {
        "prompt": "Your refund for {amount} has been submitted. "
                  "You should see the credit within 5-7 business days.",
        "terminal": True
    }
}

Slot filling tracks what information has been collected and what remains. Each slot has a name, a type (text, number, enum, date, boolean), validation rules, and an optional extraction function that can pull the value from natural language rather than requiring the user to respond in a specific format. The extraction function is important because users do not answer one question at a time. A user asked "What is your name?" might respond "I'm Sarah Chen and my email is sarah@example.com," filling two slots in one message. Your slot filling logic should attempt to extract all defined slots from every message, not just the slot currently being asked about. When a user provides multiple pieces of information at once, acknowledge all of them and skip to the first unfilled slot rather than asking questions they have already answered.
import json

class SlotFiller:
    def __init__(self, flow_definition):
        self.slots = {}
        self.filled = {}
        for state in flow_definition.values():
            if "slot" in state:
                self.slots[state["slot"]] = {
                    "filled": False,
                    "value": None,
                    "validate": state.get("validate"),
                    "error_msg": state.get("error_msg", "Invalid input.")
                }

    async def extract_from_message(self, message, llm_client):
        extraction_prompt = f"""Extract any of these values from the message:
        {json.dumps(list(self.slots.keys()))}
        Message: {message}
        Return a JSON object with slot names as keys and extracted values.
        Only include slots where a value is clearly stated."""
        result = await llm_client.extract(extraction_prompt)
        extracted = json.loads(result)
        for slot_name, value in extracted.items():
            if slot_name in self.slots:
                validator = self.slots[slot_name]["validate"]
                if validator is None or validator(str(value)):
                    self.filled[slot_name] = value
                    self.slots[slot_name]["filled"] = True
                    self.slots[slot_name]["value"] = value
        return self.filled

    def next_unfilled(self):
        for name, slot in self.slots.items():
            if not slot["filled"]:
                return name
        return None

Branch points are states where the next state depends on the user's response, a system lookup, or business logic. Define branches as a mapping from conditions to target states. Conditions can be simple (the user said "yes" vs "no"), value-based (the order amount is above or below the auto-refund threshold), or logic-based (the user is a premium subscriber and eligible for instant processing). Keep branching logic outside the conversation generation layer: the LLM should not decide which branch to take for critical business flows. Instead, the state machine evaluates the branching condition deterministically and tells the LLM which state to execute next. This prevents the model from hallucinating eligibility or skipping required steps in regulated processes.
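Deterministic branch evaluation can be sketched as a small resolver that runs before any LLM call. This is a minimal sketch, not a prescribed API: `evaluate_branch`, the `AUTO_REFUND_THRESHOLD` value, and the branch keys are illustrative assumptions.

```python
AUTO_REFUND_THRESHOLD = 100.00  # illustrative business rule, not from the flow above

def evaluate_branch(state, context):
    """Pick the next state deterministically; the LLM never chooses."""
    branches = state["branches"]
    if "confirmed" in context:  # simple yes/no branch
        return branches["yes" if context["confirmed"] else "no"]
    if "amount" in context:     # value-based branch
        key = "auto" if context["amount"] <= AUTO_REFUND_THRESHOLD else "manual_review"
        return branches[key]
    raise ValueError("No branching condition matched")

state = {"branches": {"auto": "process_refund", "manual_review": "escalate"}}
next_state = evaluate_branch(state, {"amount": 42.50})  # -> "process_refund"
```

Because the resolver returns a plain state name, the generation layer only ever receives an already-decided next step, which is what keeps the model out of eligibility decisions.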
Users will provide invalid inputs, ambiguous responses, and completely off-topic messages during a flow. Each requires different handling. For invalid inputs (a non-numeric value when a number is required), provide a clear error message and re-ask the same question. Limit retries to 3 attempts before offering an alternative (free-text input that you parse with an LLM, or escalation to a human). For ambiguous responses ("maybe" when you need yes or no), ask a clarifying question that narrows the options. For off-topic messages (the user asks "what time is it?" mid-flow), answer the off-topic question briefly and then redirect back to the flow: "It is 3:15 PM. Going back to your refund, could you confirm the order number?" Never ignore off-topic messages by pretending the user did not say them, because that destroys trust. Acknowledge and redirect.
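The retry policy above can be sketched as a short loop. The helper names here (`collect_slot`, `get_reply`) are placeholders for your own I/O layer, not an established API.

```python
MAX_RETRIES = 3

def collect_slot(prompt, validate, error_msg, get_reply):
    """Re-ask on invalid input, up to MAX_RETRIES, then signal escalation."""
    for attempt in range(MAX_RETRIES):
        # Prepend the error message on every attempt after the first.
        reply = get_reply(prompt if attempt == 0 else f"{error_msg} {prompt}")
        if validate(reply):
            return {"status": "filled", "value": reply}
    # Three failed attempts: hand off rather than looping forever.
    return {"status": "escalate", "value": None}

# Simulated user who answers wrongly twice, then correctly.
replies = iter(["abc", "12a", "ORD-123456"])
result = collect_slot(
    "What is the order number?",
    lambda v: v.startswith("ORD-"),
    "Order numbers start with ORD-.",
    lambda _prompt: next(replies),
)
```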
As a multi-turn flow progresses, the context sent to the LLM should include: the flow definition (what steps exist and what the current step is), all filled slots with their values, the last 3 to 5 messages of conversation history, and any data retrieved by system actions (order details, account information). Do not include the entire conversation history from the beginning of the flow, because early turns (greeting, initial question) are not relevant to the current step and waste tokens. The system prompt should instruct the model to stay within the flow's current step, use the filled slot values naturally in its responses, and not jump ahead to future steps or revisit completed steps unless the user explicitly asks to go back.
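One way to assemble that per-turn context is a small builder function. The field names below are assumptions for illustration, not a required schema.

```python
def build_llm_context(flow, current_state, filled_slots, history, retrieved, window=5):
    """Assemble only what the current step needs; drop early history."""
    return {
        "system": (
            "Stay within the current step of the flow. Use the filled values "
            "naturally. Do not jump ahead to future steps or revisit completed "
            "steps unless the user explicitly asks to go back."
        ),
        "flow_steps": list(flow.keys()),
        "current_step": current_state,
        "filled_slots": filled_slots,
        "recent_messages": history[-window:],  # last 3-5 turns, not the whole session
        "retrieved_data": retrieved,
    }

ctx = build_llm_context(
    {"start": {}, "confirm_item": {}},
    "confirm_item",
    {"order_number": "ORD-123456"},
    [{"role": "user", "content": f"msg {i}"} for i in range(10)],
    {"item_name": "Headphones"},
)
```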
Users leave mid-flow for many reasons: they get distracted, they need to find information (like an order number), or they decide to continue later. Your flow should persist its state so that when the user returns, they can resume from where they left off rather than starting over. Serialize the current state (current step, filled slots, retrieved data) to persistent storage keyed by user ID plus flow ID. When the user returns, detect whether they have an in-progress flow and offer to resume: "You were in the middle of requesting a refund for order ORD-12345. Would you like to continue from where you left off?" If the user says yes, restore the state and proceed from the next unfilled slot. If the user says no or does not respond to the resumption prompt, archive the flow state and start fresh. Set an expiration on saved flow states (typically 24 to 72 hours) after which the partial flow is abandoned.
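A minimal sketch of that save-and-resume logic, using an in-memory dict standing in for Redis or a database table (the 48-hour TTL is one arbitrary point inside the 24-72 hour window mentioned above):

```python
import json
import time

STORE = {}                 # stands in for Redis or a database table
TTL_SECONDS = 48 * 3600    # within the suggested 24-72 hour window

def save_flow_state(user_id, flow_id, state):
    """Serialize the flow state, keyed by user ID plus flow ID."""
    key = f"{user_id}:{flow_id}"
    STORE[key] = {"state": json.dumps(state), "expires": time.time() + TTL_SECONDS}

def load_flow_state(user_id, flow_id):
    """Return the saved state, or None if missing or expired."""
    entry = STORE.get(f"{user_id}:{flow_id}")
    if entry is None or entry["expires"] < time.time():
        return None  # expired or never saved: start fresh
    return json.loads(entry["state"])

save_flow_state("u1", "refund", {"step": "select_reason",
                                 "slots": {"order_number": "ORD-12345"}})
resumed = load_flow_state("u1", "refund")
```

With a real Redis backend, the expiry check would instead be handled by setting a TTL on the key so abandoned states are cleaned up automatically.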
Multi-Turn Flows with Memory
Persistent memory transforms multi-turn flows from isolated procedures into personalized experiences. When a returning user starts a refund flow, memory can pre-fill slots with known information: the user's preferred contact method, their most recent order, their shipping address. This reduces the number of questions the chatbot needs to ask, sometimes cutting a 7-step flow to 3 steps. Memory also enables flow learning: if a user has completed the same flow multiple times, the system can streamline the process based on their patterns. A user who always selects "defective item" as the refund reason and always wants email confirmation can be fast-tracked through those steps.
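The pre-fill step can be as simple as intersecting the flow's slot names with remembered values before asking anything. This is a sketch with hypothetical slot and memory keys.

```python
def prefill_slots(slot_names, memory):
    """Fill any slot whose value is already known from memory."""
    return {name: memory[name] for name in slot_names if name in memory}

# Hypothetical memory for a returning user.
memory = {"reason": "defective", "contact_method": "email"}
filled = prefill_slots(["order_number", "reason", "contact_method"], memory)
# Only "order_number" still needs to be asked.
```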
Cross-session flow continuity is where memory provides the most dramatic improvement. Without memory, a user who says "I want to do the same thing as last time" gets a blank stare from the chatbot. With memory, the system can recall the last flow this user completed, identify the relevant parameters, and offer to repeat it with the same settings. This kind of learned efficiency is what makes users feel like the chatbot actually knows them, and it is only possible with a memory system that persists across sessions and retrieves contextually relevant information based on what the user is doing right now.
Build smarter flows with memory-powered context. Adaptive Recall pre-fills known information, learns from completed flows, and provides cross-session continuity that users love.
Try It Free