Chatbot Frameworks: Rasa vs Botpress vs Custom

Choosing a chatbot framework determines your development speed, deployment flexibility, and long-term maintenance burden. Rasa provides an open-source framework with full control over NLU and dialogue management. Botpress offers a visual builder with managed infrastructure and LLM integration. Custom-built solutions using direct LLM APIs give maximum flexibility at the cost of building everything yourself. Each approach serves different team sizes, technical requirements, and budget constraints.

Rasa: Open-Source Framework

Rasa is an open-source conversational AI framework that provides intent classification, entity extraction, dialogue management, and action execution in a self-hosted package. It was originally built around traditional NLU models (the DIET classifier, spaCy components, Transformer-based models) that you train on your own data, but recent versions have added LLM integration for handling open-ended conversations alongside the structured NLU pipeline.

The architecture centers on three components: the NLU pipeline (which processes user messages into intents and entities), the dialogue management model (which predicts the next action based on conversation state), and the action server (which executes custom code like API calls, database queries, and business logic). Conversation flows are defined in YAML files that specify training stories (example conversation paths) and rules (deterministic behaviors that override the learned dialogue model). This declarative approach means conversation logic is version-controlled, reviewable, and testable.
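The action-server side of this architecture can be sketched in plain Python. The example below mirrors the shape of a custom action's contract (receive tracker state, return events) without depending on the Rasa SDK; the `check_order_status` action, the slot names, and the order-lookup dict are all illustrative, not part of Rasa's API.

```python
# Conceptual sketch of the action-server contract: the server receives the
# tracker state (latest intent, slot values) and returns a list of events
# that Rasa applies to the conversation. Plain Python, no rasa_sdk import;
# all names here are illustrative.

ORDERS = {"A-1001": "shipped"}  # stand-in for a real database or API call

def action_check_order_status(tracker: dict) -> list[dict]:
    """Mirrors the shape of a custom action's run() method."""
    order_id = tracker.get("slots", {}).get("order_id")
    if order_id is None:
        return [{"event": "bot", "text": "Which order should I look up?"}]
    status = ORDERS.get(order_id, "not found")
    return [
        {"event": "bot", "text": f"Order {order_id} is {status}."},
        {"event": "slot", "name": "last_status", "value": status},
    ]

tracker = {"latest_intent": "ask_order_status", "slots": {"order_id": "A-1001"}}
events = action_check_order_status(tracker)
```

In a real deployment this function would live behind the action server's HTTP endpoint and the events would be serialized back to the Rasa server, but the input/output shape is the part worth internalizing.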

Rasa's strengths are control and data ownership. Everything runs on your infrastructure, your training data never leaves your servers, and you have full access to every component of the pipeline. You can customize the NLU model, add custom components to the processing pipeline, and implement any dialogue management logic in Python. For enterprises with data sovereignty requirements, healthcare applications with HIPAA constraints, or government projects with security clearance requirements, Rasa's self-hosted architecture is often the only viable option.

The weakness is complexity. Setting up Rasa requires machine learning knowledge for training the NLU model, DevOps skills for deploying and scaling the infrastructure (Rasa runs multiple services: the Rasa server, the action server, a tracker store, an event broker), and ongoing maintenance for retraining models as user patterns evolve. A team deploying Rasa should expect 2 to 4 weeks of initial setup and ongoing weekly maintenance for model tuning, flow updates, and infrastructure management. Rasa's community edition is free, but the enterprise edition (Rasa Pro) with advanced features costs $50,000 or more per year.

Memory support in Rasa is limited to tracker stores that record conversation events and slots that store extracted entities within a session. Cross-session memory requires custom implementation: building an external memory store, adding recall logic to your action server, and injecting recalled context into the conversation. Rasa does not provide built-in persistent memory, knowledge graph integration, or cognitive recall. You would need to integrate an external memory service like Adaptive Recall through the action server to add these capabilities.
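The custom work described above (external store, recall logic, context injection) can be sketched as follows. The dict-backed `MemoryStore` stands in for an external memory service such as Adaptive Recall reached over HTTP from the action server; its method names are assumptions for illustration, not a documented API, and the keyword-overlap scoring is a naive stand-in for semantic recall.

```python
# Sketch of cross-session memory added around Rasa: store memories per user,
# recall the most relevant ones for a query, and inject them into the
# conversation from the action server. All names are illustrative.
from collections import defaultdict

class MemoryStore:
    def __init__(self):
        self._memories = defaultdict(list)

    def store(self, user_id: str, text: str) -> None:
        self._memories[user_id].append(text)

    def recall(self, user_id: str, query: str, top_k: int = 3) -> list[str]:
        # Naive keyword overlap stands in for semantic/cognitive scoring.
        terms = set(query.lower().split())
        scored = [(len(terms & set(m.lower().split())), m)
                  for m in self._memories[user_id]]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [m for score, m in scored[:top_k] if score > 0]

store = MemoryStore()
store.store("user-42", "prefers email over phone support")
store.store("user-42", "is on the enterprise plan")

# Inside a custom action's run(): recall context, then use it in the reply.
recalled = store.recall("user-42", "which plan is this user on")
```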

Botpress: Visual Builder with LLM Integration

Botpress is a platform that combines a visual conversation flow builder with LLM-powered natural language understanding and generation. It provides a drag-and-drop interface for designing conversation flows, built-in channel integrations (web, WhatsApp, Slack, Teams, SMS), analytics dashboards, and managed cloud hosting. Recent versions have deeply integrated LLMs for intent detection, entity extraction, and response generation, moving beyond the traditional pattern-matching approach.

The architecture centers on the visual flow builder, where conversations are designed as node graphs with entry points, decision nodes, action nodes, and exit points. Each node can contain LLM prompts, conditional logic, API calls, or scripted responses. The Knowledge Base feature allows uploading documents that the chatbot can reference in conversations through built-in RAG. Variables and user profiles track information within and across sessions, and hooks allow custom code injection at various points in the conversation lifecycle.
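Under the visual builder, a flow like this executes as a walk over a node graph. The sketch below shows that execution model in miniature; the node shapes and field names are illustrative, not Botpress's internal format.

```python
# Minimal node-graph walker: each node either emits a response ("say") or
# branches on the detected intent ("branch"). Node structure is illustrative.

FLOW = {
    "entry": {"type": "say", "text": "Hi! Do you need billing or tech help?",
              "next": "route"},
    "route": {"type": "branch",
              "cases": {"billing": "billing_node", "tech": "tech_node"}},
    "billing_node": {"type": "say", "text": "Connecting you to billing.",
                     "next": None},
    "tech_node": {"type": "say", "text": "Let's troubleshoot together.",
                  "next": None},
}

def run_flow(flow: dict, user_intent: str) -> list[str]:
    """Walk the graph from the entry node, collecting bot responses."""
    responses, node_id = [], "entry"
    while node_id is not None:
        node = flow[node_id]
        if node["type"] == "say":
            responses.append(node["text"])
            node_id = node["next"]
        elif node["type"] == "branch":
            node_id = node["cases"].get(user_intent)
    return responses

replies = run_flow(FLOW, "billing")
```

A production flow engine adds LLM-backed nodes, variable scoping, and hooks, but the drag-and-drop canvas ultimately compiles down to a graph traversal like this one.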

Botpress's strengths are speed of deployment and accessibility. A non-technical team member can build a functional chatbot in hours using the visual builder, without writing code. The managed hosting eliminates infrastructure concerns, and the built-in analytics provide immediate visibility into conversation patterns, drop-off points, and user satisfaction. Channel integrations are pre-built, so deploying to WhatsApp or Slack requires configuration rather than development. For teams that need a chatbot quickly and do not have dedicated AI engineering resources, Botpress gets results faster than any other approach.

The weakness is the ceiling on customization. Complex dialogue management logic, custom NLU components, non-standard integrations, and advanced memory patterns all require working within Botpress's extension mechanisms (hooks and custom actions), which are more constrained than building from scratch. Performance at scale can also be a concern: managed hosting adds latency that you cannot optimize away, and pricing scales with message volume, which can become expensive for high-traffic applications. The Knowledge Base feature provides basic RAG but does not offer the cognitive scoring, entity graph traversal, or memory lifecycle management that production applications need.

Memory support in Botpress includes user variables that persist across sessions and a conversation history store. These cover basic personalization (remembering the user's name, plan, preferences) but do not provide semantic memory recall, entity-based retrieval, or memory that evolves and improves over time. Adding advanced memory requires integrating an external service through Botpress's webhook or API call nodes. The integration is straightforward but means the advanced memory logic lives outside the visual builder, which fragments the conversation design experience.
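The webhook-style integration described above boils down to two pieces of glue: a recall request sent to the external service before the prompt is built, and a merge step that injects the results. The endpoint payload fields and the prompt format below are assumptions for illustration; consult the memory service's actual API.

```python
# Sketch of the glue code an API-call node would run: build the recall
# request payload, then merge recalled memories into the prompt the flow
# passes to its LLM node. Field names are illustrative.
import json

def build_recall_request(user_id: str, message: str, top_k: int = 5) -> str:
    payload = {"user_id": user_id, "query": message, "top_k": top_k}
    return json.dumps(payload)

def merge_into_prompt(base_prompt: str, memories: list[str]) -> str:
    """Inject recalled memories as a context block the flow can reference."""
    if not memories:
        return base_prompt
    block = "\n".join(f"- {m}" for m in memories)
    return f"{base_prompt}\n\nKnown about this user:\n{block}"

req = build_recall_request("user-42", "cancel my subscription")
prompt = merge_into_prompt("You are a support assistant.",
                           ["on the enterprise plan"])
```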

Custom-Built: Direct LLM API

Custom-built chatbots use the LLM provider's API directly (Anthropic's Claude API, OpenAI's API, Google's Gemini API) and implement all conversation logic in application code. There is no framework, no visual builder, and no abstraction layer between your code and the model. You write the system prompt, assemble the context, make the API call, process the response, and handle all state management, error handling, and feature integration yourself.

The architecture is whatever you design it to be. A minimal implementation is a single API endpoint that receives user messages, appends them to an in-memory conversation history, calls the LLM API, and returns the response. A production implementation includes: a context assembly layer that gathers system prompt, conversation history, recalled memories, retrieved documents, and tool definitions; a generation layer that handles model selection, streaming, retries, and content filtering; a state management layer that persists conversation state across requests; a memory layer that extracts and stores knowledge from conversations; and a tool execution layer that handles function calls requested by the model.
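The context-assembly layer from the list above can be sketched as a pure function that gathers the pieces into the messages list an LLM API call expects. The `{"role", "content"}` shape follows the common chat-completion format; the trimming policy and section labels are illustrative choices, not requirements.

```python
# Sketch of a context-assembly layer: combine system prompt, recalled
# memories, and retrieved documents into a system message, then append
# recent conversation history. Pure function, so it is easy to test.

def assemble_context(system_prompt: str,
                     history: list[dict],
                     memories: list[str],
                     documents: list[str],
                     max_history: int = 20) -> list[dict]:
    context_parts = [system_prompt]
    if memories:
        context_parts.append("Relevant memories:\n" +
                             "\n".join(f"- {m}" for m in memories))
    if documents:
        context_parts.append("Retrieved documents:\n" +
                             "\n".join(documents))
    messages = [{"role": "system", "content": "\n\n".join(context_parts)}]
    # Keep only the most recent turns so the context stays within budget.
    messages.extend(history[-max_history:])
    return messages

msgs = assemble_context(
    "You are a support assistant.",
    history=[{"role": "user", "content": "Where is my order?"}],
    memories=["user prefers email updates"],
    documents=[],
)
```

Keeping this layer pure (no API calls, no I/O) is a deliberate design choice: it makes the highest-leverage part of the system, what actually lands in the context window, unit-testable and easy to diff between versions.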

The strengths of custom-built are total control, maximum performance, and no vendor lock-in beyond the LLM provider. Every token in the context is intentionally placed, every decision is explicit in code, and there is no framework overhead adding latency or consuming context space. You can switch LLM providers, change models, modify context assembly, add custom features, and optimize performance without framework constraints. For teams building a core product on conversational AI (where the chatbot is the product, not a feature), custom-built provides the foundation for differentiation.

The weakness is development cost. Everything that frameworks provide out of the box (channel integrations, analytics, user management, conversation logging, content safety, A/B testing) must be built or integrated manually. A production-quality custom chatbot requires 2 to 6 months of engineering effort from an experienced team, compared to days or weeks with a framework. The ongoing maintenance burden is also higher because every component is your responsibility: security patches, API migration when providers change their interfaces, scaling infrastructure, and debugging production issues without framework-provided tooling.

Memory support in custom-built systems is entirely up to you. This is both the biggest advantage and the biggest challenge. You can integrate any memory system with any recall strategy, implement any extraction pipeline, and design any memory lifecycle. But you have to build it all yourself, or integrate a purpose-built memory service that provides the memory layer as a turnkey solution. Adaptive Recall is designed for exactly this use case: it provides the entire memory layer (storage, recall with cognitive scoring, entity graph, lifecycle management) through a standard API or MCP interface, so custom-built chatbots can add production-grade memory without implementing it from scratch.
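When a memory service owns the memory layer, the chatbot's per-turn logic reduces to two calls: recall before generation and store after. The in-process client below is a stand-in for an HTTP client; its method names, the `score` field, and the keyword scoring are assumptions for illustration, not Adaptive Recall's documented API.

```python
# Sketch of the recall-then-store pattern per conversation turn. The
# FakeMemoryClient stands in for a real memory-service client; all names
# and the scoring logic are illustrative.

class FakeMemoryClient:
    """In-process stand-in for an HTTP memory-service client."""
    def __init__(self):
        self._items = []

    def store(self, user_id: str, text: str) -> None:
        self._items.append({"user_id": user_id, "text": text})

    def recall(self, user_id: str, query: str) -> list[dict]:
        terms = set(query.lower().split())
        hits = []
        for item in self._items:
            if item["user_id"] != user_id:
                continue
            score = len(terms & set(item["text"].lower().split()))
            if score:
                hits.append({"text": item["text"], "score": score})
        return sorted(hits, key=lambda h: h["score"], reverse=True)

def handle_turn(client, user_id: str, message: str) -> list[str]:
    recalled = [h["text"] for h in client.recall(user_id, message)]
    # ... LLM generation with the recalled context would happen here ...
    client.store(user_id, message)  # persist the turn for later recall
    return recalled

client = FakeMemoryClient()
handle_turn(client, "u1", "I moved to Berlin last month")
context = handle_turn(client, "u1", "what city did I say I moved to")
```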

Decision Framework

Choose Rasa if: you need self-hosted infrastructure for data sovereignty or compliance, you have ML engineering resources for NLU model training and maintenance, you need fine-grained control over every component of the NLU pipeline, and you are willing to invest 2 to 4 weeks in initial setup for long-term flexibility.

Choose Botpress if: you need a chatbot deployed quickly (days, not months), your team includes non-technical members who will manage conversations, you need pre-built channel integrations with minimal configuration, and your conversation complexity fits within the visual builder's capabilities.

Choose custom-built if: the chatbot is your core product and differentiation matters, you need maximum performance and control over every aspect of the conversation, you have experienced AI engineering resources, and you need to integrate deeply with existing systems in ways that frameworks do not support.

In all three cases, add a dedicated memory service for persistent, cross-session memory with cognitive recall. None of the three approaches provide production-grade memory out of the box, and memory is the single highest-impact capability you can add to any conversational AI system regardless of its underlying architecture.

Add production-grade memory to any framework. Adaptive Recall integrates with Rasa, Botpress, custom builds, and any LLM-based chatbot through MCP or REST API.