Traditional Chatbot vs LLM: Differences That Actually Matter
How a chatbot with intent classification works, how one based on LLM works, and when to choose each approach with concrete examples.
Contributors: Esther Aznar
To follow this post, you need basic Python knowledge. The code examples are short and commented, but it helps to recognize the syntax.
If you’ve ever typed “I want to speak to a person” into a website’s support chat and the bot replied “I didn’t understand. Choose an option: 1. Orders 2. Returns”, you already know firsthand what a traditional chatbot is. That rigidity isn’t a design flaw. It’s exactly how they work.
How a Traditional Chatbot Works
A traditional chatbot works like a book with fixed responses. First, you define all the topics (called intents) that the chatbot can understand. For example:
- Topic 1: the user wants to check their balance.
- Topic 2: the user wants to block their card.
When the user types something, the chatbot searches for which of these topics it most closely matches. If it recognizes it and is confident it’s that topic, it executes the pre-prepared response. If it doesn’t understand, it says “I didn’t understand”.
```python
INTENTS = {
    "check_balance": {
        # Phrases that trigger this intent
        "examples": ["check my balance", "how much money do I have"],
        "response": "Your current balance is €340."
    },
    "block_card": {
        "examples": ["block my card", "my card was stolen"],
        "response": "Call 900 123 456 to block your card."
    }
}

def process_message(message):
    for intent_name, data in INTENTS.items():
        for phrase in data["examples"]:
            if phrase in message.lower():  # Search for exact match
                return data["response"]
    return "I didn't understand. Choose: 1. Balance 2. Card"
```
Beyond recognizing topics, the chatbot also extracts specific data from the message. If you say “block my Visa card”, the topic is “block card” and the specific data is “Visa”. The system then knows which of your cards you want to block.
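This extraction step can be sketched with a simple keyword search. The brand list and function name here are hypothetical, just to illustrate the idea; production systems usually rely on a trained entity extractor instead:

```python
import re

# Toy entity extraction: after matching the "block card" intent,
# pull the card brand out of the message. The brand list is a
# hypothetical example, not a real product catalog.
KNOWN_BRANDS = ["visa", "mastercard", "amex"]

def extract_card_brand(message):
    for brand in KNOWN_BRANDS:
        # \b ensures we match whole words only
        if re.search(rf"\b{brand}\b", message.lower()):
            return brand
    return None

print(extract_card_brand("block my Visa card"))  # "visa"
print(extract_card_brand("block my card"))       # None: no brand mentioned
```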
The big problem with this type of chatbot is that you have to teach it all the possible ways users might ask for the same thing. If someone says “my card doesn’t work” but you only taught it phrases like “block card” or “cancel card”, the chatbot won’t understand. You have to think of all possible ways to ask for each thing, or let an automated tool learn from your examples.
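One way such a tool relaxes the exact-match requirement is by scoring intents on word overlap instead of requiring a full phrase. This is only a toy sketch of that idea (real NLU libraries use trained models, and the threshold here is an arbitrary assumption):

```python
# Toy similarity matcher: scores each intent by the fraction of an
# example phrase's words that appear in the message. A real NLU tool
# would use trained models; this only sketches the concept.
INTENTS = {
    "check_balance": ["check my balance", "how much money do I have"],
    "block_card": ["block my card", "cancel my card", "my card was stolen"],
}

def best_intent(message, threshold=0.5):
    words = set(message.lower().split())
    best, best_score = None, 0.0
    for intent, examples in INTENTS.items():
        for phrase in examples:
            phrase_words = set(phrase.split())
            score = len(words & phrase_words) / len(phrase_words)
            if score > best_score:
                best, best_score = intent, score
    # Below the threshold we prefer "I didn't understand" to a wrong guess
    return best if best_score >= threshold else None

print(best_intent("could you block my card please"))   # "block_card"
print(best_intent("I was charged something strange"))  # None
```

Even with this relaxation, "my card doesn't work" still fails unless some example phrase shares enough words with it, which is exactly the coverage problem described above.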
When does this approach work well? When the user writes exactly what the system expects. These two questions it handles without problems:
- "What are your store hours?" → `hours` intent, fixed response.
- "I want to place an order" → `new_order` intent, opens the checkout flow.
But if the user writes “I was charged something strange this week and I’m not sure if it’s from the subscription or something else”, the chatbot has no intent for that. It can’t reason about ambiguity. It returns “I didn’t understand” and the conversation ends there.
How a Chatbot with LLM Works
An LLM (Large Language Model) is an artificial intelligence that has read millions of texts. It doesn’t search for responses in a predefined list; instead, it understands the meaning of what you write and generates a new response.
The difference is like comparing an employee who only knows 20 memorized answers with another who has studied thousands of documents on the subject and can reason and answer any question you ask them in a different way each time.
To create a chatbot with an LLM, the first thing you do is give it written instructions. You tell it what role it plays (for example, "you are the assistant of a bank"), how it should speak, and what it must not do (for example, "don't make up customer data"). The model reads these instructions before answering any user question.
```python
import anthropic

client = anthropic.Anthropic()  # Requires ANTHROPIC_API_KEY as environment variable

# The system prompt defines the role and boundaries of the chatbot
SYSTEM_PROMPT = """You are the assistant of Example Bank.
Never make up balances or customer data."""

def respond(question):
    # We send the question to the model along with the instructions
    response = client.messages.create(
        model="claude-opus-4-6",  # For testing, claude-haiku-4-5-20251001 is cheaper
        max_tokens=500,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": question}]
    )
    return response.content[0].text  # The text generated by the model
```
If the chatbot handles real user data, review legal requirements before sending it to an external provider.
That chatbot responds to “I was charged something strange this week and I’m not sure if it’s from the subscription or something else” without any problem. There’s no intent configured for it. The model understands the question and generates a useful response.
The model also keeps track of what happened earlier in the conversation. If the user said "I have two cards" and then asks "which one should I cancel?", the model takes both cards into account. This works because each request includes the conversation history as context, although that "memory" has limits: the model can't handle arbitrarily long conversations.
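A minimal sketch of how that memory is carried: the client accumulates every turn and re-sends the whole list on each call. The `history` structure below matches the `messages` format used in the earlier example:

```python
# Sketch of conversation memory: the model itself is stateless, so the
# client keeps the history and re-sends it with every request.
history = []

def add_turn(role, text):
    history.append({"role": role, "content": text})

add_turn("user", "I have two cards, a Visa and a Mastercard.")
add_turn("assistant", "Noted. How can I help you with them?")
add_turn("user", "Which one should I cancel?")

# `history` is what you would pass as `messages` to client.messages.create(...)
print(len(history))  # 3 turns accumulated so far
```

Because the full history travels with each request, longer conversations cost more and eventually hit the model's context limit.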
The main risk of an LLM is hallucination: the model can generate a response that sounds correct but is made up. If you ask it "how much balance do I have?", an LLM without access to real data can fabricate a plausible number. That's why the system prompt in the example includes "never make up balances or customer data". It doesn't eliminate the risk entirely, but it reduces it.
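A stronger mitigation is to put the real data into the prompt so the model has nothing to invent. This is a hedged sketch: `get_balance` and the customer IDs are hypothetical stand-ins for a real database lookup:

```python
# Sketch of grounding: fetch the verified figure first, then tell the
# model exactly what it may state. get_balance is a hypothetical stand-in
# for a real database query.
def get_balance(customer_id):
    balances = {"c-001": 340.0}  # stand-in for a real lookup
    return balances[customer_id]

def build_system_prompt(customer_id):
    balance = get_balance(customer_id)
    return (
        "You are the assistant of Example Bank.\n"
        f"The customer's verified balance is €{balance:.2f}. "
        "Only state this figure; never invent other data."
    )

print(build_system_prompt("c-001"))
```

With the verified number in the prompt, "how much balance do I have?" gets answered from real data instead of the model's guesswork.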
When to Use Each One
Traditional chatbot: Use it when questions always follow the same pattern. For example, a support menu with fixed options: “What do you need? 1. Check balance 2. Block card 3. Hours”.
Chatbot with LLM: Use it when users can ask in unexpected ways. For example, complex questions or those that mix multiple topics.
| Aspect | Traditional Chatbot | LLM Chatbot |
|---|---|---|
| Initial setup | High: you define each intent manually | Low: you write the system prompt |
| Cost per conversation | Very low | Depends on model and length |
| Questions off the script | Can’t respond | Responds with flexibility |
| Predictable responses | Yes, always the same fixed answer | No, they vary by context |
| Risk of making up data | No (hardcoded responses) | Yes, needs to be controlled |
| Maintenance | Add and edit intents | Adjust system prompt and test |
Choose traditional chatbot if:
- Responses are always the same (order status, hours, FAQs).
- The user follows a menu or predefined steps.
- You want to save costs and maximize predictability.
Choose chatbot with LLM if:
- Users ask varied or complex questions.
- A problem can be described in many different ways.
- You need flexibility in responses.
Examples:
- Simple question: "What time do you close?" → traditional chatbot. It has a fixed answer.
- Complex question: "I bought a product that doesn't work well. Can I return it if I only have half the packaging?" → LLM. It's a specific situation that needs analysis.
Hybrid pattern: You can also combine the two. The chatbot tries to respond with intents. If it can’t, it passes the question to the LLM. This is common in production because you save costs: the LLM only steps in when you need it.
The downside is that you have to maintain two systems in parallel and synchronize when to switch from one to the other.
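The fallback logic itself is short. In this sketch, `ask_llm` is a hypothetical wrapper around a real LLM call (like the `respond` function shown earlier), injected as a parameter so the routing can be tested without network access:

```python
# Hybrid sketch: try the cheap intent matcher first, fall back to the LLM.
# ask_llm is a hypothetical wrapper around a real LLM client (not shown).
INTENTS = {
    "hours": (["what are your store hours", "when do you close"],
              "We are open 9:00-20:00, Monday to Saturday."),
}

def match_intent(message):
    text = message.lower()
    for name, (examples, answer) in INTENTS.items():
        if any(phrase in text for phrase in examples):
            return answer
    return None

def handle_message(message, ask_llm):
    answer = match_intent(message)  # cheap, predictable path
    if answer is not None:
        return answer
    return ask_llm(message)         # flexible, paid path

fake_llm = lambda q: f"[LLM answer to: {q}]"
print(handle_message("when do you close?", fake_llm))   # fixed response
print(handle_message("I was charged twice", fake_llm))  # falls back to LLM
```

Injecting the LLM call as a parameter also makes the switch point explicit, which helps with the synchronization problem mentioned above.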
If you want to dive deeper into how to write effective instructions for the LLM, the post on prompt engineering for developers covers the patterns most used in real projects.
Frequently Asked Questions
Is an LLM chatbot always better than a traditional one?
No. For closed and repeatable flows, the traditional chatbot is more controllable and cheaper to operate. Adding an LLM where it’s not needed just adds cost and complexity.
Can the LLM make up responses?
Yes, and it’s the main risk. Hallucinations occur when the model generates text that sounds plausible but doesn’t correspond to reality. The way to mitigate it is to be explicit in the system prompt about what data the model can and can’t provide. If you need real accuracy, you have to connect the chatbot to external data sources instead of relying on what the model remembers from its training.
Can I use both approaches together?
Yes. The most common pattern is to use an intent classifier for frequent questions and the LLM as a fallback for everything else. You get speed and low cost for what’s predictable, and flexibility for what’s not.
What if the user asks questions in multiple languages?
The traditional chatbot only works well in the languages for which you’ve defined example phrases in each intent. The LLM directly understands multiple languages without additional configuration, though it’s good to indicate in the system prompt what language the model should respond in.