TL;DR
- Klarna’s AI handled 2.3M monthly chats (66% of support), saving $60M and matching human satisfaction scores
- Resolution time dropped from 11 minutes to 2 minutes; repeat inquiries decreased 25%
- 2025 reality check: quality suffered, complex issues escalated, trust eroded
- Best for: High-volume support operations—but with honest expectations about limits
- Key lesson: AI handles speed; humans handle judgment. You need both.
Klarna’s AI assistant did the work of 853 human agents and saved $60 million—then they brought humans back. Their evolution from AI-only to hybrid offers the honest blueprint the industry needs.
February 2024. Klarna announced numbers that made headlines worldwide.
Their new AI assistant, built with OpenAI, had handled 2.3 million customer conversations in its first month. That was two-thirds of all customer service chats. Across 23 markets. In 35 languages.
“The equivalent work of 700 full-time agents,” the company claimed.
Resolution times plummeted from 11 minutes to under 2 minutes. Customer satisfaction scores matched human agents. Repeat inquiries dropped 25%. Klarna projected $40 million in profit improvement.
The tech press erupted. This was the future. AI was replacing customer service. The transformation was here.
Then came 2025. And the rest of the story.
How OpenAI Powered the Initial Triumph
The numbers were real. The AI worked.
Klarna’s system handled routine customer inquiries with remarkable competence:
- “Where’s my payment?”
- “Can I change my due date?”
- “How do I return this item?”
- “What’s my current balance?”
These questions have clear answers. The AI found them instantly, 24 hours a day, in any language the customer spoke.
For high-volume, straightforward inquiries, the system delivered genuine value:
- 2.3 million conversations handled monthly
- 66% of all chats automated
- 2-minute resolution (down from 11 minutes)
- 25% fewer repeat inquiries
- $60 million saved (updated from initial $40M projection)
- 853 agent-equivalents of work (up from 700)
The AI wasn’t hallucinating answers. It was connected to Klarna’s systems—billing, transactions, refunds, payment schedules. It could authenticate users, look up real data, and take real actions.
The Quality Problem
Then the complaints started surfacing.
Not about the routine questions. Those worked fine. The problem was everything else.
“I was incorrectly charged and the AI keeps giving me the same scripted response.”
“I can’t access my account and the bot won’t escalate to a human.”
“My refund has been delayed three weeks and I can’t reach anyone.”
CEO Sebastian Siemiatkowski eventually acknowledged the truth: “Cost was a predominant evaluation factor” in organizing support, and it resulted in “lower quality.”
The AI matched human satisfaction scores—on the questions it could handle. But when customers had genuine problems requiring judgment, empathy, or exception-making, they hit walls.
What AI Couldn’t Do
The pattern became clear. AI struggled with:
Disputes. “The merchant says I received the item. I didn’t. Who’s right?” This requires investigation, judgment, and sometimes choosing to believe the customer. AI can’t make that call.
Account access issues. Identity verification edge cases. Locked accounts. Suspicious activity flags. These need human judgment about risk and trust.
Refund delays. When something goes wrong in the payment chain, customers need someone who can actually fix it—not explain why the system shows it’s “processing.”
Emotional situations. Financial stress is real. Customers facing unexpected charges, overdrafts, or collection actions need empathy. AI satisfaction scores don’t measure whether someone felt heard.
The 2025 Pivot
Klarna didn’t abandon AI. They evolved.
The company began rehiring human agents. They brought support operations in-house, replacing outsourced models that had created additional distance from customers.
The new approach:
AI for routine inquiries. Payment management, order tracking, basic questions: still automated. Still 24/7. Still in 35 languages.
Humans for complexity. Disputes, sensitive financial issues, anything requiring judgment. Real people who can make exceptions and build trust.
Instant handoff triggers. Customers can request a human representative immediately. No AI gauntlet to run through first.
“Customers should always have the option to speak with a human,” Siemiatkowski declared in May 2025.
The Honest Assessment
Klarna’s journey offers the clearest picture of AI customer service reality:
What AI does well:
- High-volume routine inquiries at scale
- 24/7 availability across languages
- Consistent, instant responses to common questions
- Significant cost reduction for straightforward support
What AI does poorly:
- Judgment calls requiring human discretion
- Emotional support during stressful situations
- Exception handling outside normal parameters
- Building trust when things go wrong
The $60 million savings was real. The efficiency gains were real. But optimizing for cost created quality problems that eroded customer trust, which is particularly dangerous in financial services, where trust is the product.
The Hybrid Model
Klarna’s current approach represents the emerging consensus:
Tier 1 (AI): Handle everything with a clear, data-backed answer. Authenticate users. Look up information. Process standard requests. Resolve in 2 minutes.
Tier 2 (AI-assisted human): Complex queries where AI gathers information and humans decide. The AI does the research; the human applies judgment.
Tier 3 (Human-only): Disputes, complaints, escalations, sensitive situations. Real people with authority to resolve problems.
The escape hatch: Any customer can request a human at any point. No friction. No persuasion to stay with the bot.
This isn’t retreat from AI—it’s mature deployment. Using each capability where it actually works.
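To make the tiers concrete, here is a minimal routing sketch in Python. It is an illustration under stated assumptions, not Klarna's implementation: the category labels, the `Inquiry` type, and the `route` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AI = "tier 1: AI"
    AI_ASSISTED_HUMAN = "tier 2: AI-assisted human"
    HUMAN_ONLY = "tier 3: human only"


# Hypothetical category labels, chosen for illustration only.
ROUTINE = {"payment_status", "due_date_change", "return_instructions", "balance"}
SENSITIVE = {"dispute", "complaint", "account_access", "refund_delay"}


@dataclass
class Inquiry:
    category: str
    wants_human: bool = False  # True when the customer asked for a person


def route(inquiry: Inquiry) -> Tier:
    """Pick a support tier for one inquiry (sketch, not a production policy)."""
    # Escape hatch first: a request for a human always wins.
    if inquiry.wants_human:
        return Tier.HUMAN_ONLY
    # Disputes, complaints, and other sensitive cases go straight to people.
    if inquiry.category in SENSITIVE:
        return Tier.HUMAN_ONLY
    # Questions with a clear, data-backed answer stay with the AI.
    if inquiry.category in ROUTINE:
        return Tier.AI
    # Anything unrecognized: the AI gathers context, a human decides.
    return Tier.AI_ASSISTED_HUMAN
```

The ordering is the design choice that matters. The check for `wants_human` runs before anything else, so even a routine balance question goes to a person when the customer asks for one.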
The Labor Reality
The uncomfortable truth: Klarna did reduce headcount.
The company’s workforce shrank through attrition and hiring freezes. AI handled work that humans used to do. The “700 agents” comparison wasn’t abstract—it reflected real jobs that no longer existed.
But the 2025 rehiring showed the limits. You can’t automate your way out of quality problems. The remaining humans were stretched too thin. Backlogs grew. Complaints escalated.
“Understaffing created backlogs that undermined the hybrid model’s potential,” analysis showed.
AI enables smaller teams to handle larger volumes—but “smaller” isn’t “zero.” The humans you keep become more important, not less.
The Pattern for Others
Klarna’s experience offers a blueprint:
Start with volume analysis. What percentage of inquiries have clear, data-backed answers? Those are AI candidates. The rest need humans.
Don’t measure only efficiency. Satisfaction scores on easy questions mask failure on hard ones. Track escalation rates, complaint trends, and social media sentiment.
Build the escape hatch from day one. Customers must be able to reach humans without friction. An AI that can answer the question as asked has not shown that it can solve the customer’s underlying problem.
Financial services need extra care. Money is emotional. Trust is fragile. The cost of losing a customer’s confidence exceeds the cost of a human conversation.
Plan for hybrid, not replacement. AI will handle more over time. But “more” isn’t “all.” The question is always where to draw the line, not whether to have one.
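The first two steps, volume analysis and segmented measurement, can be sketched in a few lines of Python. Everything here is an assumption for illustration: the ticket rows are made-up sample data, and the field layout and function names are hypothetical rather than taken from any real support platform.

```python
from collections import defaultdict

# Hypothetical ticket log: (category, has_clear_answer, escalated, csat 1-5)
tickets = [
    ("balance", True, False, 5),
    ("payment_status", True, False, 4),
    ("due_date_change", True, False, 5),
    ("dispute", False, True, 1),
    ("refund_delay", False, True, 2),
]


def ai_candidate_share(rows):
    """Fraction of inquiries that have a clear, data-backed answer."""
    return sum(1 for _, clear, _, _ in rows if clear) / len(rows)


def metrics_by_segment(rows):
    """Average CSAT and escalation rate, split into easy vs. hard inquiries."""
    groups = defaultdict(list)
    for _, clear, escalated, csat in rows:
        groups["easy" if clear else "hard"].append((escalated, csat))
    return {
        name: {
            "avg_csat": sum(c for _, c in items) / len(items),
            "escalation_rate": sum(1 for e, _ in items if e) / len(items),
        }
        for name, items in groups.items()
    }
```

On this made-up sample, the blended satisfaction score is 3.4 out of 5, which looks tolerable. Split by segment, the easy questions average about 4.7 and the hard ones 1.5, with every hard ticket escalated. That gap is the failure a single headline number hides.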
The Current State
Klarna operates one of the world’s largest AI customer service deployments. Millions of conversations. Dozens of markets. Real money saved.
They also employ human agents who handle what AI cannot. The ratio has shifted—fewer humans, more AI—but both remain.
The story isn’t “AI replaced customer service.” It’s “AI transformed customer service into something that requires less, but different, human work.”
The 2-minute resolution times are real. So is the option to talk to a person. The $60 million savings are real. So is the investment in human agents who handle the hard stuff.
That’s the honest picture. That’s the blueprint.