OpenAI Customer Service: Klarna's $60M Savings and What They Learned the Hard Way

Klarna's OpenAI-powered assistant handled 2.3M chats a month and saved $60M. Then quality suffered. Their human-AI hybrid pivot is the honest blueprint.

TL;DR

  • Klarna’s AI handled 2.3M monthly chats (66% of support), saving $60M and matching human satisfaction scores
  • Resolution time dropped from 11 minutes to 2 minutes; repeat inquiries decreased 25%
  • 2025 reality check: quality suffered, complex issues escalated, trust eroded
  • Best for: High-volume support operations—but with honest expectations about limits
  • Key lesson: AI handles speed; humans handle judgment. You need both.

Klarna’s AI assistant did the work of 853 human agents and saved $60 million—then they brought humans back. Their evolution from AI-only to hybrid offers the honest blueprint the industry needs.

February 2024. Klarna announced numbers that made headlines worldwide.

Their new AI assistant, built with OpenAI, had handled 2.3 million customer conversations in its first month. That was two-thirds of all customer service chats. Across 23 markets. In more than 35 languages.

“The equivalent work of 700 full-time agents,” the company claimed.

Resolution times plummeted from 11 minutes to under 2 minutes. Customer satisfaction scores matched human agents. Repeat inquiries dropped 25%. Klarna projected $40 million in profit improvement.

The tech press erupted. This was the future. AI was replacing customer service. The transformation was here.

Then came 2025. And the rest of the story.

How OpenAI Powered the Initial Triumph

The numbers were real. The AI worked.

Klarna’s system handled routine customer inquiries with remarkable competence:

  • “Where’s my payment?”
  • “Can I change my due date?”
  • “How do I return this item?”
  • “What’s my current balance?”

These questions have clear answers. The AI found them instantly, 24 hours a day, in any language the customer spoke.

For high-volume, straightforward inquiries, the system delivered genuine value:

  • 2.3 million conversations handled monthly
  • 66% of all chats automated
  • 2-minute resolution (down from 11 minutes)
  • 25% fewer repeat inquiries
  • $60 million saved (updated from initial $40M projection)
  • 853 agent-equivalents of work (up from 700)

The AI wasn’t hallucinating answers. It was connected to Klarna’s systems—billing, transactions, refunds, payment schedules. It could authenticate users, look up real data, and take real actions.

The Quality Problem

Then the complaints started surfacing.

Not about the routine questions. Those worked fine. The problem was everything else.

“I was incorrectly charged and the AI keeps giving me the same scripted response.”

“I can’t access my account and the bot won’t escalate to a human.”

“My refund has been delayed three weeks and I can’t reach anyone.”

CEO Sebastian Siemiatkowski eventually acknowledged the truth: “Cost was a predominant evaluation factor” in organizing support, and it resulted in “lower quality.”

The AI matched human satisfaction scores—on the questions it could handle. But when customers had genuine problems requiring judgment, empathy, or exception-making, they hit walls.

What AI Couldn’t Do

The pattern became clear. AI struggled with:

Disputes. “The merchant says I received the item. I didn’t. Who’s right?” This requires investigation, judgment, and sometimes choosing to believe the customer. AI can’t make that call.

Account access issues. Identity verification edge cases. Locked accounts. Suspicious activity flags. These need human judgment about risk and trust.

Refund delays. When something goes wrong in the payment chain, customers need someone who can actually fix it—not explain why the system shows it’s “processing.”

Emotional situations. Financial stress is real. Customers facing unexpected charges, overdrafts, or collection actions need empathy. AI satisfaction scores don’t measure whether someone felt heard.

The 2025 Pivot

Klarna didn’t abandon AI. They evolved.

The company began rehiring human agents. They brought support operations in-house, replacing outsourced models that had created additional distance from customers.

The new approach:

AI for routine inquiries. Payment management, order tracking, basic questions—still automated. Still 24/7. Still in 35+ languages.

Humans for complexity. Disputes, sensitive financial issues, anything requiring judgment. Real people who can make exceptions and build trust.

Instant handoff triggers. Customers can request a human representative immediately. No AI gauntlet to run through first.

“Customers should always have the option to speak with a human,” Siemiatkowski declared in May 2025.

The Honest Assessment

Klarna’s journey offers the clearest picture of AI customer service reality:

What AI does well:

  • High-volume routine inquiries at scale
  • 24/7 availability across languages
  • Consistent, instant responses to common questions
  • Significant cost reduction for straightforward support

What AI does poorly:

  • Judgment calls requiring human discretion
  • Emotional support during stressful situations
  • Exception handling outside normal parameters
  • Building trust when things go wrong

The $60 million savings were real. The efficiency gains were real. But optimizing for cost created quality problems that eroded customer trust, which is particularly dangerous in financial services, where trust is the product.

The Hybrid Model

Klarna’s current approach represents the emerging consensus:

Tier 1 (AI): Handle everything with a clear, data-backed answer. Authenticate users. Look up information. Process standard requests. Resolve in 2 minutes.

Tier 2 (AI-assisted human): Complex queries where AI gathers information and humans decide. The AI does the research; the human applies judgment.

Tier 3 (Human-only): Disputes, complaints, escalations, sensitive situations. Real people with authority to resolve problems.

The escape hatch: Any customer can request a human at any point. No friction. No persuasion to stay with the bot.

This isn’t retreat from AI—it’s mature deployment. Using each capability where it actually works.
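The tiers and the escape hatch described above can be sketched as a small routing function. This is an illustration only, not Klarna's implementation: the names `Tier`, `Inquiry`, and `route`, the category labels, and the `max_ai_attempts` threshold are all hypothetical, and which categories count as routine or sensitive is an assumption drawn from the examples in this article.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AI = "tier1_ai"
    AI_ASSISTED_HUMAN = "tier2_ai_assisted_human"
    HUMAN_ONLY = "tier3_human_only"


# Hypothetical category labels that mirror the article's examples.
ROUTINE = {"payment_status", "due_date_change", "return_instructions", "balance"}
SENSITIVE = {"dispute", "complaint", "account_access", "refund_delay", "collections"}


@dataclass
class Inquiry:
    category: str
    wants_human: bool = False  # the customer asked for a person
    ai_attempts: int = 0       # AI replies so far that did not resolve the issue


def route(inquiry: Inquiry, max_ai_attempts: int = 2) -> Tier:
    """Pick a support tier. The escape hatch is checked before anything else."""
    if inquiry.wants_human:
        return Tier.HUMAN_ONLY
    if inquiry.category in SENSITIVE:
        return Tier.HUMAN_ONLY
    if inquiry.category in ROUTINE and inquiry.ai_attempts < max_ai_attempts:
        return Tier.AI
    # Unknown categories, and routine questions the AI keeps failing on,
    # go to a human while the AI gathers the context.
    return Tier.AI_ASSISTED_HUMAN


print(route(Inquiry("balance")))
print(route(Inquiry("balance", wants_human=True)))
print(route(Inquiry("balance", ai_attempts=2)))
```

Checking `wants_human` first is the design choice that matters here: a request for a person overrides every other rule, so the customer never has to get past the bot to reach one.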

The Labor Reality

The uncomfortable truth: Klarna did reduce headcount.

The company’s workforce shrank through attrition and hiring freezes. AI handled work that humans used to do. The “700 agents” comparison wasn’t abstract—it reflected real jobs that no longer existed.

But the 2025 rehiring showed the limits. You can’t automate your way out of quality problems. The remaining humans were stretched too thin. Backlogs grew. Complaints escalated.

“Understaffing created backlogs that undermined the hybrid model’s potential,” analysis showed.

AI enables smaller teams to handle larger volumes—but “smaller” isn’t “zero.” The humans you keep become more important, not less.

The Pattern for Others

Klarna’s experience offers a blueprint:

Start with volume analysis. What percentage of inquiries have clear, data-backed answers? Those are AI candidates. The rest need humans.

Don’t measure only efficiency. Satisfaction scores on easy questions mask failure on hard ones. Track escalation rates, complaint trends, and social media sentiment.

Build the escape hatch from day one. Customers must be able to reach humans without friction. An AI that can answer the current question has not proven it can solve this customer’s real problem.

Financial services need extra care. Money is emotional. Trust is fragile. The cost of losing a customer’s confidence exceeds the cost of a human conversation.

Plan for hybrid, not replacement. AI will handle more over time. But “more” isn’t “all.” The question is always where to draw the line, not whether to have one.
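The first two points, volume analysis and measuring beyond efficiency, come down to simple arithmetic on tagged chat records. The sketch below uses made-up sample data, and the `Chat` record, the `ROUTINE` category set, and the `report` function are hypothetical names. It shows how a respectable overall satisfaction score can hide total failure on the hard cases.

```python
from dataclasses import dataclass


@dataclass
class Chat:
    category: str
    satisfied: bool
    escalated: bool


# Hypothetical set of categories with clear, data-backed answers.
ROUTINE = {"payment_status", "balance", "return_instructions"}


def rate(flags):
    """Share of True values; 0.0 for an empty input."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def report(chats, routine=ROUTINE):
    """Split chats into routine and complex, then score each segment."""
    easy = [c for c in chats if c.category in routine]
    hard = [c for c in chats if c.category not in routine]
    return {
        "automatable_share": len(easy) / len(chats) if chats else 0.0,
        "csat_overall": rate(c.satisfied for c in chats),
        "csat_routine": rate(c.satisfied for c in easy),
        "csat_complex": rate(c.satisfied for c in hard),
        "escalation_rate": rate(c.escalated for c in chats),
    }


# Invented sample: 8 routine chats (7 satisfied), 2 disputes (0 satisfied).
chats = (
    [Chat("balance", True, False)] * 7
    + [Chat("payment_status", False, False)]
    + [Chat("dispute", False, True)] * 2
)
summary = report(chats)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this invented sample the overall score is 0.7 while the complex segment scores 0.0, which is why the segments need to be reported separately rather than blended.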

The Current State

Klarna operates one of the world’s largest AI customer service deployments. Millions of conversations. Dozens of markets. Real money saved.

They also employ human agents who handle what AI cannot. The ratio has shifted—fewer humans, more AI—but both remain.

The story isn’t “AI replaced customer service.” It’s “AI transformed customer service into something that requires less human work, but of a different kind.”

The 2-minute resolution times are real. So is the option to talk to a person. The $60 million savings are real. So is the investment in human agents who handle the hard stuff.

That’s the honest picture. That’s the blueprint.

FAQ

How much did Klarna's AI customer service save?

Klarna reported $60 million in savings, with their AI agent doing the equivalent work of 853 full-time human agents. Resolution times dropped from 11 minutes to under 2 minutes.

Why did Klarna bring back human agents after AI success?

CEO Sebastian Siemiatkowski admitted “cost was a predominant evaluation factor” that resulted in “lower quality.” Complex disputes, account access issues, and refund delays required human judgment that AI couldn't provide.

What does Klarna's AI handle vs. humans?

AI handles routine inquiries (payment management, order tracking, basic questions) around the clock in 35+ languages. Humans manage disputes, sensitive financial issues, and any case where customers request a real person.

What's Klarna's current AI customer service approach?

A hybrid model: AI for speed and scale on routine tasks, humans for judgment and empathy on complex issues. Customers can always request to speak with a human immediately.

What lessons did Klarna learn about AI customer service?

AI excels at speed but not judgment. Cost-cutting in financial services erodes trust. Money-sensitive services demand both efficiency and human support. Understaffing creates backlogs that undermine any model.