Engineering · November 14, 2025 · 13 min read

AIVA 2.0: the rebuild.

We rebuilt AIVA from the ground up in 2025 — faster, more languages, less infrastructure. Voice latency cut in half. Frankfurt and Virginia regions live. Here's what we changed and what it cost us.

Arjun Patel
Co-founder

In early 2025 we made a call that most startups never make voluntarily: we stopped shipping features for four months and rebuilt the entire product from scratch. This is the story of why we did it, what we changed, and what it cost us.

The original AIVA was built fast. We had a working voice agent in six weeks, customers in production in three months. The code reflected that. We'd glued together four third-party services — a transcription API, a language model, a TTS engine, and Twilio — with a thin orchestration layer that was held together mostly by optimism. It worked. Until it didn't.

The performance ceiling

By mid-2025 we had 80 customers and were hitting walls we'd built ourselves. Voice latency was stubbornly stuck at 380ms average. Adding a new language meant touching five different configuration files and praying nothing broke. Our infrastructure bill was growing faster than our revenue because we were paying for three services to do what one well-designed system could do. And our on-call rotation was a nightmare — when something went wrong at 2am it was never obvious which of four vendors was the culprit.

The performance ceiling was the most visible problem. 380ms sounds fast, but in voice it isn't. A human conversation flows at under 200ms of response lag. Anything above 250ms starts to feel like the call is dropping. We were losing customers not because AIVA gave wrong answers, but because it felt slow — and in voice, feeling slow and being slow are the same thing.

What we rebuilt

The new stack collapses everything into a single inference pipeline we own end-to-end. Speech recognition, language understanding, response generation, and synthesis all run in one unified process instead of four API calls chained together. The round-trip that used to cross four network boundaries now crosses zero.
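To make the effect of removing network boundaries concrete, here is a rough back-of-the-envelope model in Python. It is a sketch under stated assumptions, not AIVA's implementation: the `Stage` type, the `build_pipeline` helper, and every millisecond figure are hypothetical placeholders rather than measurements. Only the four stage names come from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    compute_ms: float   # time spent doing the actual work
    network_ms: float   # time spent crossing a network boundary (0 if in-process)


# Stage names taken from the post; everything else here is illustrative.
STAGE_NAMES = (
    "speech recognition",
    "language understanding",
    "response generation",
    "synthesis",
)


def build_pipeline(compute_ms: float, per_hop_ms: float) -> List[Stage]:
    """Build a pipeline where every stage pays the same compute and hop cost."""
    return [Stage(name, compute_ms, per_hop_ms) for name in STAGE_NAMES]


def round_trip_ms(stages: List[Stage]) -> float:
    """Total latency is the sum of compute and network time over all stages."""
    return sum(s.compute_ms + s.network_ms for s in stages)


# Hypothetical numbers: same compute per stage, with and without a network hop.
chained = build_pipeline(compute_ms=40.0, per_hop_ms=50.0)   # four remote calls
unified = build_pipeline(compute_ms=40.0, per_hop_ms=0.0)    # one process

saved = round_trip_ms(chained) - round_trip_ms(unified)
print(f"network overhead removed: {saved:.0f} ms")
```

The point of the model is only that the per-hop cost is paid once per boundary, so collapsing four boundaries to zero removes that term entirely, whatever the real per-hop figure is.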

We also moved to regional deployment. The original AIVA ran from a single region in Mumbai. The rebuilt version runs in Mumbai, Frankfurt, and Virginia — with automatic routing to the closest region based on the caller's network path. European and North American deployments of our customers' voice agents now see latencies under 180ms. Mumbai customers see under 160ms.
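One simple way to picture closest-region routing is to probe each region and pick the lowest round-trip time. The Python sketch below assumes exactly that. The post does not say how the caller's network path is measured, so `RegionProbe`, `pick_region`, the fallback to Mumbai, and the sample RTT values are all hypothetical. Only the three region names come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Region names from the post; identifiers are illustrative.
REGIONS = ("mumbai", "frankfurt", "virginia")


@dataclass(frozen=True)
class RegionProbe:
    region: str
    rtt_ms: Optional[float]  # None means the probe failed or timed out


def pick_region(probes: Iterable[RegionProbe], default: str = "mumbai") -> str:
    """Return the reachable region with the lowest measured round-trip time.

    Falls back to `default` when no known region answered its probe.
    """
    reachable = [
        p for p in probes
        if p.rtt_ms is not None and p.region in REGIONS
    ]
    if not reachable:
        return default
    return min(reachable, key=lambda p: p.rtt_ms).region


# Example with made-up probe results for a caller somewhere in Europe.
probes = [
    RegionProbe("mumbai", 140.0),
    RegionProbe("frankfurt", 22.0),
    RegionProbe("virginia", 95.0),
]
print(pick_region(probes))
```

A production router would likely also weigh region health and capacity. This sketch covers only the "closest by network path" part.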

The language pipeline is the part I'm most proud of. In AIVA 1.0, adding a language was a project. In AIVA 2.0, it's a config file and a model checkpoint. We've shipped six new languages since the rebuild launched — each in under two weeks — compared to six weeks minimum in the old system.
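To show what "a config file and a model checkpoint" could look like in practice, here is a minimal Python sketch of a language registry loaded from JSON. The schema, the `LanguageConfig` fields, the `load_languages` function, the two example languages, and the checkpoint paths are all assumptions for illustration; the post does not describe the actual config format.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LanguageConfig:
    code: str        # short language code, e.g. "hi"
    name: str        # human-readable name
    checkpoint: str  # path to the model checkpoint for this language


def load_languages(config_text: str) -> Dict[str, LanguageConfig]:
    """Parse a JSON list of language entries into a registry keyed by code."""
    registry: Dict[str, LanguageConfig] = {}
    for entry in json.loads(config_text):
        cfg = LanguageConfig(
            code=entry["code"],
            name=entry["name"],
            checkpoint=entry["checkpoint"],
        )
        if cfg.code in registry:
            raise ValueError(f"duplicate language code: {cfg.code}")
        registry[cfg.code] = cfg
    return registry


# Hypothetical config: adding a language means adding one entry here.
EXAMPLE_CONFIG = """
[
  {"code": "gu", "name": "Gujarati", "checkpoint": "checkpoints/gu.bin"},
  {"code": "hi", "name": "Hindi", "checkpoint": "checkpoints/hi.bin"}
]
"""

languages = load_languages(EXAMPLE_CONFIG)
print(sorted(languages))
```

With this shape, supporting another language touches one config entry and one checkpoint file instead of several scattered configuration files.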

What it cost us

Four months of no new features. Two customers who couldn't wait and churned. A team that was exhausted by the end. And a difficult conversation with early investors who wanted to see the metrics moving.

We don't regret any of it. Every meaningful thing we've shipped in the last six months, including analytics and the new language pipeline, was possible because of the 2.0 infrastructure. The rebuild acted as a multiplier on every feature that came after it.

The lesson we'd pass on: if you're hitting performance ceilings in month 12, the ceiling is probably architectural, not algorithmic. You can't optimize your way out of a bad architecture. Sometimes the right call is to stop and rebuild.

The voice latency numbers today: 158ms average, 210ms p95. Last month our first customer passed 1 million calls, with zero outages. That's what the rebuild bought us.
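For readers less familiar with the two figures quoted, the Python sketch below shows one common way to compute an average and a p95 from per-call latency samples. It uses the nearest-rank percentile method, which is an assumption: the post does not say how AIVA calculates its p95. The function names and the sample values are made up for illustration.

```python
import math
from statistics import mean
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Return the two headline numbers: mean and 95th percentile."""
    return {
        "avg_ms": mean(latencies_ms),
        "p95_ms": percentile(latencies_ms, 95),
    }


# Made-up samples: one slow call barely moves the average but sets the p95.
print(summarize([100, 150, 160, 170, 400]))
```

The p95 matters alongside the average because a small share of slow calls can still feel broken to the callers who hit them.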
