COMPOZELABS
AI VOICE SALES COACHING
THE STORY
I was brought on to solo-build a voice-interactive sales coaching tool in which users practice real-time sales pitches against a voice AI modeled on real people. The POV was estimated to take 4-6 weeks; I delivered it in 1 week. The challenge: voice agents have to be lightweight because voice is data-intensive, so the underlying models are less capable and harder to get to follow instructions. I solved this with a novel 'observer pattern' architecture in which reasoning models inject real-time guidance into the lightweight voice AI.
DEVELOPMENT INSIGHTS
"I was brought on because they specifically wanted me to tackle this task. I essentially worked one-on-one with the client—emailing back and forth in the evening like 'hey, check out an update.' This client was pre-seed, building their MVP, so it could change fast. I just made a whole bunch of pushes to production and got immediate feedback. It was a wildly fast development cycle. The hardest part—building that voice AI—was done in like a week and a half. Most of it was just tweaking based on how it understood sales."
TECHNICAL CHALLENGES
The Observer Pattern
Voice agents have to be lightweight because voice is data-intensive, so the models under the hood aren't as smart. We got around this by having observer agents watch the transcript in real time. These observer agents used the latest reasoning models, which are far more capable and can take very large instructions. They would read the transcript and inject guidance in the form of additional user input wrapped in tags indicating 'Hey, this is not actually the user speaking, this is kind of like the system instructions.' The guidance would say things like 'Oh, you need to object more to this part' or 'You need to wrap up the call now.' With this dual system, we got a more intelligent voice agent that followed instructions better.
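A minimal sketch of the injection step described above, not the production code. The tag name `observer_guidance`, the `Turn` type, and both observer functions are hypothetical; in the real system each observer would call a reasoning model rather than run a keyword check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical tag name; the write-up only says guidance was "wrapped in tags".
GUIDANCE_TAG = "observer_guidance"


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


def wrap_guidance(note: str) -> str:
    """Mark a message as observer guidance rather than real user speech."""
    return f"<{GUIDANCE_TAG}>{note}</{GUIDANCE_TAG}>"


# An observer reads the transcript and returns a note, or None if it has nothing to add.
Observer = Callable[[List[Turn]], Optional[str]]


def objection_observer(transcript: List[Turn]) -> Optional[str]:
    """Stand-in for a reasoning-model call: react to the latest user turn."""
    last_user = next((t for t in reversed(transcript) if t.role == "user"), None)
    if last_user and "discount" in last_user.text.lower():
        return "You need to object more to this part."
    return None


def wrap_up_observer(transcript: List[Turn], max_turns: int = 6) -> Optional[str]:
    """Stand-in for a timing observer: ask for a wrap-up after max_turns turns."""
    if len(transcript) >= max_turns:
        return "You need to wrap up the call now."
    return None


def run_observers(transcript: List[Turn], observers: List[Observer]) -> List[Turn]:
    """Collect guidance and return it as extra 'user' turns for the voice agent."""
    injected = []
    for observer in observers:
        note = observer(transcript)
        if note:
            injected.append(Turn(role="user", text=wrap_guidance(note)))
    return injected
```

The voice agent's system prompt would also need a line explaining that anything inside the guidance tag comes from the system and not from the person on the call.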
Getting Agent to Hang Up
Voice mode by itself will never stop talking to the user, so we had to give it a custom tool for hanging up: essentially an 'end yourself' tool that let the agent end its own process, hang up the phone, and close the connection.
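A hang-up tool of this kind could look roughly like the sketch below. The tool name `end_call`, the `VoiceSession` class, and the `reason` parameter are all assumptions for illustration; the definition uses the general JSON-Schema style that many LLM tool-calling APIs accept and is not tied to any specific vendor.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical tool definition the voice model would be given.
END_CALL_TOOL: Dict[str, Any] = {
    "name": "end_call",
    "description": "Hang up once the conversation has reached a natural end.",
    "parameters": {
        "type": "object",
        "properties": {"reason": {"type": "string"}},
        "required": ["reason"],
    },
}


@dataclass
class VoiceSession:
    """Stand-in for a live voice connection."""
    is_open: bool = True
    log: List[str] = field(default_factory=list)

    def close(self, reason: str) -> None:
        # In a real system this would also tear down the audio stream.
        self.is_open = False
        self.log.append(f"closed: {reason}")


def handle_tool_call(session: VoiceSession, name: str, arguments: Dict[str, Any]) -> str:
    """Dispatch a tool call emitted by the voice model."""
    if name == END_CALL_TOOL["name"]:
        session.close(arguments.get("reason", "unspecified"))
        return "call ended"
    return f"unknown tool: {name}"
```

The point of the design is that the model itself decides when to stop, and the host application only has to honor the tool call by closing the connection.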
LinkedIn Profile Integration
The tool could search a user's LinkedIn profile. You just plugged in their LinkedIn and it scraped the page for everything there was to know. From that it created structured output: a profile representation that included personality, job, and personal details. If information we wanted wasn't available, it would make up additional details to fill the gaps. That profile was then used as part of the system prompt for the voice AI.
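The profile-to-prompt step might be sketched as follows. The `Persona` fields mirror the ones named above (personality, job, personal details), but the exact schema, the function names, and the prompt wording are assumptions. The `generated` dictionary stands in for whatever a model would invent when the scrape came back empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

FIELDS = ("name", "job", "personality", "personal_details")


@dataclass
class Persona:
    name: str
    job: str
    personality: str
    personal_details: str
    invented: List[str] = field(default_factory=list)  # fields not found on the page


def build_persona(scraped: Dict[str, Optional[str]], generated: Dict[str, str]) -> Persona:
    """Prefer scraped values; fall back to generated ones and record which were invented."""
    values: Dict[str, str] = {}
    invented: List[str] = []
    for key in FIELDS:
        value = scraped.get(key)
        if not value:
            value = generated[key]
            invented.append(key)
        values[key] = value
    return Persona(invented=invented, **values)


def build_system_prompt(persona: Persona) -> str:
    """Turn the structured persona into a system prompt for the voice AI."""
    return (
        f"You are {persona.name}, {persona.job}. "
        f"Personality: {persona.personality}. "
        f"Personal details: {persona.personal_details}. "
        "Stay in character as a prospect on a sales call."
    )
```

Tracking the invented fields separately is a choice made for this sketch; it keeps made-up details distinguishable from scraped ones when reviewing a persona.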
KEY FEATURES
- ✦ Observer Pattern Architecture: Reasoning models inject real-time guidance into lightweight voice AI
- ✦ Multi-Observer System: Different observers handle different tasks (objections, timing, feedback)
- ✦ LinkedIn-Powered Personas: Scrapes LinkedIn to create realistic AI personas for practice
- ✦ Custom Hang-Up Tool: Voice AI can end conversation naturally when objectives are met
- ✦ Real-Time Transcript Analysis: Observers analyze conversation as it happens with minimal latency
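One way several observers could run side by side without stalling the call is sketched below. Running them concurrently under a time budget is an assumption, not something the project write-up confirms. The observer names, the 0.2-second budget, and the keyword checks are all hypothetical stand-ins for reasoning-model calls.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

AsyncObserver = Callable[[str], Awaitable[Optional[str]]]


async def objections(transcript: str) -> Optional[str]:
    await asyncio.sleep(0.01)  # stands in for a reasoning-model call
    return "Push back on pricing." if "price" in transcript.lower() else None


async def timing(transcript: str) -> Optional[str]:
    await asyncio.sleep(0.01)
    return "Wrap up the call." if transcript.count("\n") >= 3 else None


async def slow_feedback(transcript: str) -> Optional[str]:
    await asyncio.sleep(1.0)  # deliberately exceeds the budget
    return "Too late to matter."


async def _run_one(observer: AsyncObserver, transcript: str, budget_s: float) -> Optional[str]:
    try:
        return await asyncio.wait_for(observer(transcript), timeout=budget_s)
    except asyncio.TimeoutError:
        return None  # drop late guidance rather than hold up the conversation


async def gather_guidance(
    transcript: str, observers: List[AsyncObserver], budget_s: float = 0.2
) -> List[str]:
    """Run all observers concurrently and keep only the notes that arrive in time."""
    results = await asyncio.gather(*(_run_one(o, transcript, budget_s) for o in observers))
    return [r for r in results if r is not None]
```

Calling `asyncio.run(gather_guidance(transcript, [objections, timing, slow_feedback]))` returns the notes from the observers that finished within the budget, in the order the observers were listed.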
IMPACT & RESULTS
- Delivered in 1 week vs a 4-6 week estimate (75-83% less time)
- 25% improvement in sales representative pass rates
- Colleagues coined the term 'Silas velocity' to describe the development speed
- Novel architecture applicable to other voice AI quality problems
- Client feedback: wildly fast development cycle with daily iteration
RELATED WORK
Check out Mystica and Mercury Notes for related projects.
