Eight years bridging research and production ML—from early embeddings and BERT inference to T5-based query rewriting and tool-calling voice agents. Built evaluation frameworks and shipped experiments across Legal, Cloud, YouTube, and Conversational AI.
Conversational Agents & Food AI
Built voice ordering agent before modern multimodal LLMs existed—scaled from one pilot restaurant to hundreds across major fast food chains. Pioneered tool-calling patterns that predated structured output APIs.
- Guided migration to native Gemini 2.0 tool use—validating patterns we'd built years earlier before structured outputs existed.
- Created Gemini-based pipeline for cleaning audio training data to improve ASR quality.
- Built "disfluency injection" system to mask LLM latency—injected natural hesitations during inference and trimmed silence from agent speech.
- Developed human eval system for accent conversion models with bias reduction analysis.
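The disfluency-injection idea can be sketched as a simple latency mask: if the model's reply hasn't arrived within a short threshold, the agent speaks a filler so the line never goes dead. A minimal sketch only—the threshold, filler phrases, and function names here are invented for illustration, not the production system:

```python
import asyncio
import random

FILLERS = ["um,", "let me see...", "one sec,"]  # hypothetical filler phrases

async def speak(text: str, spoken: list) -> None:
    spoken.append(text)  # stand-in for the real TTS call

async def respond(llm_call, spoken: list, threshold: float = 0.15) -> list:
    """If the LLM takes longer than `threshold` seconds, inject a
    natural hesitation so the user hears something immediately."""
    task = asyncio.ensure_future(llm_call())
    done, _ = await asyncio.wait({task}, timeout=threshold)
    if not done:
        await speak(random.choice(FILLERS), spoken)  # mask the wait
    reply = await task
    await speak(reply.strip(), spoken)  # trim surrounding silence/whitespace
    return spoken
```

With a fast model the filler is skipped entirely; only slow turns pay the (deliberate) hesitation.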
Google Research — YouTube Voice Search
Bridged Google Research and YouTube product teams. Shipped 8 launches using T5 transformers for voice query rewriting—+0.75% Voice Engaged Watchers (30% of YouTube's annual goal).
- Fine-tuned LLM for phonetic query corrections using contextual signals and personalization.
- Built in-memory ranking model for candidate selection with <3ms inference latency.
- Shipped first personalized ASR rewriter on YouTube—85% click-through rate on corrections; scaled to 450M queries/day.
- Coordinated planning for 5 engineers across 3 reporting chains.
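A toy version of the candidate-selection step: score rewrite candidates by blending string similarity against the raw query with a popularity signal. The features, weights, and use of `difflib` are illustrative assumptions—the production ranker was a learned model over richer contextual signals:

```python
import difflib

def score(query: str, candidate: str, popularity: float,
          w_sim: float = 0.8, w_pop: float = 0.2) -> float:
    """Blend surface similarity with candidate popularity (both in [0, 1])."""
    sim = difflib.SequenceMatcher(None, query, candidate).ratio()
    return w_sim * sim + w_pop * popularity

def best_rewrite(query: str, candidates: list) -> str:
    """candidates: list of (text, popularity) pairs; return the top rewrite."""
    return max(candidates, key=lambda c: score(query, c[0], c[1]))[0]
```

Popularity acts as a tie-breaker when two candidates are phonetically close to the misrecognized query.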
Area120 & Google Cloud
Early BERT adopter—built training and serving infrastructure for conversational AI that won a major telco contract. Solved low-latency inference challenges before optimized transformer runtimes existed.
- Managed 4 engineers building automated hyperparameter tuning system.
- Served BERT at <20ms p99 latency with TPU support.
- Designed multi-headed model architecture that cut TPU serving cost 10x.
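The cost win of a multi-headed architecture comes from running one expensive shared encoder per request and attaching small per-task heads. A minimal sketch with made-up dimensions and task names (pure-Python matrices standing in for the real model):

```python
import random

random.seed(0)
D, H = 6, 4  # illustrative input and hidden sizes

def rand_matrix(rows: int, cols: int) -> list:
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(M: list, v: list) -> list:
    """Multiply vector v (len = rows of M) through matrix M."""
    return [sum(v[i] * M[i][j] for i in range(len(M))) for j in range(len(M[0]))]

# One shared encoder serves every task; only the small heads are per-task.
W_enc = rand_matrix(D, H)
heads = {task: rand_matrix(H, 2) for task in ("intent", "sentiment", "routing")}

def forward(x: list) -> dict:
    h = matvec(W_enc, x)  # expensive shared pass, computed once
    return {task: matvec(W, h) for task, W in heads.items()}  # cheap heads

out = forward([0.1] * D)
```

Serving N tasks then costs roughly one encoder pass instead of N, which is where an order-of-magnitude accelerator saving can come from.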
Google Legal — Data Scientist
Early adopter of embeddings for legal tech. Fine-tuned word2vec models on patent text before transformer-based embeddings existed.
- Built ML model for patent claim breadth prediction—replaced a vendor product and open-sourced the approach; served 20M+ documents.
- Built patent similarity search with custom-trained embeddings and BigQuery, cutting search effort 60%.
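The similarity-search core is standard: embed documents, then rank by cosine similarity to the query vector. A self-contained sketch with toy two-dimensional vectors (the real system used custom-trained embeddings and ran the ranking at BigQuery scale):

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def most_similar(query_vec: list, corpus: dict, k: int = 2) -> list:
    """corpus: {doc_id: embedding}; return top-k doc ids by similarity."""
    return sorted(corpus, key=lambda d: cosine(query_vec, corpus[d]),
                  reverse=True)[:k]
```

In practice the embeddings live in a table and the dot products run as a SQL query rather than in application code.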