Yuri Nazarov
Publishes as Jurijs Nazarovs
Applied Scientist · Amazon
San Jose, CA
I build AI systems that perceive and reason about people, activity, and scenes in video — at the intersection of computer vision and foundation models.
My doctoral research focused on deep probabilistic models and the dynamics of temporal data — neural ODEs, Bayesian neural networks, and generative models — with applications spanning brain imaging, physical simulation, and cyber-defense. I actively contribute to the research community through publications and writing.
Focus: Computer Vision · Vision-Language Models · Foundation Models · Video Understanding · Multimodality · Grounded Activity Recognition · Scene Understanding · Person Re-identification · Generative Models
Selected publications
Experience
Leading AI research across video understanding and vision-language models for Alexa. Fine-tuned video VLMs for per-person grounded activity recognition with Florence-2-style task-specific tokens, cutting inference-time token cost by ~60%; architected a multi-camera VLM pipeline for scene understanding (~20% recall lift over a single-camera baseline); trained LLMs for context-aware response generation with a product-aligned LLM-as-judge evaluation pipeline; and designed a tiered-cache gallery for scalable, low-latency person re-identification (~40% recall lift).
Led a natural-language video search project, deploying image-tagging models on-device with quantization-aware fine-tuning and a novel dynamic weighted frame-sampling method. Built a zero-shot detection and segmentation pipeline (Grounding DINO + SAM, optimized with EfficientViT) and directed an incremental-learning effort that halved data annotation costs.
Developed a novel adversarial training method to make a UNITER-style vision-language VQA system robust to linguistic variation and image manipulation.
Designed Bayesian neural networks for the cyber-defense domain that handle sparse, class-imbalanced, and limited datasets, as part of a human-in-the-loop alert system for ransomware detection. The work resulted in a first-author publication and a U.S. patent.
Worked with Cristian Lumezanu on retrieving missing classes in ordinal time series, leading to the Ordinal-Quadruplet framework.
Education
University of Wisconsin–Madison
Deep probabilistic models and the dynamics of temporal data.
University of Wisconsin–Madison
Duke University
HSE University (Higher School of Economics)
News
New paper — GHADAR, on grounded human-attributed activity recognition in video — submitted to ECCV 2026.
Joined Amazon as an Applied Scientist, working on video understanding and multimodal AI.
Joined Ambient.ai as an Applied Research Scientist in Computer Vision and foundation models.
Defended my PhD in Statistics at UW–Madison.
Started a research internship at Amazon Alexa AI.
Paper accepted to AI4CC (workshop at CVPR 2022).


