NVIDIA · Latent Space

⚡️ Google's Open AI Strategy — Omar Sanseviero, Google DeepMind

May 25, 2026 · 29:59

Omar Sanseviero, head of Developer Experience at Google DeepMind, breaks down Gemma 4's novel architecture, whose per-layer embeddings enable parameter offloading so that a model with 2B active parameters can run fast on-device. He explains the trade-offs between dense and MoE models, notes that fine-tuning is declining as base models improve, and highlights Gemma 4's native multimodal support for audio, images, and short video. The team is growing in Singapore and India, and Kaggle's recent integration will help benchmark agent capabilities.
