Dumi Erhan, co-lead of the Veo project at Google DeepMind, joins host Logan Kilpatrick for a deep dive into the evolution of generative video models. They discuss the journey from early research in 2018 to the launch of the state-of-the-art Veo 3 model with native audio generation. Learn about the technical hurdles in evaluating and scaling video models, the challenges of long-duration video coherence, and how user feedback is shaping the future of AI-powered video creation.
Chapters:
0:00 - Intro
0:47 - Veo project's beginnings
3:02 - Veo's origins in Google Brain
5:07 - Video prediction and robotics applications
7:45 - Early progress and evaluation challenges
10:30 - Physics-based evaluations and their limitations
12:18 - The launch of the original Veo model
14:06 - Scaling challenges for video models
16:02 - The leap from Veo 1 to Veo 2
19:40 - Veo 3’s viral audio moment
21:17 - User trends shaping Veo's roadmap
23:49 - Image-to-video vs. text-to-video complexity
26:00 - New prompting methods and user control
27:55 - Coherence in long video generation
31:03 - Genie 3 and world models
35:54 - The steerability challenge
41:59 - Capability transfer and image data's role
47:25 - Closing
--------
48:10
GDM’s Pushmeet Kohli on solving science's biggest challenges with AI
Pushmeet Kohli, Head of Science and Strategic Initiatives at Google DeepMind, joins host Logan Kilpatrick to explore the intersection of AI and scientific discovery. Learn how the team's unique problem-solving framework led to innovations like AlphaFold and AlphaEvolve, and how new tools like AI Co-scientist aim to democratize these types of breakthroughs for everyone.
Watch on YouTube: https://www.youtube.com/watch?v=o7mdsL6BHsk
Chapters:
0:00 - Intro
1:04 - Recent Alpha launches
2:15 - Framework for selecting research domains
6:21 - Scientific, commercial and social impact
15:00 - Wielding AGI for breakthroughs
16:48 - Tech transfer and team collaboration
19:46 - IMO Gold Medal
21:42 - Evaluating math proofs
22:55 - From specialized models to Deep Think
24:22 - Do math skills generalize?
25:53 - Generalizing the IMO model
27:43 - Democratizing AI science tools
30:09 - AI Co-scientist
35:17 - An API for science?
--------
37:28
Behind the scenes of Google's state-of-the-art "nano-banana" image model
Join host Logan Kilpatrick in discussion with some of the minds behind Google's new state-of-the-art image model, Gemini 2.5 Flash. Product and research leads from the Gemini team break down the technology behind its key capabilities, including interleaved generation for complex edits and new approaches to achieving character consistency and pixel-perfect control. With Nicole Brichtova, Kaushik Shivakumar, Mostafa Dehghani and Robert Riachi.
Chapters:
0:37 - New model introduction
1:21 - Demo - Image editing
3:44 - Text rendering capabilities
4:44 - Beyond human preference evals
6:44 - Text rendering as a proxy for quality
8:38 - Positive transfer between modalities
11:25 - Demo - Multi-turn, context-aware image generation
13:54 - Pixel-perfect editing and character consistency
15:51 - Interleaved image generation
17:59 - Specialized vs. native models
19:52 - Understanding nuanced prompts
20:59 - User feedback shaping model development
22:37 - Improvements in character consistency
24:17 - More natural looking images from team collaboration
26:41 - What’s next for image generation models
--------
30:32
Demis Hassabis on shipping momentum, better evals and world models
Demis Hassabis, CEO of Google DeepMind, sits down with host Logan Kilpatrick. In this episode, learn about the evolution from game-playing AI to today's thinking models, how projects like Genie 3 are building world models to help AI understand reality, and why new testing grounds like Kaggle’s Game Arena are needed to evaluate progress on the path to AGI.
Watch on YouTube: https://www.youtube.com/watch?v=njDochQ2zHs
Chapters:
00:00 - Intro
01:16 - Recent GDM momentum
02:07 - Deep Think and agent systems
04:11 - Jagged intelligence
07:02 - Genie 3 and world models
10:21 - Future applications of Genie 3
13:01 - The need for better benchmarks and Kaggle Game Arena
19:03 - Evals beyond games
21:47 - Tool use for expanding AI capabilities
24:52 - Shift from models to systems
27:38 - Roadmap for Genie 3 and the omni model
29:25 - The quadrillion token club
--------
31:09
Building real-time voice applications with Live API
Shrestha Basu Mallick, one of the product leads for the Gemini API, joins host Logan Kilpatrick for a deep dive into the Gemini Live API, Google’s real-time, multimodal interface for developers. Learn how native audio, alongside new capabilities like proactive audio and async function calling, unlocks the unique power of audio as an interface.
Watch on YouTube: https://www.youtube.com/watch?v=4xlwlU6h-wM
Chapters:
0:00 - Intro
1:18 - Live API overview
3:36 - Why audio is a special modality
5:07 - Speed vs. precision in audio
6:17 - Controllable and promptable TTS
8:31 - What developers are building with the Live API
11:14 - URL context and async calling features
15:02 - Proactive audio and affective dialog
16:55 - Addressing developer feedback
21:54 - Live API roadmap
23:49 - The role of long context
24:57 - What’s next for the Live API
26:41 - State of the AI audio market
30:10 - Advice for developers getting started with the Live API
31:16 - Live API demo
38:10 - Demo wrap up and closing
Ever wondered what it's really like to build the future of AI? Join host Logan Kilpatrick for a deep dive into the world of Google AI, straight from the minds of the builders. We're pulling back the curtain on the latest breakthroughs, sharing the unfiltered stories behind the tech, and answering the questions you've been dying to ask.
Whether you're a seasoned developer or an AI enthusiast, this podcast is your backstage pass to the cutting edge of AI technology. Tune in for:
- Exclusive interviews with AI pioneers and industry leaders.
- In-depth discussions on the latest AI trends and developments.
- Behind-the-scenes stories and anecdotes from the world of AI.
- Unfiltered insights and opinions from the people shaping the future.
So, if you're ready to go beyond the headlines and get the real scoop on AI, join Logan Kilpatrick on Google AI: Release Notes.