Chris Lattner of Modular (https://modular.com) joined us (again!) to talk about how they are breaking the CUDA monopoly, what it took to match NVIDIA performance with AMD, and how they are building a company of "elite nerds".
X: https://x.com/latentspacepod
Substack: https://latent.space
00:00:00 Introductions
00:00:12 Overview of Modular and the Shape of Compute
00:02:27 Modular’s R&D Phase
00:06:55 From CPU Optimization to GPU Support
00:11:14 MAX: Modular’s Inference Framework
00:12:52 Mojo Programming Language
00:18:25 MAX Architecture: From Mojo to Cluster-Scale Inference
00:29:16 Open Source Contributions and Community Involvement
00:32:25 Modular's Differentiation from vLLM and SGLang
00:41:37 Modular’s Business Model and Monetization Strategy
00:53:17 DeepSeek’s Impact and Low-Level GPU Programming
01:00:00 Inference Time Compute and Reasoning Models
01:02:31 Personal Reflections on Leading Modular
01:08:27 Daily Routine and Time Management as a Founder
01:13:24 Using AI Coding Tools and Staying Current with Research
01:14:47 Personal Projects and Work-Life Balance
01:17:05 Hiring, Open Source, and Community Engagement
--------
The Utility of Interpretability — Emmanuel Ameisen
Emmanuel Ameisen is lead author of “Circuit Tracing: Revealing Computational Graphs in Language Models” (https://transformer-circuits.pub/2025/attribution-graphs/methods.html ), one of a pair of MechInterp papers that Anthropic published in March (alongside https://transformer-circuits.pub/2025/attribution-graphs/biology.html ).
We recorded the initial conversation a month ago, but then held off publishing until the open source tooling for the graph generation discussed in this work was released last week: https://www.anthropic.com/research/open-source-circuit-tracing
This is a two-part episode: an intro covering the open source release, then a deeper dive into the paper, with guest host Vibhu Sapra (https://x.com/vibhuuuus ) and Mochi the MechInterp Pomsky (https://x.com/mochipomsky ). Thanks to Vibhu for making this episode happen!
While the original blogpost contained some fantastic guided visualizations (which we discuss at the end of this pod!), with the notebook and Neuronpedia visualization (https://www.neuronpedia.org/gemma-2-2b/graph ) released this week, you can now explore on your own with Neuronpedia, as we show you in the video version of this pod.
Chapters
00:00 Intro & Guest Introductions
01:00 Anthropic's Circuit Tracing Release
06:11 Exploring Circuit Tracing Tools & Demos
13:01 Model Behaviors and User Experiments
17:02 Behind the Research: Team and Community
24:19 Main Episode Start: Mech Interp Backgrounds
25:56 Getting Into Mech Interp Research
31:52 History and Foundations of Mech Interp
37:05 Core Concepts: Superposition & Features
39:54 Applications & Interventions in Models
45:59 Challenges & Open Questions in Interpretability
57:15 Understanding Model Mechanisms: Circuits & Reasoning
01:04:24 Model Planning, Reasoning, and Attribution Graphs
01:30:52 Faithfulness, Deception, and Parallel Circuits
01:40:16 Publishing Risks, Open Research, and Visualization
01:49:33 Barriers, Vision, and Call to Action
--------
Solomon Hykes most famously created Docker and now runs Dagger… which has something special to share with you on Thursday.
Catch Dagger at:
- Tuesday: Dagger’s workshop: https://www.ai.engineer/schedule#ship-agents-that-ship-a-hands-on-workshop-for-swe-agent-builders
- Wednesday: Dagger’s talk: https://www.ai.engineer/schedule#how-to-trust-an-agent-with-software-delivery
- Thursday: Solomon’s keynote: https://www.ai.engineer/schedule#containing-agent-chaos
Chapters
00:00 Introduction & Guest Background
00:29 What is Dagger? Post-Development Automation
01:08 Dagger’s Community & Platform Engineers
02:32 AI Agents and Developer Workflows
03:40 Environment Isolation & The Power of Containers
06:28 The Need for Standards in Agent Environments
07:25 Design Constraints & Challenges for Dev Environments
11:26 Limitations of Current Tools & Agent-Native UX
14:11 Modularity, Customization, and the Lego Analogy
16:24 Convergence of CI/CD and Agentic Systems
17:41 Ephemeral Apps, Resource Constraints, and Local Execution
21:01 Adoption, Ecosystem, and the Role of Open Source
23:30 Dagger’s Modular Approach & Integration Philosophy
25:38 Looking Ahead: Workshops, Keynotes, and the Future of Agentic Infrastructure
--------
[AIEWF Preview] CloudChef: Your Robot Chef - Michelin-Star food at $12/hr (w/ Kitchen tour!)
One of the new tracks at next week’s AI Engineer conference in SF is a new focus on LLMs + Robotics, ft. household names like Waymo and Physical Intelligence. However, there are many other companies applying LLMs and VLMs in the real world!
CloudChef, the first industrial-scale kitchen robotics company with one-shot demonstration learning and an incredibly simple business model, will be serving tasty treats all day with Zippy (https://www.cloudchef.co/zippy ), their AI Chef platform.
This is a lightning pod with CEO Nikhil Abraham to preview what Zippy is capable of!
https://www.cloudchef.co/platform
See a real chef comparison: https://www.youtube.com/watch?v=INDhZ7LwSeo&t=64s
See it in the AI Engineer Expo at SF next week: https://ai.engineer
Chapters
00:00 Welcome and Introductions
00:58 What is CloudChef?
01:36 How the Robots Work: Culinary Intelligence
05:57 Commercial Applications and Early Success
07:02 The Software-First Approach
10:09 Business Model and Pricing
13:10 Demonstration Learning: Training the Robots
16:03 Call to Action and Engineering Opportunities
18:45 Final Thoughts and Technical Details
--------
The AI Coding Factory
We are joined by Eno Reyes and Matan Grinberg, the co-founders of Factory.ai. They are building droids for autonomous software engineering, handling everything from code generation to incident response for production outages. After raising a $15M Series A from Sequoia, they just released their product in GA!
https://factory.ai/
https://x.com/latentspacepod
Chapters
00:00:00 Introductions
00:00:35 Meeting at LangChain Hackathon
00:04:02 Building Factory Despite Early Model Limitations
00:06:56 What is Factory AI?
00:08:55 Delegation vs Collaboration in AI Development Tools
00:10:06 Naming Origins of 'Factory' and 'Droids'
00:12:17 Defining Droids: Agent vs Workflow
00:14:34 Live Demo
00:17:37 Enterprise Context and Tool Integration in Droids
00:20:26 Prompting, Clarification, and Agent Communication
00:22:28 Project Understanding and Proactive Context Gathering
00:24:10 Why SWE-Bench Is Dead
00:28:47 Model Fine-tuning and Generalization Challenges
00:31:07 Why Factory is Browser-Based, Not IDE-Based
00:33:51 Test-Driven Development and Agent Verification
00:36:17 Retrieval vs Large Context Windows for Cost Efficiency
00:38:02 Enterprise Metrics: Code Churn and ROI
00:40:48 Executing Large Refactors and Migrations with Droids
00:45:25 Model Speed, Parallelism, and Delegation Bottlenecks
00:50:11 Observability Challenges and Semantic Telemetry
00:53:44 Hiring
00:55:19 Factory's Design and Branding Approach
00:58:34 Closing Thoughts and Future of AI-Native Development
The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0.
We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al.
Full show notes always on https://latent.space