Episodes

  • Optical Networking Supercycle - ALL the Tech You NEED to know
    Feb 20 2026

    Austin and Vik delve into the evolving landscape of optics and networking, particularly in relation to AI and data centers.

    The conversation covers various scales of networking, including scale across, scale out, and scale up, while also addressing the demand-supply dynamics in laser manufacturing and the future of optical circuit switches.

    The episode highlights the technological advancements and market opportunities in the optics sector, emphasizing the significance of these developments for the future of AI.

    Takeaways

    • Silicon photonics is becoming crucial for data center connectivity.
    • Optics is essential for overcoming copper's limitations in speed and distance.
    • Scale across technology is vital for connecting data centers.
    • Scale out optics is the standard for connecting GPUs between racks.
    • Co-packaged optics can reduce energy consumption in data centers.
    • The scale up market for optics is emerging as a new opportunity.
    • Indium phosphide wafers are a critical bottleneck in laser manufacturing.
    • Optical circuit switches are gaining traction in data centers.
    • 2026 is anticipated to be a pivotal year for optical networking.


    Chapters

    00:00 Introduction to AI and CPU Bottlenecks
    03:00 The Rise of Silicon Photonics
    06:01 Understanding Optical Networking and Data Centers
    08:49 Scale Across: Connecting Data Centers
    11:56 Scale Out: Optimizing Data Center Connectivity
    14:53 Scale Up: The Future of GPU Connectivity
    23:32 The Shift from Copper to Optical Connections
    26:13 Challenges and Reliability of Lasers
    30:47 Understanding Co-Packaged Optics
    34:17 Market Dynamics: Demand and Supply of Lasers
    40:46 Emerging Technologies: Optical Circuit Switches

    Check out Austin's Substack: https://www.chipstrat.com
    Check out Vik's Substack: https://www.viksnewsletter.com

    46 Min.
  • Memory Mayhem & AI Capex Madness
    Feb 13 2026

    In this episode of the Semi Doped podcast, Austin and Vik delve into the current state of the semiconductor industry, focusing on the memory crisis driven by increasing demand from AI applications. They discuss the implications of rising memory prices, the impact of hyperscaler spending on the market, and the strategic moves of major players like Google, Microsoft, Meta, and Amazon in the AI landscape.

    Takeaways

    • Memory prices are skyrocketing, impacting consumer electronics.
    • The memory crisis is affecting the production of lower-end devices.
    • DRAM prices have doubled in a single quarter, creating challenges for manufacturers.
    • Nanya Tech's revenue growth indicates a booming memory market.
    • AI applications are driving unprecedented demand for memory.
    • Hyperscalers are significantly increasing their capital expenditures for AI infrastructure.
    • The integration of AI into advertising is reshaping business models for companies like Google and Meta.

    Chapters

    00:00 The State of Memory in Semiconductors
    03:08 Nvidia's GPU Dilemma and Market Dynamics
    06:13 The Impact of AI on Memory Demand
    09:08 NAND Flash and Context Memory Trends
    11:59 The Future of Memory Supply and Demand
    15:12 AI Infrastructure and CapEx Spending
    17:47 Google's Strategic Investments in AI
    20:58 The Advertising Business Model and AI Integration
    30:26 Revenue vs. Expenses: A Balancing Act
    31:08 The Future of TPUs vs. GPUs in Cloud Computing
    35:31 Microsoft vs. Google: AI Investments and Market Reactions
    38:22 AI Integration in Enterprises: Microsoft’s Unique Position
    39:57 The Power of Microsoft’s Reach in AI
    40:30 GitHub: A Hidden Gem for Microsoft’s AI Strategy
    43:52 Meta’s AI Strategy: Advertising and Revenue Growth
    51:18 Amazon’s Massive CapEx: Implications for the Future
    54:00 Looking Ahead: Predictions for 2027 and Beyond

    Check out Austin's substack: https://www.chipstrat.com/
    Check out Vik's substack: https://www.viksnewsletter.com/

    59 Min.
  • The future of financing AI infrastructure with Wayne Nelms, CTO of Ornn
    Feb 10 2026

    In this episode, Vik and Wayne Nelms explore the emerging financial exchange for GPU compute and its implications for the AI infrastructure market. They discuss the value of compute, pricing dynamics, hedging strategies, and the future of GPU and memory trading.

    Wayne shares insights on partnerships, the depreciation of GPUs, and how inference demand may reshape hardware utilization. The conversation highlights the importance of financial products in facilitating data center development and optimizing profitability in the evolving landscape of compute resources.

    Takeaways

    • Wayne Nelms is the CTO of Ornn, focusing on GPU compute as a commodity.
    • The value of compute is still being defined in the market.
    • Hedging strategies are essential for managing compute costs.
    • The pricing of GPUs varies significantly across providers.
    • Memory trading is becoming a crucial aspect of the compute market.
    • Partnerships can enhance trading platforms and market efficiency.
    • Depreciation of GPUs is not linear and varies by use case.
    • Inference demand may change how GPUs are utilized in the future.
    • Transparency in pricing benefits smaller players in the market.
    • Financial products can facilitate data center development and profitability.

    Chapters

    00:00 Introduction to GPU Compute Futures
    03:13 The Value of Compute in Today's Market
    05:59 Understanding GPU Pricing Dynamics
    08:46 Hedging and Futures in Compute
    11:52 The Role of Memory in AI Infrastructure
    15:14 Partnerships and Market Expansion
    17:46 Depreciation and Residual Value of GPUs
    20:57 Future of Data Centers and Compute Demand
    24:01 The Impact of Financialization on AI Infrastructure
    27:04 Looking Ahead: The Future of Compute Markets

    Keywords

    GPU compute, financial exchange, futures market, data centers, AI infrastructure, pricing strategies, hedging, memory trading, Ornn

    Follow Wayne Nelms (@wayne_nelmz on X)
    Check out Ornn's website: https://www.ornnai.com/
    Check out Vik's Substack: https://www.viksnewsletter.com/
    Check out Austin's Substack: https://www.chipstrat.com/

    41 Min.
  • A New Era of Context Memory with Val Bercovici from WEKA
    Feb 6 2026

    Vik and Val Bercovici discuss the evolution of storage solutions in the context of AI, focusing on Weka's innovative approaches to context memory and high bandwidth flash, and on the importance of optimizing GPU usage.

    Val shares insights from his extensive experience in the storage industry, highlighting the challenges and advancements in memory requirements for AI models, the significance of latency, and the future of storage technologies.

    Takeaways

    • Context memory is crucial for AI performance.
    • The demand for memory has drastically increased.
    • Latency issues can hinder AI efficiency.
    • High bandwidth flash offers new storage capabilities.
    • Weka's Axon software enhances GPU storage utilization.
    • Token warehouses can significantly reduce costs.
    • Augmented memory grids improve memory access speeds.
    • Networking innovations are essential for AI storage solutions.
    • Understanding memory hierarchies is vital for optimization (rough tier-by-tier numbers are sketched after this list).
    • The future of storage will involve more advanced technologies.
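
    To put rough numbers on that hierarchy, here is a small Python sketch estimating how long it takes to reload a long session's cached context from different storage tiers. The 40 GB context size and the bandwidths are generic ballpark assumptions, not figures from WEKA or from the episode:

        # Illustrative only: ballpark bandwidths, not WEKA benchmarks.
        # Shows why the tier that holds cached context (a "token warehouse")
        # determines how quickly a long session can resume.

        CONTEXT_GB = 40  # assumed size of one long session's cached state

        TIER_BANDWIDTH_GB_S = {  # sustained read bandwidth, rough estimates
            "HBM (on-package)": 3000,
            "Host DRAM": 200,
            "Networked flash (aggregate)": 50,
            "Single NVMe drive": 10,
        }

        for tier, bw in TIER_BANDWIDTH_GB_S.items():
            ms = CONTEXT_GB / bw * 1000
            print(f"{tier:<30} ~{ms:7.0f} ms to reload {CONTEXT_GB} GB")

    Under these assumptions, reloading from aggregate networked flash takes under a second, versus several seconds from a single drive, which is roughly the gap that high bandwidth flash and token warehouses are meant to close.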

    Chapters

    00:00 Introduction to Weka and AI Storage Solutions
    05:18 The Evolution of Context Memory in AI
    09:30 Understanding Memory Hierarchies and Their Impact
    16:24 Latency Challenges in Modern Storage Solutions
    21:32 The Role of Networking in AI Storage Efficiency
    29:42 Dynamic Resource Utilization in AI Networks
    30:04 Introducing the Context Memory Network
    31:13 High Bandwidth Flash: A Game Changer
    32:54 Weka's Neural Mesh and Storage Solutions
    35:01 Axon: Transforming GPU Storage into Memory
    39:00 Augmented Memory Grid Explained
    42:00 Pooling DRAM and CXL Innovations
    46:02 Token Warehouses and Inference Economics
    52:10 The Future of Storage Innovations

    Resources

    Manus AI $2B Blog: https://manus.im/blog/Context-Engineering-for-AI-Agents-Lessons-from-Building-Manus

    Also listen to this podcast on your favorite platform. https://www.semidoped.fm/

    Check out Vik's Substack: https://www.viksnewsletter.com/
    Check out Austin's Substack: https://www.chipstrat.com/

    54 Min.
  • OpenClaw Makes AI Agents and CPUs Get Real
    Feb 3 2026

    Austin and Vik discuss the emerging trend of AI agents, particularly focusing on Claude Code and OpenClaw, and the resulting hardware implications.

    Key Takeaways:

    • 2026 is expected to be a pivotal year for AI agents.
    • The rise of agentic AI is moving beyond marketing to practical applications.
    • Claude Code is being used for more than just coding; it aids in research and organization.
    • Integrating AI with tools like Google Drive enhances productivity.
    • Security concerns arise with giving AI agents access to personal data.
    • Local computing options for AI can reduce costs and increase control.
    • AI agents can automate repetitive tasks, freeing up human time for creative work.
    • The demand for CPUs is increasing due to the needs of AI agents.
    • AI can help summarize and organize information but may lack deep insights.
    • The future of AI will involve balancing automation with human oversight.

    Chapters
    (00:00) Introduction: Why 2026 may be the year of AI agents
    (01:12) What people mean by agents and the OpenClaw naming chaos
    (02:41) Agents behaving badly: crypto losses and social posting
    (03:38) Claude Code as a research tool, not a coding tool
    (05:54) Terminal-first workflows vs GUI-based agents
    (07:44) Connecting Claude Code to Gmail, Drive, and Calendar via MCP
    (09:12) Token waste, authentication friction, and workflow optimization
    (10:54) Automating newsletter ingestion and research archives
    (12:33) Giving agents login credentials and security tradeoffs
    (13:50) Filtering signal from noise with topic constraints
    (16:36) AI-driven idea generation and its limitations
    (17:34) When automation effort is not worth it
    (19:02) Are agents ready for non-technical users?
    (20:55) Why OpenClaw should not run on your personal laptop
    (21:33) Safe agent deployment: VPS vs local servers
    (23:33) The true cost of agents: infrastructure plus inference
    (24:18) What OpenClaw adds beyond Claude Code
    (26:53) Agents require managerial thinking and self-awareness
    (28:18) Local inference vs cloud APIs
    (30:46) Cost control with OpenRouter and model hierarchies (sketched below, after the chapter list)
    (32:31) Scaling agents forces model and cost optimization
    (33:00) AI aggregation vs creator analytics
    (35:58) AI as discovery, not a replacement for reading
    (38:17) When summaries are enough and when they are not
    (39:47) Why AI cannot understand what is not said
    (41:18) Agentic AI is driving unexpected CPU demand
    (41:49) Intel caught off guard by CPU shortages
    (44:53) Security, identity, and encryption shift work to CPUs
    (46:10) Closing thoughts: agents are real, early, and uneven
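
    For the cost-control segment (30:46), here is a minimal sketch of a two-tier model hierarchy using OpenRouter's OpenAI-compatible endpoint. The model IDs and API key are placeholders, not recommendations from the episode; check openrouter.ai for current names and prices:

        # Minimal sketch: send routine agent work to a cheap model and escalate
        # only hard tasks. Model IDs are illustrative placeholders.
        from openai import OpenAI

        client = OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key="YOUR_OPENROUTER_KEY",  # placeholder
        )

        CHEAP_MODEL = "anthropic/claude-3.5-haiku"    # triage, summaries, routine steps
        STRONG_MODEL = "anthropic/claude-sonnet-4.5"  # escalate when a task is hard

        def run(task: str, hard: bool = False) -> str:
            """Route the task to the cheap tier unless it is flagged as hard."""
            resp = client.chat.completions.create(
                model=STRONG_MODEL if hard else CHEAP_MODEL,
                messages=[{"role": "user", "content": task}],
            )
            return resp.choices[0].message.content

        print(run("Summarize today's newsletter inbox in five bullets."))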

    Deploy your secure OpenClaw instance with DigitalOcean:
    https://www.digitalocean.com/blog/moltbot-on-digitalocean

    Visit the podcast website: https://www.semidoped.fm
    Austin's Substack: https://www.chipstrat.com/
    Vik's Substack: https://www.viksnewsletter.com/

    48 Min.
  • An Interview with Microsoft's Saurabh Dighe About Maia 200
    Jan 28 2026

    Maia 100 was a pre-GPT accelerator.
    Maia 200 is explicitly post-GPT for large multimodal inference.

    Saurabh Dighe says if Microsoft were chasing peak performance or trying to span training and inference, Maia would look very different. Higher TDPs. Different tradeoffs. Those paths were pruned early to optimize for one thing: inference price-performance. That focus drives the claim of ~30% better performance per dollar versus the latest hardware in Microsoft’s fleet.

    Interesting topics include:
    • What “30% better price-performance” actually means
    • Who Maia 200 is built for
    • Why Microsoft bet on inference when designing Maia back in 2022/2023
    • Large SRAM + high-capacity HBM
    • Massive scale-up, no scale-out
    • On-die NIC integration

    Maia is a portfolio platform: many internal customers, varied inference profiles, one goal. Lower inference cost at planetary scale.

    Chapters:
    (00:00) Introduction
    (01:00) What Maia 200 is and who it’s for
    (02:45) Why custom silicon isn’t just a margin play
    (04:45) Inference as an efficient frontier
    (06:15) Portfolio thinking and heterogeneous infrastructure
    (09:00) Designing for LLMs and reasoning models
    (10:45) Why Maia avoids training workloads
    (12:00) Betting on inference in 2022–2023, before reasoning models
    (14:40) Hyperscaler advantage in custom silicon
    (16:00) Capacity allocation and internal customers
    (17:45) How third-party customers access Maia
    (18:30) Software, compilers, and time-to-value
    (22:30) Measuring success and the Maia 300 roadmap
    (28:30) What “30% better price-performance” actually means
    (32:00) Scale-up vs scale-out architecture
    (35:00) Ethernet and custom transport choices
    (37:30) On-die NIC integration
    (40:30) Memory hierarchy: SRAM, HBM, and locality
    (49:00) Long context and KV cache strategy
    (51:30) Wrap-up

    53 Min.
  • Can Pre-GPT AI Accelerators Handle Long Context Workloads?
    Jan 26 2026

    OpenAI's partnership with Cerebras and Nvidia's announcement of context memory storage raise a fundamental question: as agentic AI demands long sessions with massive context windows, can SRAM-based accelerators designed before the LLM era keep up, or will they converge with GPUs?

    Key Takeaways
    1. Context is the new bottleneck. As agentic workloads demand long sessions with massive codebases, storing and retrieving the KV cache efficiently becomes critical (a rough sizing sketch follows this list).
    2. There's no one-size-fits-all. Sachin Khatti (OpenAI, ex-Intel) signals a shift toward heterogeneous compute: matching specific accelerators to specific workloads.
    3. Cerebras has 44GB of SRAM per wafer — orders of magnitude more than typical chips — but the question remains: where does the KV cache go for long context?
    4. Pre-GPT accelerators may converge toward GPUs. If they need to add HBM or external memory for long context, some of their differentiation erodes.
    5. Post-GPT accelerators (Etched, MatX) are the ones to watch. Designed specifically for transformer inference, they may solve the KV cache problem from first principles.
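
    To make the sizing concrete, here is a back-of-the-envelope Python sketch. The model dimensions are assumed Llama-70B-class values (80 layers, grouped-query attention with 8 KV heads, 128-dim heads, 16-bit elements), not numbers from the episode:

        # Back-of-the-envelope KV-cache sizing; dimensions are assumed
        # Llama-70B-class values, used only for illustration.

        def kv_cache_bytes(seq_len: int,
                           n_layers: int = 80,      # transformer depth
                           n_kv_heads: int = 8,     # GQA: fewer KV heads than query heads
                           head_dim: int = 128,
                           bytes_per_elem: int = 2  # fp16 / bf16
                           ) -> int:
            """Bytes needed to hold keys and values for one sequence."""
            per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # K and V
            return per_token * seq_len

        for ctx in (8_000, 32_000, 128_000):
            print(f"{ctx:>7,} tokens -> ~{kv_cache_bytes(ctx) / 1e9:.1f} GB of KV cache")

    At roughly 0.3 MB per token under these assumptions, a single 128K-token session needs on the order of 40 GB of KV cache, comparable to the 44GB of SRAM on an entire Cerebras wafer before any weights are stored. That is the heart of the question of where the KV cache goes for long context.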

    Chapters
    - 00:00 — Intro
    - 01:20 — What is context memory storage?
    - 03:30 — When Claude runs out of context
    - 06:00 — Tokens, attention, and the KV cache explained
    - 09:07 — The AI memory hierarchy: HBM → DRAM → SSD → network storage
    - 12:53 — Nvidia's G1/G2/G3 tiers and the missing G0 (SRAM)
    - 14:35 — Bluefield DPUs and GPU Direct Storage
    - 15:53 — Token economics: cache hits vs misses
    - 20:03 — OpenAI + Cerebras: 750 megawatts for faster Codex
    - 21:29 — Why Cerebras built a wafer-scale engine
    - 25:07 — 44GB SRAM and running Llama 70B on four wafers
    - 25:55 — Sachin Khatti on heterogeneous compute strategy
    - 31:43 — The big question: where does Cerebras store KV cache?
    - 34:11 — If SRAM offloads to HBM, does it lose its edge?
    - 35:40 — Pre-GPT vs Post-GPT accelerators
    - 36:51 — Etched raises $500M at $5B valuation
    - 38:48 — Wrap up

    38 Min.
  • An Interview with Innoviz CEO Omer Keilaf about current LiDAR market dynamics
    Jan 22 2026

    Innoviz CEO Omer Keilaf believes the LIDAR market is down to its final players—and that Innoviz has already won its seat.

    In this conversation, we cover the Level 4 gold rush sparked by Waymo, why stalled Level 3 programs are suddenly accelerating, the technical moat that separates L4-grade LIDAR from everything else, how a one-year-old startup won BMW, and why Keilaf thinks his competitors are already out of the race.

    Omer Keilaf founded Innoviz in 2016. Today it's a publicly traded Tier 1 supplier to BMW, Volkswagen, Daimler Truck, and other global OEMs.

    Chapters
    00:00 Introduction
    00:17 Why Start a LIDAR Company in 2016?
    01:32 The Personal Story Behind Innoviz
    03:12 Transportation Is Still Our Biggest Daily Risk
    04:28 The 2012 Spark: Xbox Kinect and 3D Sensing
    06:32 From Mobile to Automotive: Finding the Right Platform
    07:54 "I Didn't Know What LIDAR Was, But I'd Do It Better"
    08:19 How a One-Year-Old Startup Won BMW
    10:04 Surviving the First Product
    11:23 From Tier 2 to Tier 1: The Volkswagen Win
    13:47 Lessons Learned Scaling Through Partners
    14:45 The SPAC Decision: A Wake-Up Call from a Competitor
    16:42 From 200 LIDAR Companies to a Handful
    17:27 NREs: How Tier 1 Status Funds R&D
    18:44 Why Automotive-First Is the Right Strategy
    19:45 Consolidation Patterns: Cameras, Radars, Airbags
    20:31 "The Music Has Stopped"
    21:07 Non-Automotive: Underserved Markets
    23:51 Working with Secretive OEMs
    25:27 The Press Release They Tried to Stop
    26:42 CES 2025: 85% of Meetings Were Level 4
    27:40 Why Level 3 Programs Are Suddenly Accelerating
    28:33 The EV/ADAS Coupling Problem
    29:49 Design Is Everything: The Holy Grail Is Behind the Windshield
    31:13 The Three-Year RFQ: Grill → Roof → Windshield
    32:32 Innoviz3: Small Enough for Behind-the-Windshield
    34:40 Innoviz2 for L4, Innoviz3 for Consumer L3
    36:38 What's the Real Difference Between L2, L3, and L4 LIDAR?
    38:51 The Mud Test: Why L4 Demands 100% Availability
    40:50 "We're the Only LIDAR Designed for Level 4"
    42:52 Patents and the Maslow Pyramid of Autonomy
    44:15 Non-Automotive Markets: Agriculture, Mining, Security
    46:15 Closing

    47 Min.