Episodes

  • Infinite, shareable volume storage with Hunter Leath, Archil CEO
    Jan 15 2026

    Hunter Leath, CEO of Archil, explains how they’re building a “universal storage engine” that sits between your apps and S3—making an S3 bucket behave like a fast, POSIX-compatible disk for containers, servers, and even Lambda. Along the way, we dig into how their SSD-backed clusters and custom protocol avoid the usual small-file pain and where this approach shines (and where it doesn’t).

    Follow Hunter:
    Twitter/X: https://twitter.com/jhleath
    Archil Twitter/X: https://twitter.com/archildata
    Archil: https://archil.com/

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 - Intro: Archil Data and “S3 as a disk”
    01:05 - Hunter’s background and the core pitch
    02:32 - The real problem: state management (S3 vs block storage)
    05:02 - SQLite on S3: what the stack looks like
    07:13 - The missing layer: durable SSD-backed clusters
    10:14 - Who uses this: unstructured data, CI/CD, Git, agents
    12:15 - Small files + Git performance and avoiding S3 request explosion
    16:22 - Why they built a new protocol (NFS vs Lustre)
    20:00 - What gets written to S3: real files in your bucket
    22:29 - S3 limits, throttling, and the “keep it on SSD” escape hatch
    25:32 - Multi-cloud + R2, and why regions/latency matter
    32:10 - Pricing model: “pay only when data is active”
    34:41 - Tradeoffs: random reads and ultra-low-latency metal
    37:19 - Storage/compute separation and AI/agent-native workflows
    43:21 - YC timeline + the marketing challenge of a “universal layer”
    47:34 - Single-tenant clusters for enterprises and why it’s hard
    50:27 - Where the company is now, hiring, and how to try it (disk.new)

    55 min
  • Building search for AI systems with Chroma CTO Hammad Bashir
    Dec 18 2025

    Hammad Bashir, CTO of Chroma, joins the show to break down how modern vector search systems are actually built, from local, embedded databases to massively distributed, object-storage-backed architectures. We dig into Chroma’s shared local-to-cloud API, log-structured storage on object stores, hybrid search, and why retrieval-augmented generation (RAG) isn’t going anywhere.

    Follow Hammad:
    Twitter/X: https://twitter.com/HammadTime
    LinkedIn: https://www.linkedin.com/in/hbashir
    Chroma: https://trychroma.com

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 – Introduction: From high-school ASICs to CTO of Chroma
    01:04 – Hammad’s background and why vector search stuck
    03:01 – Why Chroma has one API for local and distributed systems
    05:37 – Local experimentation vs production AI workflows
    08:03 – What “unprincipled data” means in machine learning
    10:31 – From computer vision to retrieval for LLMs
    13:00 – Exploratory data analysis and why looking at data still matters
    16:38 – Promoting data from local to Chroma Cloud
    19:26 – Why Chroma is built on object storage
    20:27 – Write-ahead logs, batching, and durability
    26:56 – Compaction, inverted indexes, and storage layout
    29:26 – Strong consistency and reading from the log
    34:12 – How queries are routed and executed
    37:00 – Hybrid search: vectors, full-text, and metadata
    41:03 – Chunking, embeddings, and retrieval boundaries
    43:22 – Agentic search and letting models drive retrieval
    45:01 – Is RAG dead? A grounded explanation
    48:24 – Why context windows don’t replace search
    56:20 – Context rot and why retrieval reduces confusion
    01:00:19 – Faster models and the future of search stacks
    01:02:25 – Who Chroma is for and when it’s a great fit
    01:04:25 – Hiring, team culture, and where to follow Chroma

    1 hr 7 min
  • Scaling DuckDB in the cloud with MotherDuck CEO Jordan Tigani
    Dec 11 2025

    In this episode of Database School, Aaron Francis sits down with Jordan Tigani, co-founder and CEO of MotherDuck, to break down what DuckDB is, how MotherDuck hosts it in the cloud, and why analytics workloads are shifting toward embedded databases. They dig into DuckLake, pricing models, scaling strategies, and what it really takes to build a modern cloud data warehouse.

    Follow Jordan:
    Twitter/X: https://twitter.com/jrdntgn
    LinkedIn: https://www.linkedin.com/in/jordantigani
    MotherDuck: https://motherduck.com

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 - Introduction
    01:44 - What DuckDB is and why embedded analytics matter
    04:03 - How MotherDuck hosts DuckDB in the cloud
    05:18 - Is MotherDuck like the “Turso for DuckDB”?
    07:38 - Isolated analytics per user and scaling to zero
    08:51 - The academic origins of DuckDB
    10:00 - From SingleStore to founding MotherDuck
    12:28 - Getting fired… and funded 12 days later
    16:39 - Jordan’s background: Kernel dev, BigQuery, and Product
    18:36 - Partnering with DuckDB Labs and avoiding a fork
    20:52 - Why MotherDuck targets startups and the long tail
    24:22 - Pricing lessons: why $25 was too cheap
    28:11 - Ducklings, instance sizing, and compute scaling
    34:16 - How MotherDuck separates compute and storage
    37:09 - Inside the AWS architecture and differential storage
    43:12 - Hybrid execution: joining local and cloud data
    45:14 - Analytics vs warehouses vs operational databases
    47:41 - Data lakes, Iceberg, and what DuckLake actually is
    53:22 - When DuckLake makes more sense than DuckDB alone
    56:09 - Who switches to MotherDuck and why
    58:02 - pg_duckdb and offloading analytics from Postgres
    1:00:49 - Who should use MotherDuck and why
    1:03:39 - Hiring plans and where to follow Jordan
    1:05:01 - Wrap-up

    1 hr 5 min
  • Just use Postgres with Denis Magda
    Dec 4 2025

    In this episode, Aaron talks with Denis Magda, author of Just Use Postgres!, about the wide world of modern Postgres, from JSON and full-text search to generative AI, time-series storage, and even message queues. They explore when Postgres should be your go-to tool, when it shouldn’t, and why understanding its breadth helps developers build better systems.

    Use the code DBSmagda to get 45% off Denis' new book Just Use Postgres!
    Order Just Use Postgres!

    Follow Denis:
    Twitter/X: https://twitter.com/denismagda
    LinkedIn: https://www.linkedin.com/in/dmagda


    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 – Welcome
    01:28 – Denis’ Background: Java, JVM, and Databases
    03:20 – Bridging Application Development & Databases
    04:05 – Moving Down the Stack: How Denis Entered Databases
    07:28 – Apache Ignite, Distributed Systems & the Path to Postgres
    08:02 – Writing Just Use Postgres!: The Origin Story
    10:26 – Why a Modern Postgres Book Was Needed
    11:01 – The Spark That Led to the Book Proposal
    13:06 – Developers Still Don’t Know What Postgres Can Do
    15:40 – Connecting With Manning & Refining the Book Vision
    16:38 – What Just Use Postgres! Covers
    17:40 – The Book’s Core Thesis: The Breadth of Postgres
    19:50 – Favorite Use Cases & Learning While Writing
    20:30 – When to Use Postgres for Non-Relational Workloads
    23:08 – Full Text Search in Postgres Explained
    29:31 – When Not to Use Postgres (Pragmatism Over Fanaticism)
    34:01 – Using Postgres as a Message Queue
    42:09 – When Message Queues Outgrow Postgres
    48:10 – Postgres for Generative AI (pgvector)
    55:34 – Denis’ 14-Month Writing Process
    01:00:50 – Who the Book Is For
    01:04:10 – Where to Follow Denis & Closing Thoughts

    1 hr 8 min
  • Strictly typed SQL with Contra CTO, Gajus Kuizinas
    Nov 20 2025

    In this episode, Gajus Kuizinas, co-founder and CTO of Contra, joins Aaron to talk about building the engineering world you want to live in, from strict runtime-validated SQL with Slonik to creating high-ownership engineering cultures. They dive into developer experience, runtime assertions, SafeQL, and even “Loom-driven development,” a powerful review process that lets teams move fast without breaking things.

    Follow Gajus:
    Twitter/X: https://twitter.com/kuizinas
    Slonik: https://github.com/gajus/slonik
    Scaling article: https://gajus.medium.com/lessons-learned-scaling-postgresql-database-to-1-2bn-records-month-edc5449b3067

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 – Introduction
    01:03 – Meet Gajus and Contra
    01:48 – What Contra does and how it’s different
    05:34 – Why Slonik exists & early career origins
    07:47 – The early Node.js era and frustrations with ORMs
    09:50 – SQL vs abstractions and the case for raw SQL
    10:35 – Template tags and the breakthrough idea
    12:03 – Strictness, catching errors early & data shape guarantees
    13:37 – Runtime type checking, Zod, and performance debates
    16:02 – SafeQL and real-time schema linting
    17:01 – Synthesizing Slonik’s philosophy
    21:29 – Handling drift, static types vs reality
    22:52 – Defining schemas per-query & why it matters
    27:59 – Integrating runtime types with large test suites
    31:00 – Scaling the team and performance tradeoffs
    33:41 – Runtime validation cost vs developer productivity
    35:21 – Real drift examples from payments & external APIs
    38:21 – User roles, data shape differences & edge cases
    39:51 – Integration test safety & catching issues pre-deploy
    40:52 – Contra’s engineering culture
    41:47 – Why traditional PR reviews don’t scale
    43:22 – Introducing Loom-Driven Development
    45:12 – How looms transformed the review process
    52:38 – Using GetDX to measure engineering friction
    53:07 – How the team uses AI (Claude, etc.)
    56:26 – Closing thoughts on DX and engineering philosophy
    58:05 – Contra needs Postgres experts
    59:00 – Where to find Gajus

    1 hr
  • Building serverless vector search with Turbopuffer CEO, Simon Eskildsen
    Nov 13 2025

    In this episode, Aaron Francis talks with Simon Eskildsen, co-founder and CEO of TurboPuffer, about building a high-performance search engine and database that runs entirely on object storage. They dive deep on Simon's time as an engineer at Shopify, database design trade-offs, and how TurboPuffer powers modern AI workloads like Cursor and Notion.

    Follow Simon:
    Twitter: https://twitter.com/Sirupsen
    LinkedIn: https://ca.linkedin.com/in/sirupsen
    Turbopuffer: https://turbopuffer.com

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 - Introduction
    01:11 - Simon’s background and time at Shopify
    03:01 - The Rails glory days and early developer experiences
    04:55 - From PHP to Rails and joining Shopify
    06:14 - The viral blog post that led to Shopify
    09:03 - Discovering engineering talent through GitHub
    10:06 - Scaling Shopify’s infrastructure to millions of requests per second
    12:47 - Lessons from hypergrowth and burnout
    14:46 - Life after Shopify and “angel engineering”
    16:31 - The Readwise problem and discovering vector embeddings
    18:22 - The high cost of vector databases and napkin math
    19:14 - Building TurboPuffer on object storage
    21:20 - Landing Cursor as the first big customer
    23:00 - What TurboPuffer actually is
    25:26 - Why object storage now works for databases
    28:37 - How TurboPuffer stores and retrieves data
    31:06 - What’s inside those S3 files
    33:02 - Explaining vectors and embeddings
    35:55 - How TurboPuffer v1 handled search
    38:00 - Transitioning from search engine to database
    44:09 - How Turbopuffer v2 and v3 improved performance
    47:00 - Smart caching and architecture optimizations
    49:04 - Trade-offs: high write latency and cold queries
    51:03 - Cache warming and primitives
    52:25 - Comparing object storage providers (AWS, GCP, Azure)
    55:02 - Building a multi-cloud S3-compatible client
    57:11 - Who TurboPuffer serves and the scale it runs at
    59:31 - Connecting data to AI and the global vision
    1:00:15 - Company size, scale, and hiring
    1:01:36 - Roadmap and what’s next for TurboPuffer
    1:03:10 - Why you should (or shouldn’t) use TurboPuffer
    1:05:15 - Closing thoughts and where to find Simon

    1 hr 7 min
  • Building an S3 Competitor with Tigris CEO Ovais Tariq
    Nov 6 2025

    Aaron talks with Ovais Tariq, co-founder and CEO of Tigris Data and former Uber engineer who helped scale one of the world’s largest distributed systems. They discuss Uber’s hyperscale infrastructure, what it takes to build an S3-compatible object store from scratch, and how distributed storage is evolving for the AI era.

    Follow Ovais:
    Twitter: https://twitter.com/ovaistariq
    LinkedIn: https://www.linkedin.com/in/ovaistariq
    Tigris: https://www.tigrisdata.com

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 - Introduction and overview of the episode
    01:35 - Ovais’s background and introduction to Tigris
    03:00 - Building distributed databases and infrastructure at Uber
    06:00 - Uber’s in-house philosophy and massive data scale
    09:00 - Hardware, power density, and talking to chip manufacturers
    12:00 - Learning curve of scaling hardware and data centers
    14:00 - The Halloween outage and lessons from Cassandra
    16:00 - Building data centers across the world for Uber
    17:00 - Founding Tigris and the vision for global storage
    18:45 - How Tigris differs from AWS S3
    20:00 - The architecture of Tigris: caching, metadata, and replication
    32:00 - Why Tigris uses FoundationDB and its reliability
    36:00 - Managing global and regional metadata
    38:00 - How Tigris dynamically moves and caches data
    41:30 - Building their own data centers and backbone
    43:45 - Specialized storage for AI workloads
    46:00 - Small file optimization and real-world use cases
    49:00 - Snapshots, forking, and agentic AI workflows
    51:00 - How AI transformed Tigris’s customer base
    54:00 - Partnership with Fly.io and the distributed cloud ecosystem
    57:00 - Growth, customers, and focus on media and AI companies
    59:00 - What’s next for Tigris: distributed file system plans
    1:01:00 - Technical challenges and building trust in durability
    1:03:00 - Call to action: try Tigris and upcoming snapshot feature
    1:05:00 - Advice for engineers leaving big companies to start something new
    1:06:30 - Where to find Ovais online and closing remarks

    1 hr 7 min
  • Rewriting SQLite from prison with Preston Thorpe
    Oct 30 2025

    In this episode of Database School, Aaron talks with Preston Thorpe, a senior engineer at Turso who is currently incarcerated, about his incredible journey from prison to rewriting SQLite in Rust. They dive deep into concurrent writes, MVCC, and the challenges of building a new database from scratch while discussing redemption, resilience, and raw technical brilliance.

    Follow Preston and Turso:
    LinkedIn: https://www.linkedin.com/in/PThorpe92
    Preston's Blog: https://pthorpe92.dev
    GitHub: https://github.com/PThorpe92
    Turso: https://turso.tech

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 - Intro and Preston’s story
    02:13 - How Preston learned programming in prison
    06:06 - Making his parents proud and turning life around
    09:01 - Getting his first job at Unlocked Labs
    10:47 - Discovering Turso and contributing to open source
    12:53 - From contributor to senior engineer at Turso
    22:27 - What Preston works on inside Turso
    24:00 - Challenges of rewriting SQLite in Rust
    26:00 - Why concurrent writes matter
    27:57 - How Turso implements concurrent writes
    35:02 - Maintaining SQLite compatibility
    37:03 - MVCC explained simply
    43:40 - How Turso handles MVCC and logging
    46:03 - Open source contributions and performance work
    46:23 - Implementing live materialized views
    50:55 - The DBSP paper and incremental computation
    52:55 - Sync and offline capabilities in Turso
    56:45 - Change data capture and future possibilities
    1:02:01 - Implementing foreign keys and fuzz testing
    1:06:02 - Rebuilding SQLite’s virtual machine
    1:08:10 - The quirks of SQLite’s codebase
    1:10:47 - Preston’s upcoming release and what’s next
    1:14:02 - Gratitude, reflection, and closing thoughts

    1 hr 18 min