Database School

By: Try Hard Studios

About this title

Join database educator Aaron Francis as he gets schooled by database professionals.
© 2026 Try Hard Studios
  • Infinite, shareable volume storage with Hunter Leath, Archil CEO
    Jan 15 2026

    Hunter Leath, CEO of Archil, explains how they’re building a “universal storage engine” that sits between your apps and S3—making an S3 bucket behave like a fast, POSIX-compatible disk for containers, servers, and even Lambda. Along the way, we dig into how their SSD-backed clusters and custom protocol avoid the usual small-file pain and where this approach shines (and where it doesn’t).

    Follow Hunter:
    Twitter/X: https://twitter.com/jhleath
    Archil Twitter/X: https://twitter.com/archildata
    Archil: https://archil.com/

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 - Intro: Archil Data and “S3 as a disk”
    01:05 - Hunter’s background and the core pitch
    02:32 - The real problem: state management (S3 vs block storage)
    05:02 - SQLite on S3: what the stack looks like
    07:13 - The missing layer: durable SSD-backed clusters
    10:14 - Who uses this: unstructured data, CI/CD, Git, agents
    12:15 - Small files + Git performance and avoiding S3 request explosion
    16:22 - Why they built a new protocol (NFS vs Lustre)
    20:00 - What gets written to S3: real files in your bucket
    22:29 - S3 limits, throttling, and the “keep it on SSD” escape hatch
    25:32 - Multi-cloud + R2, and why regions/latency matter
    32:10 - Pricing model: “pay only when data is active”
    34:41 - Tradeoffs: random reads and ultra-low-latency metal
    37:19 - Storage/compute separation and AI/agent-native workflows
    43:21 - YC timeline + the marketing challenge of a “universal layer”
    47:34 - Single-tenant clusters for enterprises and why it’s hard
    50:27 - Where the company is now, hiring, and how to try it (disk.new)

    55 min
  • Building search for AI systems with Chroma CTO Hammad Bashir
    Dec 18 2025

    Hammad Bashir, CTO of Chroma, joins the show to break down how modern vector search systems are actually built, from local, embedded databases to massively distributed, object-storage-backed architectures. We dig into Chroma’s shared local-to-cloud API, log-structured storage on object stores, hybrid search, and why retrieval-augmented generation (RAG) isn’t going anywhere.

    Follow Hammad:
    Twitter/X: https://twitter.com/HammadTime
    LinkedIn: https://www.linkedin.com/in/hbashir
    Chroma: https://trychroma.com

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 – Introduction: From high-school ASICs to CTO of Chroma
    01:04 – Hammad’s background and why vector search stuck
    03:01 – Why Chroma has one API for local and distributed systems
    05:37 – Local experimentation vs production AI workflows
    08:03 – What “unprincipled data” means in machine learning
    10:31 – From computer vision to retrieval for LLMs
    13:00 – Exploratory data analysis and why looking at data still matters
    16:38 – Promoting data from local to Chroma Cloud
    19:26 – Why Chroma is built on object storage
    20:27 – Write-ahead logs, batching, and durability
    26:56 – Compaction, inverted indexes, and storage layout
    29:26 – Strong consistency and reading from the log
    34:12 – How queries are routed and executed
    37:00 – Hybrid search: vectors, full-text, and metadata
    41:03 – Chunking, embeddings, and retrieval boundaries
    43:22 – Agentic search and letting models drive retrieval
    45:01 – Is RAG dead? A grounded explanation
    48:24 – Why context windows don’t replace search
    56:20 – Context rot and why retrieval reduces confusion
    01:00:19 – Faster models and the future of search stacks
    01:02:25 – Who Chroma is for and when it’s a great fit
    01:04:25 – Hiring, team culture, and where to follow Chroma

    1 hr 7 min
  • Scaling DuckDB in the cloud with MotherDuck CEO Jordan Tigani
    Dec 11 2025

    In this episode of Database School, Aaron Francis sits down with Jordan Tigani, co-founder and CEO of MotherDuck, to break down what DuckDB is, how MotherDuck hosts it in the cloud, and why analytics workloads are shifting toward embedded databases. They dig into DuckLake, pricing models, scaling strategies, and what it really takes to build a modern cloud data warehouse.

    Follow Jordan:
    Twitter/X: https://twitter.com/jrdntgn
    LinkedIn: https://www.linkedin.com/in/jordantigani
    MotherDuck: https://motherduck.com

    Follow Aaron:
    Twitter/X: https://twitter.com/aarondfrancis
    Database School: https://databaseschool.com
    Database School YouTube Channel: https://www.youtube.com/@UCT3XN4RtcFhmrWl8tf_o49g (Subscribe today)
    LinkedIn: https://www.linkedin.com/in/aarondfrancis
    Website: https://aaronfrancis.com - find articles, podcasts, courses, and more.

    Chapters:
    00:00 - Introduction
    01:44 - What DuckDB is and why embedded analytics matter
    04:03 - How MotherDuck hosts DuckDB in the cloud
    05:18 - Is MotherDuck like the “Turso for DuckDB”?
    07:38 - Isolated analytics per user and scaling to zero
    08:51 - The academic origins of DuckDB
    10:00 - From SingleStore to founding MotherDuck
    12:28 - Getting fired… and funded 12 days later
    16:39 - Jordan’s background: Kernel dev, BigQuery, and Product
    18:36 - Partnering with DuckDB Labs and avoiding a fork
    20:52 - Why MotherDuck targets startups and the long tail
    24:22 - Pricing lessons: why $25 was too cheap
    28:11 - Ducklings, instance sizing, and compute scaling
    34:16 - How MotherDuck separates compute and storage
    37:09 - Inside the AWS architecture and differential storage
    43:12 - Hybrid execution: joining local and cloud data
    45:14 - Analytics vs warehouses vs operational databases
    47:41 - Data lakes, Iceberg, and what DuckLake actually is
    53:22 - When DuckLake makes more sense than DuckDB alone
    56:09 - Who switches to MotherDuck and why
    58:02 - pg_duckdb and offloading analytics from Postgres
    1:00:49 - Who should use MotherDuck and why
    1:03:39 - Hiring plans and where to follow Jordan
    1:05:01 - Wrap-up

    1 hr 5 min