The Memriq AI Inference Brief – Engineering Edition

By: Keith Bourne

About this title

The Memriq AI Inference Brief – Engineering Edition is a weekly deep dive into the technical guts of modern AI systems: retrieval-augmented generation (RAG), vector databases, knowledge graphs, agents, memory systems, and more. A rotating panel of AI engineers and data scientists breaks down architectures, frameworks, and patterns from real-world projects so you can ship more intelligent systems, faster.

Copyright 2025 Memriq AI
  • Evaluating Agentic AI: DeepEval, RAGAS & TruLens Frameworks Compared
    Jan 5 2026

    In this episode of Memriq Inference Digest - Engineering Edition, we explore the cutting-edge evaluation frameworks designed for agentic AI systems. Dive into the strengths and trade-offs of DeepEval, RAGAS, and TruLens as we unpack how they address multi-step agent evaluation challenges, production readiness, and integration with popular AI toolkits.

    In this episode:

    - Compare DeepEval’s extensive agent-specific metrics and pytest-native integration for development testing (a minimal pytest sketch follows this list)

    - Understand RAGAS’s knowledge graph-powered synthetic test generation that slashes test creation time by 90%

    - Discover TruLens’s production-grade observability with hallucination detection via the RAG Triad framework

    - Discuss hybrid evaluation strategies combining these frameworks across the AI lifecycle

    - Learn about real-world deployments in fintech, e-commerce, and enterprise conversational AI

    - Hear expert insights from Keith Bourne on calibration and industry trends
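
    To make the pytest-native workflow concrete, here is a minimal sketch of a DeepEval check written as an ordinary pytest test. The class and metric names (LLMTestCase, AnswerRelevancyMetric, assert_test) follow DeepEval's published quickstart but may differ between releases, and the example strings are placeholders, so treat this as an illustration rather than a drop-in test.

    ```python
    # Minimal sketch: one DeepEval metric run as a pytest test.
    # Assumes deepeval is installed and an API key for the judge LLM
    # (e.g. OPENAI_API_KEY) is configured; names follow DeepEval's
    # quickstart and may vary across versions.
    from deepeval import assert_test
    from deepeval.metrics import AnswerRelevancyMetric
    from deepeval.test_case import LLMTestCase


    def test_agent_answer_relevancy():
        # One evaluation case: the user input, the agent's actual output,
        # and the retrieved context the answer should be grounded in.
        test_case = LLMTestCase(
            input="What is the refund window for annual plans?",
            actual_output="Annual plans can be refunded within 30 days of purchase.",
            retrieval_context=[
                "Refunds are available within 30 days for annual subscriptions."
            ],
        )
        # Fails the test if the LLM-as-judge relevancy score falls below 0.7.
        metric = AnswerRelevancyMetric(threshold=0.7)
        assert_test(test_case, [metric])
    ```

    Because the check is just a test function, it can run in CI on every change, which is the development-time role the episode assigns to DeepEval.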

    Key tools & technologies mentioned:

    DeepEval, RAGAS, TruLens, LangChain, LlamaIndex, LangGraph, OpenTelemetry, Snowflake, Datadog, Cortex AI, DeepTeam

    Timestamps:

    00:00 - Introduction to agentic AI evaluation frameworks

    03:00 - Key metrics and evaluation challenges

    06:30 - Framework architectures and integration

    10:00 - Head-to-head comparison and use cases

    14:00 - Deep technical overview of each framework

    17:30 - Real-world deployments and best practices

    19:30 - Open problems and future directions

    Resources:

    1. "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition
    2. This podcast is brought to you by Memriq.ai - AI consultancy and content studio building tools and resources for AI practitioners.

    20 min.
  • Model Context Protocol: The Universal AI Integration Standard Explained
    Dec 15 2025

    Discover how the Model Context Protocol (MCP) is revolutionizing AI systems integration by simplifying complex multi-tool interactions into a scalable, open standard. In this episode, we unpack MCP’s architecture, adoption by industry leaders, and its impact on engineering workflows.

    In this episode:

    - What MCP is and why it matters for AI/ML engineers and infrastructure teams

    - The M×N integration problem and how MCP reduces it to M+N

    - Core primitives: Tools, Resources, and Prompts, and their roles in MCP (a minimal server sketch follows this list)

    - Technical deep dive into JSON-RPC 2.0 messaging, transports, and security with OAuth 2.1 + PKCE

    - Comparison of MCP with OpenAI Function Calling, LangChain, and custom REST APIs

    - Real-world adoption, performance metrics, and engineering trade-offs

    - Open challenges including security, authentication, and operational complexity
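
    To ground the three primitives (and the M+N claim: ten clients and twenty tools need 200 point-to-point integrations, but only 10 + 20 = 30 MCP-speaking pieces), here is a minimal sketch of an MCP server using the FastMCP helper from the official Python SDK. The tool, resource, and prompt bodies are placeholders, and the decorator names follow the SDK's quickstart, so check the docs for the exact API of the version you install.

    ```python
    # Minimal sketch of an MCP server exposing one Tool, one Resource,
    # and one Prompt via FastMCP from the official Python SDK.
    # Function bodies are illustrative placeholders.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("docs-helper")


    @mcp.tool()
    def word_count(text: str) -> int:
        """Tool: an action the model can invoke with arguments."""
        return len(text.split())


    @mcp.resource("notes://{topic}")
    def get_notes(topic: str) -> str:
        """Resource: read-only context the client pulls in by URI."""
        return f"Placeholder notes about {topic}."


    @mcp.prompt()
    def summarize(topic: str) -> str:
        """Prompt: a reusable template the client can surface to users."""
        return f"Summarize everything we know about {topic} in three bullets."


    if __name__ == "__main__":
        # Defaults to the stdio transport; the same server can also be
        # exposed over HTTP for remote clients.
        mcp.run()
    ```

    On the wire, every call is a JSON-RPC 2.0 message, for example a tools/call request carrying the tool name and its arguments, which is what lets the same server sit behind different transports.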

    Key tools & technologies mentioned:

    - Model Context Protocol (MCP)

    - JSON-RPC 2.0

    - OAuth 2.1 with PKCE

    - FastMCP Python SDK, MCP TypeScript SDK

    - agentgateway by Solo.io

    - OpenAI Function Calling

    - LangChain

    Timestamps:

    00:00 - Introduction to MCP and episode overview

    02:30 - The M×N integration problem and MCP’s solution

    05:15 - Why MCP adoption is accelerating

    07:00 - MCP architecture and core primitives explained

    10:00 - Head-to-head comparison with alternatives

    12:30 - Under the hood: protocol mechanics and transports

    15:00 - Real-world impact and usage metrics

    17:30 - Challenges and security considerations

    19:00 - Closing thoughts and future outlook

    Resources:

    • "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition
    • This podcast is brought to you by Memriq.ai - AI consultancy and content studio building tools and resources for AI practitioners.

    20 min.
  • RAG Evaluation with ragas: Reference-Free Metrics & Monitoring
    Dec 14 2025

    Unlock the secrets to evaluating Retrieval-Augmented Generation (RAG) pipelines effectively and efficiently with ragas, the open-source framework that’s transforming AI quality assurance. In this episode, we explore how to implement reference-free evaluation, integrate continuous monitoring into your AI workflows, and optimize for production scale, all through the lens of Chapter 9 of Keith Bourne’s book.

    In this episode:

    - Overview of ragas and its reference-free metrics that achieve 95% human agreement on faithfulness scoring

    - Implementation patterns and code walkthroughs for integrating ragas with LangChain, LlamaIndex, and CI/CD pipelines (a minimal evaluation call is sketched after this list)

    - Production monitoring architecture: sampling, async evaluation, aggregation, and alerting

    - Comparison of ragas with other evaluation frameworks like DeepEval and TruLens

    - Strategies for cost optimization and asynchronous evaluation at scale

    - Advanced features: custom domain-specific metrics with AspectCritic and multi-turn evaluation support
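
    As a reference point for the code walkthroughs mentioned above, here is a minimal sketch of a reference-free ragas evaluation over a single sample. The column names and metric imports follow the older Dataset-based ragas API that most tutorials use; newer releases restructure this around EvaluationDataset and SingleTurnSample, and an LLM judge (e.g. an OpenAI key) must be configured, so treat it as illustrative rather than version-exact.

    ```python
    # Minimal sketch: reference-free ragas evaluation of one RAG sample.
    # Uses the classic Dataset-based API; newer ragas versions expose an
    # EvaluationDataset / SingleTurnSample interface instead. Requires an
    # LLM-as-judge provider (e.g. OPENAI_API_KEY) to be configured.
    from datasets import Dataset
    from ragas import evaluate
    from ragas.metrics import answer_relevancy, faithfulness

    rows = {
        "question": ["What does the warranty cover?"],
        "answer": ["The warranty covers manufacturing defects for two years."],
        "contexts": [[
            "Our warranty covers manufacturing defects for a period of 24 months."
        ]],
    }

    dataset = Dataset.from_dict(rows)

    # Faithfulness and answer relevancy need no human-written ground truth,
    # which is what makes the evaluation reference-free.
    result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
    print(result)
    ```

    Run against a fixed regression set in CI, the same call becomes a quality gate: fail the build when faithfulness or relevancy drops below an agreed threshold.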

    Key tools and technologies mentioned:

    - ragas (Retrieval Augmented Generation Assessment System)

    - LangChain, LlamaIndex

    - LangSmith, LangFuse (observability and evaluation tools)

    - OpenAI GPT-4o, GPT-3.5-turbo, Anthropic Claude, Google Gemini, Ollama

    - Python datasets library

    Timestamps:

    00:00 - Introduction and overview with Keith Bourne

    03:00 - Why reference-free evaluation matters and ragas’s approach

    06:30 - Core metrics: faithfulness, answer relevancy, context precision & recall

    09:00 - Code walkthrough: installation, dataset structure, evaluation calls

    12:00 - Integrations with LangChain, LlamaIndex, and CI/CD workflows

    14:30 - Production monitoring architecture and cost considerations

    17:00 - Advanced metrics and custom domain-specific evaluations

    19:00 - Common pitfalls and testing strategies

    20:30 - Closing thoughts and next steps

    Resources:

    - "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition

    - Memriq AI: https://Memriq.ai

    - ragas website: https://www.ragas.io/

    - ragas GitHub repository: https://github.com/vibrantlabsai/ragas (for direct access to code and docs)

    Tune in to build more reliable, scalable, and maintainable RAG systems with confidence using open-source evaluation best practices.

    27 min.
No reviews yet