Jeffrey Hicks

Platform Eng @R360

Chat with YouTube Videos - RAG over Channel Transcripts

Award-winning RAG system for conversational video analysis with transcript processing, speaker extraction, and Smithery MCP integration

By krankos • Aug 12, 2025
krankos/chat-ytchannel
Public repository
TypeScript

This Chat with YouTube Videos project represents a standout entry from the Mastra Templates Hackathon, demonstrating sophisticated RAG (Retrieval-Augmented Generation) techniques for conversational video analysis.

Hackathon Recognition

Best Use of Tool Provider 🏆 - Judged by Arcade, recognizing exceptional tool integration and provider utilization.

Why This Project Won

Real Need Solution: Judges emphasized this “solves a real need for long-form content” - enabling users to extract insights from lengthy videos without watching them entirely

Clean RAG Pattern: Praised for demonstrating “clean pattern combining RAG + MCP tools” that other developers can easily understand and adapt

Efficiency Features: Smart processing that “checks if videos are already processed” to avoid duplicate work and unnecessary API costs

Easy Retargeting: Judges noted it’s “easy to retarget to any channel ID” making it a flexible template for various use cases

Technical Architecture

RAG Pipeline Implementation

The system demonstrates a sophisticated content processing pipeline:

Video Download: Automated fetching of YouTube video content using Smithery MCP integration

Transcript Generation: Deepgram transcription with speaker identification and timing

Content Chunking: Intelligent segmentation of transcripts with topic and speaker boundaries

Vector Embedding: PostgreSQL with pgvector for semantic similarity search

Conversational Interface: Natural language queries over processed video content
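The chunking step above can be illustrated with a minimal sketch. It assumes the transcript arrives as a list of diarized utterances; the Utterance and Chunk shapes, the chunkBySpeaker name, and the character limit are all hypothetical and not taken from the repository. It merges consecutive utterances from the same speaker and starts a new chunk whenever the speaker changes or the size limit would be exceeded. Topic boundaries are not modeled here.

```typescript
// Hypothetical shape of one diarized utterance; field names are assumptions.
interface Utterance {
  speaker: string;
  start: number; // seconds from the start of the video
  end: number; // seconds from the start of the video
  text: string;
}

// A chunk keeps the speaker and time range so answers can cite where they came from.
interface Chunk {
  speaker: string;
  start: number;
  end: number;
  text: string;
}

// Merge consecutive utterances from the same speaker until maxChars would be exceeded.
function chunkBySpeaker(utterances: Utterance[], maxChars: number): Chunk[] {
  const chunks: Chunk[] = [];
  for (const u of utterances) {
    const last = chunks.length > 0 ? chunks[chunks.length - 1] : undefined;
    if (
      last !== undefined &&
      last.speaker === u.speaker &&
      last.text.length + 1 + u.text.length <= maxChars
    ) {
      // Same speaker and still within the limit: extend the current chunk.
      last.text += " " + u.text;
      last.end = u.end;
    } else {
      // Speaker changed or the chunk is full: start a new chunk.
      chunks.push({ speaker: u.speaker, start: u.start, end: u.end, text: u.text });
    }
  }
  return chunks;
}

const demoChunks = chunkBySpeaker(
  [
    { speaker: "A", start: 0, end: 2, text: "Welcome to the stream." },
    { speaker: "A", start: 2, end: 5, text: "Today we cover templates." },
    { speaker: "B", start: 5, end: 7, text: "Thanks for having me." },
  ],
  200,
);
console.log(demoChunks);
```

Keeping the speaker and timestamps on each chunk is one way to support speaker-aware answers; the project itself may store this metadata differently.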

Demonstrated Capabilities

The judges observed Khalil’s live demo showing:

Pre-processing Intelligence: Verifies whether a video has already been processed to avoid duplicate work and API costs

Audio Processing Pipeline: Downloads the video and sends it to Deepgram for high-quality transcription

Data Extraction: Pulls speakers, topics, and metadata from transcript content

RAG Implementation: Chunks and embeds content for semantic vector search

Keyword Enhancement: Improves transcription accuracy with domain-relevant keywords

Real-World Query Success: Successfully answered “tell me about the Mastra template hackathon” using their own video content

Multi-Source Results: Retrieved both the kickoff stream and the templates workshop videos in response
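The "already processed" check can be sketched as a simple filter that runs before any download or transcription call. This is an illustration only: the ProcessedVideo shape and the selectUnprocessed name are hypothetical, and the real project presumably looks up processed videos in its PostgreSQL database rather than in an in-memory list.

```typescript
// Hypothetical record of a video that has already been transcribed and embedded.
interface ProcessedVideo {
  videoId: string;
  processedAt: string; // ISO date string
}

// Return only the video IDs that still need work, preserving order and
// dropping duplicates within the incoming list.
function selectUnprocessed(
  channelVideoIds: string[],
  processed: ProcessedVideo[],
): string[] {
  const done = new Set(processed.map((p) => p.videoId));
  const seen = new Set<string>();
  const todo: string[] = [];
  for (const id of channelVideoIds) {
    if (done.has(id) || seen.has(id)) continue; // skip paid-for or repeated work
    seen.add(id);
    todo.push(id);
  }
  return todo;
}

const pendingIds = selectUnprocessed(
  ["vid-a", "vid-b", "vid-b", "vid-c"],
  [{ videoId: "vid-a", processedAt: "2025-08-12" }],
);
console.log(pendingIds);
```

Running the filter first means transcription and embedding costs are only incurred for videos that are new to the store.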

Technical Stack

Core Technologies

Database: PostgreSQL with pgvector extension for vector similarity search

Transcription: Deepgram API for high-quality speech-to-text with speaker diarization

AI Integration: OpenAI for embeddings and conversational responses

Tool Provider: Smithery MCP for YouTube API integration

Framework: Mastra for workflow orchestration and tool coordination
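To make the pgvector part of the stack concrete, here is a minimal sketch of how a similarity query could be assembled. The table name transcript_chunks and its columns are assumptions, not the project's actual schema. The one pgvector-specific detail relied on is the `<=>` operator, which computes cosine distance, so ordering by it ascending returns the closest chunks first. The sketch only builds the parameterized query; executing it with a PostgreSQL client is left out.

```typescript
// Format a numeric embedding as a pgvector text literal such as "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding must be a non-empty array of finite numbers");
  }
  return "[" + embedding.join(",") + "]";
}

interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// Build a parameterized nearest-neighbour query.
// Table and column names are hypothetical.
function buildSimilarityQuery(queryEmbedding: number[], topK: number): SqlQuery {
  return {
    text:
      "SELECT video_id, speaker, content, embedding <=> $1::vector AS distance " +
      "FROM transcript_chunks " +
      "ORDER BY embedding <=> $1::vector " +
      "LIMIT $2",
    values: [toVectorLiteral(queryEmbedding), topK],
  };
}

const similarityQuery = buildSimilarityQuery([1, 0.5, -2], 5);
console.log(similarityQuery.text);
console.log(similarityQuery.values);
```

Passing the vector as a bound parameter rather than concatenating it into the SQL text keeps the query safe to reuse with any embedding.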

MCP Integration Excellence

Smithery MCP Usage: Leverages Smithery’s YouTube connector for reliable video access

Tool Composition: Demonstrates how MCP tools can be composed with custom processing logic

Provider Abstraction: Shows how different tool providers can be integrated seamlessly

Judge Feedback from Demo

Shane Thomas (Co-founder)

Personal Interest: Shane revealed strong personal motivation for this type of tool:

“I try to build a YouTube transcript basically interact with an agent that will interact with YouTube videos, YouTube transcripts because we have this two-hour live stream every week and we have at this point you know 50 plus I would say almost we’re close to 100 hours of like video content… being able to have that is data and almost just like chat with the transcripts was a really cool idea that I wanted.”

Technical Quality: Praised the architecture balance:

“Overall not overly complex but very useful because it does actually show it’s a good use of RAG and a good use of MCP a good use of agents”

Future Vision: Expressed desire to implement this for their own content:

“Grand vision I would like to have you know chat with our transcripts of YouTube videos on the Mastra website someday I will build it or someone on the team if they beat me to it which they normally do will build it but that would be pretty cool.”

Meta Demo Appeal: Found the demonstration “so meta” - showing a past video of their stream within the current stream

Sharita (Co-host)

Efficiency Appreciation: Specifically loved the optimization features:

“It checks to see if the video has been already transcribed before. Like just a very thoughtful touch, being mindful of compute”

Smart Architecture: Recognized the intelligence in preventing duplicate processing work

Smithery MCP Recognition

Shane gave explicit recognition to the tool provider integration: “shout out Smithery sponsor” - acknowledging how the Smithery MCP enabled seamless YouTube functionality

Architectural Insights

RAG Best Practices

Content Preprocessing: Demonstrates importance of intelligent chunking and speaker tracking

Efficiency Optimization: Shows how to build cost-effective systems that avoid unnecessary processing

Tool Integration: Exemplifies clean patterns for combining multiple external services

Scalable Design: Architecture that can handle growing content libraries efficiently

MCP Tool Pattern

This project showcases how to effectively use Model Context Protocol (MCP) tools:

Provider Selection: Choosing the right tool provider (Smithery) for specific needs

Tool Composition: Combining MCP tools with custom processing logic

Data Flow: Managing data between external tools and internal processing systems
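The composition idea can be sketched without any real SDK. The Tool interface below is a deliberately minimal, hypothetical stand-in; actual MCP clients and Mastra tools expose richer interfaces, and listChannelVideos here is a fake that returns canned data instead of calling Smithery. The point is only the pattern: a provider tool's output feeds directly into custom processing logic through a typed pipe.

```typescript
// Hypothetical minimal tool shape; not the MCP or Mastra API.
interface Tool<I, O> {
  name: string;
  execute(input: I): Promise<O>;
}

// Compose two tools so the output of the first becomes the input of the second.
function pipeTools<A, B, C>(first: Tool<A, B>, second: Tool<B, C>): Tool<A, C> {
  return {
    name: `${first.name}->${second.name}`,
    execute: async (input: A): Promise<C> => {
      const intermediate = await first.execute(input);
      return second.execute(intermediate);
    },
  };
}

// Stand-in for a provider tool; a real one would call the external service.
const listChannelVideos: Tool<string, string[]> = {
  name: "listChannelVideos",
  execute: async (channelId) => [`${channelId}-video-1`, `${channelId}-video-2`],
};

// Custom processing logic wrapped in the same shape.
const countVideos: Tool<string[], number> = {
  name: "countVideos",
  execute: async (videoIds) => videoIds.length,
};

const channelVideoCount = pipeTools(listChannelVideos, countVideos);
channelVideoCount.execute("demo-channel").then((count) => console.log(count));
```

Because both sides share one interface, swapping the provider tool for a different implementation does not touch the custom step.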

Production Readiness

Error Handling: Robust processing that gracefully handles video access issues

Cost Management: Smart duplicate detection prevents unnecessary API usage

Scalability: Database design that supports growing video libraries

Extensibility: Clean architecture for adding new channels and features
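One common way to get this kind of graceful handling is to isolate failures per video, so a single private or removed video does not abort the whole run. The sketch below is an assumption about the general pattern, not the project's code: processBatch and BatchResult are hypothetical names, and the worker is synchronous for brevity where real download and transcription steps would be asynchronous.

```typescript
interface BatchResult<T> {
  succeeded: { videoId: string; value: T }[];
  failed: { videoId: string; reason: string }[];
}

// Run processOne for every video, recording failures instead of rethrowing them.
function processBatch<T>(
  videoIds: string[],
  processOne: (videoId: string) => T,
): BatchResult<T> {
  const result: BatchResult<T> = { succeeded: [], failed: [] };
  for (const videoId of videoIds) {
    try {
      result.succeeded.push({ videoId, value: processOne(videoId) });
    } catch (err) {
      // Keep the reason so the failure can be reported or retried later.
      const reason = err instanceof Error ? err.message : String(err);
      result.failed.push({ videoId, reason });
    }
  }
  return result;
}

const batchDemo = processBatch(["ok-1", "private", "ok-2"], (videoId) => {
  if (videoId === "private") throw new Error("video unavailable");
  return videoId.length; // placeholder for real processing output
});
console.log(batchDemo);
```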

This project demonstrates how RAG systems can move beyond simple document chat to sophisticated content analysis platforms that provide real value for users working with video content. The combination of smart processing, tool integration, and practical features makes it an excellent template for building production RAG applications.
