
Retrieval-Augmented Code Generation: Grounding Agents in Your Codebase

Class Duration

14 hours of live training delivered over 2-3 days to accommodate your scheduling needs.

Student Prerequisites

  • Professional software development experience in Python or TypeScript
  • Basic familiarity with vector search concepts is helpful but not required

Target Audience

Software engineers and ML engineers building internal AI coding tools, developer assistants, or code search systems on top of large private codebases. Also relevant for teams extending GitHub Copilot or Claude Code with organization-specific context via MCP or RAG pipelines.

Description

Off-the-shelf AI coding assistants know open-source code but not your private codebase. This course covers the architectures and techniques for grounding AI agents in proprietary code: code-specific embedding models, AST-aware chunking strategies, hybrid semantic/keyword search, reranking, retrieval pipeline design, evaluation, incremental indexing for evolving repositories, and end-to-end integration with AI coding agents. Labs build a working code retrieval pipeline against a realistic multi-language monorepo, evaluate it against a hand-built golden set, surface it as an MCP tool to an AI coding agent, and wire up incremental re-indexing in CI. The course is the code-specific complement to Production RAG Systems — it goes deep on the issues unique to grounding agents in code rather than documents.

Learning Outcomes

  • Explain the RAG architecture applied to code retrieval (chunking, embedding, indexing, retrieval, reranking, augmentation) and where it differs from document RAG.
  • Select and apply code-aware embedding models and AST-aware chunking strategies (tree-sitter, function-level, contextual).
  • Build a vector index for a codebase using at least one vector database (pgvector, Qdrant, or equivalent).
  • Implement hybrid search combining dense embedding retrieval with BM25 keyword search and tune Reciprocal Rank Fusion weights.
  • Apply a reranking model to improve retrieval precision for code queries.
  • Build a hand-curated golden dataset and evaluate retrieval quality using Recall@K, Precision@K, MRR, and nDCG.
  • Diagnose retrieval failure modes and apply targeted fixes (chunk boundaries, query rewriting, metadata filters).
  • Operate a retrieval pipeline at production quality: incremental re-indexing in CI, embedding model migrations, and observability.
  • Expose a code retrieval pipeline as an MCP tool for Claude Code or GitHub Copilot and validate the end-to-end agent-grounded workflow.

Training Materials

Comprehensive courseware is distributed online at the start of class. All students receive a downloadable MP4 recording of the training.

Software Requirements

Python 3.12+, Docker (for running local vector DBs — pgvector and Qdrant), Git, and the sample multi-language monorepo provided for labs. Embedding and rerank API access (Voyage, Cohere, or OpenAI) is provided for the duration of the course.

Training Topics

RAG for Code: Architecture Overview
  • Why plain context windows aren't enough for large codebases
  • RAG pipeline stages: chunk, embed, index, retrieve, rerank, augment (skeleton sketched after this list)
  • Code-specific challenges vs. document RAG: symbols, cross-file references, naming, and code drift
  • Reference architectures and where code-RAG fits in agent workflows
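
To fix the shape of the pipeline before going stage by stage, here is a minimal skeleton of the flow the course builds out. Every helper name in it is a hypothetical placeholder; the labs replace each one with a real implementation.

```python
# Minimal skeleton of the code-RAG stages. All helpers are hypothetical
# placeholders that the labs fill in with real implementations.
def answer_with_code_rag(question: str, repo_path: str) -> str:
    chunks = chunk_repository(repo_path)           # AST-aware chunking
    index = build_index(embed_chunks(chunks))      # embed + vector index
    candidates = retrieve(index, question, k=50)   # hybrid dense + BM25
    top = rerank(question, candidates, top_n=10)   # cross-encoder rerank
    prompt = augment(question, top)                # context packaging
    return generate(prompt)                        # hand off to the agent/LLM
```
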
Code Chunking Strategies
  • Naive chunking: line-based and token-based, and why they fail for code
  • AST-aware chunking with tree-sitter across multiple languages (starter sketch after this list)
  • Function, class, and module-level chunks
  • Late chunking and Contextual Retrieval applied to code (per-chunk context summaries for cross-file references)
  • Cross-file dependency context preservation: imports, callers, type references
  • Hands-on lab: implement two chunking strategies on the sample monorepo and compare retrieval quality
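
As a starting point for the lab, the sketch below shows function- and class-level chunking for Python files, assuming the py-tree-sitter bindings (tree_sitter >= 0.22) and the tree_sitter_python grammar package. The lab version adds grammars for the other monorepo languages and handles nested and decorated definitions.

```python
import tree_sitter_python
from tree_sitter import Language, Parser

parser = Parser(Language(tree_sitter_python.language()))

def function_chunks(source: bytes, path: str) -> list[dict]:
    """Split one Python file into top-level function/class chunks."""
    tree = parser.parse(source)
    chunks = []
    for node in tree.root_node.children:
        # Note: decorated defs appear as `decorated_definition` nodes and
        # need one extra level of unwrapping (omitted here).
        if node.type in ("function_definition", "class_definition"):
            chunks.append({
                "path": path,  # kept as metadata for filtered retrieval
                "symbol": node.child_by_field_name("name").text.decode(),
                "text": source[node.start_byte:node.end_byte].decode(),
                "start_line": node.start_point[0] + 1,
            })
    return chunks
```
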
Embedding Models for Code
  • General-purpose vs. code-specific embedding models
  • Voyage Code, OpenAI text-embedding-3, Cohere Embed, and current open-source options
  • Multi-language embedding considerations
  • Embedding quality evaluation and how to choose
  • Hands-on lab: benchmark two embedding models against a small golden set (starter sketch below)
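
A starter for that lab, assuming the openai package and a golden set where each query has one known-relevant chunk; swapping in Voyage or Cohere changes only the embed() call.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str], model: str = "text-embedding-3-small") -> np.ndarray:
    resp = client.embeddings.create(model=model, input=texts)
    vecs = np.array([d.embedding for d in resp.data])
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # unit vectors

def recall_at_5(queries: list[str], chunks: list[str], golden: list[int]) -> float:
    """golden[i] is the index in `chunks` of the right answer for queries[i]."""
    sims = embed(queries) @ embed(chunks).T   # cosine similarity matrix
    top5 = np.argsort(-sims, axis=1)[:, :5]   # best 5 chunk ids per query
    return sum(g in row for g, row in zip(golden, top5)) / len(queries)
```
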
Vector Indexing
  • pgvector, Qdrant, Chroma, and Pinecone — concrete tradeoffs
  • HNSW index configuration, ef_search/ef_construction tuning, and recall tradeoffs (pgvector example below)
  • Metadata fields for code: file path, language, symbol, version, last-modified
  • Hands-on lab: stand up pgvector and Qdrant indexes for the same dataset and compare
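
For the pgvector half of that lab, the wiring looks roughly like this, assuming psycopg 3, Postgres with the pgvector extension, and our own table and column names:

```python
import psycopg

def setup(conn: psycopg.Connection) -> None:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS code_chunks (
            id bigserial PRIMARY KEY,
            path text, language text, symbol text, version text,
            embedding vector(1536))""")
    # m and ef_construction trade index build time and size against recall
    conn.execute("""
        CREATE INDEX IF NOT EXISTS code_chunks_hnsw
        ON code_chunks USING hnsw (embedding vector_cosine_ops)
        WITH (m = 16, ef_construction = 64)""")

def search(conn: psycopg.Connection, query_vec: list[float], language: str):
    conn.execute("SET hnsw.ef_search = 100")  # higher = better recall, slower
    return conn.execute(
        "SELECT path, symbol FROM code_chunks "
        "WHERE language = %s ORDER BY embedding <=> %s::vector LIMIT 10",
        (language, str(query_vec)),  # pgvector accepts '[x, y, ...]' literals
    ).fetchall()
```
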
Hybrid Search
  • BM25 keyword search alongside dense retrieval
  • Reciprocal Rank Fusion (RRF) for result merging and weight tuning (reference implementation below)
  • When hybrid search outperforms dense-only — and when it doesn't
  • Query rewriting and symbol-aware query expansion
  • Hands-on lab: enable hybrid search and tune RRF weights against the golden set
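
RRF itself is only a few lines; the interesting part of the lab is tuning the per-retriever weights and the damping constant k (60 is the commonly used default):

```python
def rrf_merge(result_lists: list[list[str]], weights: list[float],
              k: int = 60) -> list[str]:
    """Merge ranked id lists (best first) into one fused ranking."""
    scores: dict[str, float] = {}
    for results, weight in zip(result_lists, weights):
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. rrf_merge([dense_ids, bm25_ids], weights=[1.0, 0.7])
```
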
Reranking
  • Cross-encoder rerankers for code retrieval (local baseline sketched below)
  • Cohere Rerank, Voyage Rerank, and local alternatives
  • Latency vs. quality tradeoffs and when reranking pays for itself
  • Hands-on lab: add a reranker and measure quality and latency impact
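
A local baseline for that lab, assuming the sentence-transformers package and a general-purpose MS MARCO cross-encoder; the hosted Cohere and Voyage rerankers have the same shape: score query-candidate pairs, keep the top_n.

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # general-purpose

def rerank(query: str, candidates: list[dict], top_n: int = 10) -> list[dict]:
    """Re-score retrieval candidates with a cross-encoder and keep the best."""
    scores = model.predict([(query, c["text"]) for c in candidates])
    ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
    return [c for _, c in ranked[:top_n]]
```
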
Retrieval Pipeline Evaluation
  • Recall@K, Precision@K, MRR, and nDCG — what each measures and when to use it (reference implementations below)
  • Building a golden dataset for code retrieval evals (queries, expected files, expected symbols)
  • LLM-assisted golden-set construction and validation
  • Failure-mode analysis: what the retriever misses and why
  • Regression testing for retrieval quality across changes
  • Hands-on lab: build a golden set and run a full eval suite
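
Reference implementations of the lab's ranking metrics over binary relevance judgments, where each golden query maps to a set of relevant chunk ids:

```python
import math

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    return len(set(retrieved[:k]) & relevant) / len(relevant)

def mrr(retrieved: list[str], relevant: set[str]) -> float:
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank          # reciprocal rank of the first hit
    return 0.0

def ndcg_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    dcg = sum(1.0 / math.log2(rank + 1)     # binary gains, log-discounted
              for rank, doc_id in enumerate(retrieved[:k], start=1)
              if doc_id in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0
```
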
Production Operations
  • Incremental indexing for evolving codebases (commit-level, branch-aware), sketched below
  • Re-indexing in CI: cost, latency, and freshness tradeoffs
  • Embedding model migrations and dual-index strategies
  • Observability: query latency, hit rate, retrieval-to-generation correlation
  • Cost controls: caching, embedding budgets, and tier-based retrieval
  • Hands-on lab: wire up incremental re-indexing on commit
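
The core of that lab is a diff-driven work list. A sketch assuming git is available in the CI step; rename handling and the downstream upsert/delete calls are left to the lab:

```python
import subprocess

def changed_paths(base: str, head: str) -> tuple[list[str], list[str]]:
    """Return (paths to re-chunk/re-embed, paths whose chunks to delete)."""
    out = subprocess.run(
        ["git", "diff", "--name-status", base, head],
        capture_output=True, text=True, check=True,
    ).stdout
    upsert, delete = [], []
    for line in out.splitlines():
        status, _, path = line.partition("\t")
        # Renames ("R...") carry two tab-separated paths; handle separately.
        (delete if status == "D" else upsert).append(path)
    return upsert, delete

# CI step: re-embed `upsert` files and drop index rows for `delete` files,
# keyed on the path metadata field; keep a scheduled full re-index as the
# fallback for chunker or embedding-model changes.
```
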
Exposing Retrieval as an AI Tool
  • MCP tool wrapping a retrieval pipeline (minimal example below)
  • Context packaging: file path, symbol, snippet metadata, and ranking signals
  • Truncation, chunk merging, and budget-aware result selection
  • Integrating with Claude Code and GitHub Copilot
  • Validating end-to-end: does grounded retrieval improve agent task performance?
  • Hands-on lab: ship the retrieval pipeline as an MCP tool and use it from Claude Code
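
A minimal version of that tool using FastMCP from the official mcp Python SDK; search_pipeline is a hypothetical stand-in for the hybrid-retrieve-then-rerank pipeline built in the earlier labs.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("codebase-search")

@mcp.tool()
def search_code(query: str, language: str | None = None, k: int = 10) -> str:
    """Search the private monorepo for code relevant to a natural-language query."""
    results = search_pipeline(query, language=language, k=k)  # hypothetical helper
    # Package results so the agent can cite them: path, symbol, then snippet.
    return "\n\n".join(f"{r['path']}::{r['symbol']}\n{r['text']}" for r in results)

if __name__ == "__main__":
    mcp.run()  # stdio transport; register with `claude mcp add` for Claude Code
```
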
Capstone Workshop
  • Build, evaluate, and integrate a complete code retrieval pipeline against the sample monorepo
  • Present results: chosen architecture, eval scores, and end-to-end agent demo
  • Group critique and Q&A
  • Recommended follow-on path: Production RAG Systems for general document RAG depth, Building Custom Tools and Skills for richer MCP tool design