Updated June 2026

Generative AI and LLMs for Python Programmers

Class Duration

35 hours of live training delivered over 5 days.

Student Prerequisites

  • Experience with Python is required.
  • No generative AI experience is required.

Target Audience

Python programmers and software engineers who want a comprehensive, end-to-end understanding of Generative AI and Large Language Models - from Transformer fundamentals and prompt engineering to fine-tuning, RAG, agentic patterns, and responsible deployment.

Description

This comprehensive course is the broad survey that takes Python developers from zero generative AI experience to a working command of the whole field as it stands in 2026. It builds the foundations first: the evolution of text generation from N-grams and RNNs to the Transformer architecture, attention and positional encoding, and how text generation actually works. Those foundations are then put to work against today's frontier models (the Claude 5 and 4.x, GPT-5.x, and Gemini 3.x families) through the Anthropic, OpenAI, and Gemini Python SDKs: prompt and context engineering, structured outputs, tool calling, multimodal inputs, embeddings, and RAG. The course then goes under the hood with pre-training, fine-tuning and PEFT, alignment via RLHF, and model evaluation, closing each major theme with the practical question that matters: which technique to reach for, and when. The final day covers running open-weight models locally, evals and observability, cost engineering, and the ethical, privacy, and security concerns of responsible AI practice. Students who want to go deeper afterward have two natural follow-ons: LLM Application Development with Python for production application engineering, and Building AI Agents with Python and MCP for autonomous multi-step systems.

Learning Outcomes

  • Understand the fundamentals of generative AI and LLMs, including the evolution from N-grams and RNNs to the Transformer architecture, attention mechanisms, and positional encoding.
  • Gain practical skills in text generation, prompt engineering, context engineering, and generative configuration.
  • Call frontier models (Claude 5 and 4.x, GPT-5.x, and Gemini 3.x) from Python with the Anthropic, OpenAI, and Gemini SDKs, including streaming, structured outputs with Pydantic, tool calling, and multimodal inputs.
  • Build semantic search with embeddings, and implement RAG with chunking, vector stores, hybrid search, and reranking.
  • Explore pre-training, domain adaptation, fine-tuning, PEFT, and alignment with human values, and choose confidently between prompting, RAG, and fine-tuning for a given problem.
  • Explain agentic patterns and the Model Context Protocol (MCP), and recognize when a problem calls for an agent.
  • Evaluate generative AI applications with evals and observability tooling, control costs with prompt caching, and run open-weight models locally with Ollama and vLLM.
  • Understand the ethical considerations, biases, privacy, and security concerns in generative AI, and navigate the full project lifecycle with responsible AI practices.

Training Materials

Comprehensive courseware is distributed online at the start of class. All students receive a downloadable MP4 recording of the training.

Software Requirements

Students will need a free, personal GitHub account to access the courseware. Students will also need permission to install Docker Desktop, Python, Visual Studio Code, and Visual Studio Code extensions on their computers.

Training Topics

Introduction to Generative AI & LLMs

  • Overview of Generative AI
  • Introduction to Large Language Models (LLMs)
  • Historical Perspective on Text Generation
  • Use Cases and Tasks for LLMs
  • How LLMs fit into the 2026 software landscape

Text Generation before Transformers

  • N-grams and Statistical Language Models
  • Recurrent Neural Networks (RNNs)
  • Long Short-Term Memory (LSTM) Networks
  • Limitations of Pre-Transformer Models

Transformer Architecture

  • Introduction to Transformer Models
  • Attention Mechanism
  • Encoder-Decoder Architecture
  • Self-Attention and Multi-Head Attention
  • Positional Encoding

Tokens, Sampling, and Generative Configuration

  • Tokenization and context windows
  • Text generation techniques: greedy decoding to controlled sampling
  • Beam Search, Sampling, and Top-k/Top-p Sampling
  • Temperature and inference configurations
  • Model hyperparameters, training configurations, and fine-tuning configurations
  • Practical examples of text generation

Provider Setup and SDK Fundamentals

  • Anthropic, OpenAI, and Gemini Python SDK setup
  • API keys, environments, and request/response anatomy
  • Integrating LLMs into applications
  • Streaming responses
  • Provider SDKs and the move toward standardized message APIs
  • Error handling and rate limits

Frontier Models in 2026

  • Claude 5 and 4.x families (Anthropic): Fable, Opus, Sonnet, Haiku
  • GPT-5.x family (OpenAI), including GPT-5.5 and GPT-5.2-Codex
  • Gemini 3.x family (Google)
  • Reasoning and extended thinking modes
  • Choosing models by task: capability, latency, cost
  • Open-weight alternatives and where they fit

Prompting and Prompt Engineering

  • Introduction to Prompt Engineering
  • Designing Effective Prompts
  • System prompts, roles, and few-shot examples
  • Helping LLMs reason and plan with Chain-of-Thought
  • Techniques for Prompt Optimization
  • Examples and Best Practices

Context Engineering

  • From prompt engineering to context engineering
  • What belongs in the context window - and what doesn't
  • Managing context budgets across long interactions
  • Summarization and compaction strategies
  • Grounding with retrieved and injected content

Structured Outputs with Pydantic

  • Structured outputs (JSON Schema-constrained generation)
  • Pydantic models as output contracts
  • Validation and error recovery for malformed responses
  • Extraction, classification, and routing use cases
  • When structure helps - and when it hurts quality

Tool Use and Function Calling

  • Interacting with external applications via tool use / function calling
  • Defining tools: schemas, names, and descriptions
  • The tool-call round trip: invoke, execute, return
  • Sequential and parallel tool calls
  • Program-Aided Language Models (PAL)

Multimodal Inputs

  • Vision: images as model input
  • Document understanding: PDFs and rich documents
  • Audio inputs and transcription workflows
  • Multimodal prompting patterns
  • Cost and latency considerations for multimodal calls

Embeddings and Semantic Search

  • Vector embeddings and similarity search
  • Embedding models and choosing dimensions
  • Distance metrics and nearest-neighbor search
  • Semantic search vs. keyword search
  • Embeddings beyond search: clustering and classification

RAG Fundamentals

  • Retrieval-augmented generation: why and when
  • Chunking strategies and metadata
  • pgvector, Qdrant, Chroma, and other vector stores
  • Grounded prompting and citations
  • The naive-RAG baseline and its failure modes

RAG Quality: Hybrid Search, Reranking, and Evaluation

  • Hybrid search (BM25 + vector)
  • Reranking
  • RAG evaluation harnesses
  • Retrieval metrics: precision, recall, and groundedness
  • Iterating on retrieval quality systematically

Agentic Patterns and MCP

  • Agents vs. workflows
  • Tool use and the agent loop
  • ReAct: Combining Reasoning and Action
  • Model Context Protocol (MCP): standardized tool/server integrations
  • Multi-agent orchestration
  • Background agents and long-horizon tasks
  • Going deeper: Building AI Agents with Python and MCP

Generative AI Project Lifecycle

  • Project Planning and Scoping
  • Data Collection and Preprocessing
  • Model Selection and Training
  • Evaluation and Iteration
  • Deployment and Monitoring
  • LLM application architectures
  • Lifecycle cheat sheet: key steps, best practices, common pitfalls and solutions

Pre-training Large Language Models

  • Pre-training Objectives
  • Datasets for Pre-training
  • Computational Challenges
  • Scaling Laws and Compute-Optimal Models

Domain Adaptation, Fine-Tuning, and PEFT

  • Domain Adaptation Techniques
  • Instruction Fine-Tuning
  • Fine-Tuning on a Single Task
  • Multi-Task Instruction Fine-Tuning
  • Introduction to Parameter-Efficient Fine-Tuning (PEFT)
  • PEFT Techniques 1: LoRA (Low-Rank Adaptation)
  • PEFT Techniques 2: Soft Prompts

Aligning Models with Human Values

  • Introduction to Model Alignment
  • Reinforcement Learning from Human Feedback (RLHF)
  • Obtaining Feedback from Humans
  • Reward Model and Fine-Tuning with Reinforcement Learning
  • Addressing Reward Hacking
  • Scaling Human Feedback

Prompting vs. RAG vs. Fine-Tuning

  • A decision framework for adapting model behavior
  • When prompting and context engineering are enough
  • When retrieval beats fine-tuning - and vice versa
  • Combining approaches in one system
  • Cost, maintenance, and data requirements of each path

Model Optimization and Local Models

  • Model Compression Techniques
  • Quantization and Pruning
  • Optimizing Inference Performance
  • Deployment Strategies
  • Open-weight models and serving locally with Ollama and vLLM
  • Privacy, data residency, and restricted environments

Evals, Observability, and Cost Engineering

  • Evaluation Metrics for LLMs
  • Standard Benchmarks
  • Evaluating Model Performance
  • Application-level evals: golden datasets and LLM-as-judge
  • Evals and observability (Langfuse, Braintrust)
  • Cost engineering and prompt caching

Responsible AI and Keeping Current

  • Ethical Considerations in Generative AI
  • Bias and Fairness in LLMs
  • Privacy and Security Concerns
  • Prompt injection and emerging threat patterns
  • Developing Responsible AI Practices
  • Keeping current: tracking model releases and evaluating what matters
  • Where to go next: LLM Application Development with Python