Building Custom Tools and Skills for AI Coding Agents
Class Duration
7 hours of live training delivered over 1-2 days to accommodate your scheduling needs.
Student Prerequisites
- Professional software development experience in TypeScript or Python
- Familiarity with REST APIs and JSON Schema
Target Audience
Software engineers and platform engineers who want to extend AI coding agents—Claude Code, GitHub Copilot via MCP, or LLM-powered internal tools—with custom capabilities, safe execution sandboxes, and tool pipelines tailored to their organization's systems.
Description
This course covers the full lifecycle of building reliable custom tools for AI agents: from tool design and schema authoring through safe execution, error handling, and multi-tool composition. We treat tool-building as software engineering—with correctness, security, and maintainability as primary concerns—not just as LLM plumbing. Topics include function/tool calling mechanics across major model APIs, designing tool interfaces that models use reliably, safe execution sandboxes for code-running tools, and composition patterns for multi-step tool pipelines.
Learning Outcomes
- Describe how function/tool calling works in the Claude, OpenAI, and Gemini APIs.
- Design tool schemas (name, description, parameter JSON Schema) that models invoke accurately.
- Implement tool handlers with proper validation, error propagation, and structured responses.
- Build safe execution sandboxes for tools that run arbitrary code or execute system commands.
- Compose multi-tool pipelines with appropriate sequencing and state passing.
- Apply security best practices: sandboxing, input sanitization, capability scoping, and audit logging.
- Package and distribute custom tools as MCP servers or agent plugin modules.
Training Materials
Comprehensive courseware is distributed online at the start of class. All students receive a downloadable MP4 recording of the training.
Software Requirements
Python 3.12+ or Node.js 20+, API access to at least one frontier model, and Docker (for sandbox labs).
Training Topics
Function/Tool Calling Mechanics
- How tool calling works at the API level
- Tool call request and result round-trip
- Parallel tool calls and sequential dependencies
- Differences across Claude, OpenAI, and Gemini APIs
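To make the round-trip concrete, here is a sketch using Claude-style message shapes (the dict layouts follow the Anthropic Messages API; OpenAI and Gemini use different field names but the same request/result cycle). The `get_weather` tool and its handler are illustrative, not part of any API.

```python
# 1. The model's response asks to call a tool (Claude-style tool_use block):
assistant_turn = {
    "role": "assistant",
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_01A",          # provider-generated call id
            "name": "get_weather",
            "input": {"city": "Berlin"},
        }
    ],
}

# 2. The application runs its handler for that tool...
def get_weather(city: str) -> str:
    # hypothetical handler; a real one would query a weather service
    return f"18C and cloudy in {city}"

# 3. ...and sends back a tool_result keyed to the same call id:
call = assistant_turn["content"][0]
tool_turn = {
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": call["id"],
            "content": get_weather(**call["input"]),
        }
    ],
}
```

The essential invariant across all three providers is the id linkage: every tool result must reference the call that requested it, which is also what makes parallel tool calls unambiguous.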
Tool Schema Design
- Name and description as model-facing interface
- JSON Schema for parameters: types, enums, required fields
- Writing descriptions that models use correctly
- Anti-patterns: over-broad schemas, ambiguous names
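As one possible illustration of these guidelines, the schema below (for a hypothetical `search_orders` tool, in Claude's `input_schema` layout) uses a specific name, a description that says both when to use the tool and when not to, and typed parameters with enums, bounds, and an explicit required list:

```python
# A tool schema following the design guidance above. The tool itself
# ("search_orders") and its fields are illustrative.
SEARCH_ORDERS_SCHEMA = {
    "name": "search_orders",
    "description": (
        "Search existing customer orders by status and date range. "
        "Use this when the user asks about order history; do not use it "
        "to create or modify orders."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "status": {
                "type": "string",
                "enum": ["pending", "shipped", "delivered", "cancelled"],
                "description": "Order status to filter by.",
            },
            "since": {
                "type": "string",
                "description": "Earliest order date, ISO 8601 (YYYY-MM-DD).",
            },
            "limit": {
                "type": "integer",
                "minimum": 1,
                "maximum": 100,
                "description": "Maximum number of orders to return.",
            },
        },
        "required": ["status"],
    },
}
```

The enum and the required list do double duty: they constrain what the model can emit and they document intent, which is why over-broad schemas (e.g. a single free-text `query` string) tend to produce unreliable invocations.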
Tool Handler Implementation
- Validation and error propagation patterns
- Structured success and error responses
- Idempotency and retry safety
- Logging and observability for tool invocations
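A minimal dispatcher sketch, assuming a hypothetical registry mapping tool names to (handler, required-argument) pairs: the key pattern is that handler failures come back as structured data rather than raised exceptions, so the agent loop can report them to the model and let it retry.

```python
def handle_tool_call(name, args, registry):
    """Dispatch a tool call and always return a structured result.

    Errors are returned as data ({"ok": False, "error": ...}) instead of
    being raised, so they can be fed back to the model as a tool result.
    """
    tool = registry.get(name)
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    func, required = tool
    missing = [p for p in required if p not in args]
    if missing:
        return {"ok": False, "error": f"missing required args: {missing}"}
    try:
        return {"ok": True, "result": func(**args)}
    except Exception as exc:  # propagate as a structured error, not a crash
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}

# Hypothetical registry: name -> (handler, required-argument names)
REGISTRY = {"add": (lambda a, b: a + b, ["a", "b"])}
```

For example, `handle_tool_call("add", {"a": 1, "b": 2}, REGISTRY)` returns `{"ok": True, "result": 3}`, while a missing argument or unknown name produces an `"ok": False` response the model can act on.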
Safe Execution Sandboxes
- Risk categories: file system, network, process execution
- Docker-based isolation for code-running tools
- Resource limits: CPU, memory, and timeout enforcement
- Capability scoping: minimal permission principle
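As a process-level sketch of resource-limit enforcement (Unix-only, using Python's `resource` module and a child process), something like the following caps CPU time, memory, and wall-clock time. This addresses only the resource-limit layer; file-system and network isolation need container-level sandboxing such as the Docker approach covered in the labs.

```python
import resource
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: float = 2.0, mem_mb: int = 512):
    """Run untrusted Python in a child process with CPU/memory/time caps.

    A minimal sketch: real isolation of files and network requires a
    container or VM boundary on top of these limits.
    """
    def limit():  # runs in the child before exec
        cpu = int(timeout_s) + 1
        resource.setrlimit(resource.RLIMIT_CPU, (cpu, cpu))
        mem = mem_mb * 1024 * 1024
        resource.setrlimit(resource.RLIMIT_AS, (mem, mem))

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True, text=True,
            timeout=timeout_s, preexec_fn=limit,
        )
        return {"ok": proc.returncode == 0,
                "stdout": proc.stdout, "stderr": proc.stderr}
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timeout"}
```

Note the layering: the wall-clock `timeout` catches sleeping or I/O-bound runaways that a CPU limit alone would miss, while `RLIMIT_AS` bounds memory even if the child never yields.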
Multi-Tool Composition
- Sequential pipelines and data passing
- Conditional branching based on tool results
- Aggregating results from parallel tool calls
- Tool pipeline debugging and tracing
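One way to sketch sequential composition with state passing and a built-in trace (a hypothetical shape, not any specific framework's API): each step names a tool and a function that builds its arguments from the accumulated state, and each result is stored back into that state for later steps.

```python
def run_pipeline(steps, tools, initial):
    """Run tool steps in order, passing accumulated state into each step.

    `steps` is a list of (tool_name, arg_builder) pairs; arg_builder maps
    the current state to that tool's arguments. A trace of every call is
    kept for debugging.
    """
    state, trace = dict(initial), []
    for name, build_args in steps:
        args = build_args(state)
        result = tools[name](**args)
        trace.append({"tool": name, "args": args, "result": result})
        state[name] = result  # later steps can read earlier results
    return state, trace

# Toy tools: resolve a user id, then fetch that user's orders.
TOOLS = {
    "lookup_user": lambda email: {"user_id": 42},
    "fetch_orders": lambda user_id: [{"id": 1, "user_id": user_id}],
}
STEPS = [
    ("lookup_user", lambda s: {"email": s["email"]}),
    ("fetch_orders", lambda s: {"user_id": s["lookup_user"]["user_id"]}),
]
```

The trace is what makes pipelines debuggable: when a downstream tool misbehaves, the recorded args and results show exactly which upstream output it was handed.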
Security and Governance
- Input sanitization for command-execution tools
- Preventing prompt injection through tool outputs
- Audit logging: who invoked what, when, and with what arguments
- Tool approval workflows for sensitive capabilities
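A small sketch combining two of these practices for a command-execution tool: an allowlist check (deny by default) plus an audit record of every attempt, permitted or not. The command set and log store are illustrative.

```python
import json
import shlex
import time

ALLOWED_COMMANDS = {"ls", "cat", "grep"}  # illustrative allowlist
AUDIT_LOG = []                            # in practice: an append-only store

def audit(caller, argv, allowed):
    """Record who tried to run what, when, and whether it was permitted."""
    AUDIT_LOG.append(json.dumps(
        {"ts": time.time(), "caller": caller,
         "argv": argv, "allowed": allowed}))

def safe_command(argv, caller):
    """Allowlist-check a command-execution request and audit the attempt."""
    allowed = bool(argv) and argv[0] in ALLOWED_COMMANDS
    audit(caller, argv, allowed)
    if not allowed:
        raise PermissionError(f"command not allowed: {argv[:1]}")
    # shlex.quote neutralizes shell metacharacters in arguments if a
    # shell string is ever needed; prefer list-form exec (no shell) instead.
    return " ".join(shlex.quote(a) for a in argv)
```

For example, `safe_command(["ls", "-l", "my dir"], "agent-1")` yields `ls -l 'my dir'`, while `safe_command(["rm", "-rf", "/tmp/x"], "agent-1")` raises and is still logged. Denied attempts being auditable is the point: a tool-approval workflow needs the refusals as much as the grants.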
Packaging and Distribution
- MCP server packaging for Claude and Copilot
- Plugin modules for internal agent frameworks
- Versioning and compatibility management
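For context on the client side of distribution, MCP-aware clients such as Claude Desktop register local servers through a JSON config along these lines (the server name, module, and environment variable below are hypothetical; the exact file location and supported keys vary by client and version):

```json
{
  "mcpServers": {
    "orders-tools": {
      "command": "python",
      "args": ["-m", "orders_mcp_server"],
      "env": {"ORDERS_API_URL": "https://internal.example/api"}
    }
  }
}
```

Because the client launches the server by command, versioning the package that provides `orders_mcp_server` (and keeping its tool schemas backward compatible) is what makes upgrades safe for every agent that has it configured.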
Workshop
- Design and implement a custom tool from spec
- Sandbox exercise: code execution tool with limits
- Q&A session