Updated June 2026
Distributed Task Automation with Python Faust and Kafka
Class Duration
16 hours of live training delivered over 2-4 days.
Student Prerequisites
- Proficiency in Python programming, including experience with Python 3.x.
- Familiarity with basic concepts of distributed systems and task automation.
- Experience with Docker and containerization concepts is beneficial but not required.
- Basic understanding of message brokers and stream processing concepts is helpful but not required.
- All students should have taken the Python Task Automation course or have significant experience with the topics it covers.
Target Audience
Experienced Python developers and programming professionals ready to scale automation and streamline backend architecture by building resilient, real-time distributed task pipelines with Python Faust and Apache Kafka. This course builds on the fundamentals from Task Automation with Python.
Description
This course equips Python developers with practical expertise in distributed task automation using Python Faust and Apache Kafka, building resilient, high-performance pipelines with Kafka for messaging and Faust for stream processing. You'll start by exploring the fundamentals of task automation before diving into hands-on environment setup: installing Python tools, containerizing your applications with Docker, and standing up a Kafka cluster. From there, you'll master Faust's powerful real-time stream processing API, learning to manage state, ensure fault tolerance, and handle errors gracefully. We'll then show you how to monitor your applications, tune performance, and scale seamlessly in production, with best practices for deployment and observability. Along the way, we'll tailor examples to your domain so you leave with immediately applicable skills for streamlining backend architecture and automating complex workflows at scale.
Learning Outcomes
- Understand the concept and application of Distributed Task Automation.
- Set up and configure a Python development environment for script programming.
- Learn the basics of containerization and how to use Docker for creating and running containers.
- Gain in-depth knowledge of Kafka, its architecture, and how to set up a Kafka cluster.
- Master the basics and advanced concepts of Python Faust, including agents, stream processing, state management, and fault tolerance.
- Learn how to monitor and manage Kafka and Faust applications, including error handling and retry logic.
- Understand the best practices for deploying Kafka and Faust in production, ensuring high availability and optimizing performance.
- Implement a real-time data pipeline with Faust and Kafka.
Training Materials
Comprehensive courseware is distributed online at the start of class. All students receive a downloadable MP4 recording of the training.
Software Requirements
Students will need a free, personal GitHub account to access the courseware. Student machines will need a text editor like Visual Studio Code, the latest Python version, Docker Desktop, Pandoc, and OpenOffice. Students will need permission to install npm and PyPI packages as well as the ability to download Docker images. Preconfigured student virtual machines can be provided upon request.
Training Topics
Overview of Distributed Task Automation
- What is Distributed Task Automation?
- Overview of Python Faust (the maintained faust-streaming fork; the original Robinhood Faust is no longer maintained)
- Faust compared to Celery
- What is Streaming?
- What is Kafka?
- What is KRaft? (Kafka 4.x runs without ZooKeeper)
- Kafka compared to RabbitMQ + PostgreSQL
Development Environment
- Configure Visual Studio Code for Python Script Programming
- Python Code Linting & Reformatting with Ruff & mypy
- Debugging Python Scripts with Visual Studio Code
- Docker Desktop
Containerization
- What is a Container?
- What is Docker?
- What is Docker Hub?
- Images and Containers
- Create an Image with Dockerfile
- Run Containers
- Configure Containers with Environment Variables
- Docker Compose
- Docker Compose Networking
- Docker Compose Volumes
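To give a flavor of the "Configure Containers with Environment Variables" topic, here is a minimal sketch of how a containerized Python app might read its settings from the environment. The variable names `KAFKA_BROKER_URL` and `APP_ID`, the `Settings` class, and the default values are hypothetical and chosen for illustration only; the course materials may use different names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings for a containerized stream-processing app."""
    broker_url: str
    app_id: str


def load_settings(env=None):
    """Read settings from environment variables, with local-development defaults.

    `env` defaults to os.environ; passing a plain dict makes the function
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    return Settings(
        # Assumed variable names; adjust to match your own container config.
        broker_url=env.get("KAFKA_BROKER_URL", "kafka://localhost:9092"),
        app_id=env.get("APP_ID", "demo-app"),
    )


# Simulate the variables a container runtime might inject.
settings = load_settings({"KAFKA_BROKER_URL": "kafka://kafka:9092"})
```

With this approach the same image can run against a local broker or a Compose-managed one, since only the injected variables change.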
Scaling Faust Applications
- Parallelism and Partitioning in Kafka
- Running Multiple Faust Workers
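The idea behind partition-based parallelism can be sketched in plain Python. This is a simplified illustration, not Kafka's or Faust's actual implementation: it uses `zlib.crc32` as a stand-in hash and a simple round-robin assignment, and the function names `partition_for` and `assign_partitions` are hypothetical. The two properties it shows are that a given key always maps to the same partition, and that partitions (not individual messages) are divided among workers.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition with a stable hash.

    crc32 is an illustrative stand-in; real Kafka clients use their own
    partitioner, so actual partition numbers will differ.
    """
    return zlib.crc32(key) % num_partitions


def assign_partitions(num_partitions: int, workers: list) -> dict:
    """Spread partitions across workers round-robin (a simplification)."""
    assignment = {worker: [] for worker in workers}
    for partition in range(num_partitions):
        owner = workers[partition % len(workers)]
        assignment[owner].append(partition)
    return assignment


# Six partitions shared by three hypothetical workers.
assignment = assign_partitions(6, ["worker-1", "worker-2", "worker-3"])

# The same key always lands on the same partition, so one worker
# sees every message for that key, in order.
first = partition_for(b"user-42", 6)
second = partition_for(b"user-42", 6)
```

One consequence worth noting: under this model, adding workers beyond the number of partitions leaves the extra workers without any partitions to process.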
Monitoring and Management
- Monitoring Kafka and Faust
- Using Kafka Monitoring Tools (e.g., Kafka UI, Confluent Control Center)
- Logging and Metrics in Faust
- Handling Errors and Retries
- Configuring Error Handling in Faust
- Implementing Retry Logic
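As an illustration of the retry-logic topic, the following is a minimal sketch of retrying an async operation with exponential backoff using only `asyncio`. It is not Faust's API; `retry_async` and `make_flaky` are hypothetical helpers, and the delays are kept tiny so the example finishes quickly.

```python
import asyncio


async def retry_async(func, *, attempts=3, base_delay=0.01, factor=2.0):
    """Await func(), retrying on any Exception with exponential backoff.

    Re-raises the last error once `attempts` calls have all failed.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(delay)
            delay *= factor  # wait longer before each further retry


def make_flaky(failures):
    """Build a hypothetical handler that fails `failures` times, then succeeds."""
    calls = {"count": 0}

    async def handler():
        calls["count"] += 1
        if calls["count"] <= failures:
            raise RuntimeError("transient failure")
        return calls["count"]  # number of calls it took to succeed

    return handler


# Fails twice, then succeeds on the third call.
result = asyncio.run(retry_async(make_flaky(2)))
```

In a real pipeline you would usually catch only the exception types known to be transient and route messages that exhaust their retries somewhere they can be inspected later.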
Scaling and Deployment
- Deploying Kafka and Faust in Production
- Best Practices for Kafka Cluster Deployment
- Deploying Faust Apps with Docker and Kubernetes
- High Availability and Fault Tolerance
- Configuring Kafka for High Availability
- Ensuring High Availability in Faust Applications
- Kafka Performance Tuning
- Optimizing Faust Performance
- Implementing a Real-Time Data Pipeline with Faust and Kafka