How to Run a Computer Vision Workshop

2026·02·27 · 2 min

This week I ran a hands-on computer vision workshop for the Coding & Robotics Society at the University of Liverpool.

Who I Was Teaching

The challenge with a society workshop is that you can't assume much. The group was mostly beginners: some had touched Python, a few hadn't, and even fewer had any experience with computer vision.

The Structure

1. OpenCV: Image matrices, colour spaces, edge detection, live webcam filters.
2. YOLO: Real-time object detection, tracking, pose estimation.
3. SAM 2 + CLIP: Foundation models, zero-shot classification, combining models into a pipeline.
4. Advanced: Text-prompted segmentation, CLIP fine-tuning, video tracking.

The progression was deliberate. Start with raw pixels and build intuition for what an image actually is before introducing anything that feels like magic. By the time you get to SAM 2 auto-masking anything in a webcam feed, you have enough context to understand roughly why it works.
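To make "what an image actually is" concrete, here is a minimal sketch, not taken from the workshop materials. It assumes only NumPy and builds a tiny synthetic image rather than loading a file. The helper name to_grayscale and the BT.601 channel weights are my choices for illustration; OpenCV hands back the same kind of array, in BGR channel order, when it reads an image.

```python
import numpy as np

# A synthetic 4x6 colour image: height x width x 3 channels, 8-bit values.
# Assumption: channels are in BGR order, the order OpenCV uses.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, 3:] = (255, 255, 255)  # right half white, left half stays black


def to_grayscale(bgr: np.ndarray) -> np.ndarray:
    """Collapse the three channels into one with a weighted sum (BT.601 weights)."""
    weights = np.array([0.114, 0.587, 0.299])  # blue, green, red
    return np.rint(bgr.astype(np.float64) @ weights).astype(np.uint8)


gray = to_grayscale(image)
print(image.shape, image.dtype)  # (4, 6, 3) uint8
print(gray.shape)                # (4, 6)
print(gray[0])                   # one row of pixel intensities
```

Everything later in the session, from edge detection to the foundation models, takes an array like this as input.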

On Setup

Environment setup is the silent killer of workshops. The first time I ran this one, it was exactly that: Python version mismatches, venv issues, Windows being Windows. Setup ate into the session before anyone had written a line of code.

The fix was switching to uv with a pinned Python 3.12. I've since updated the README in the GitHub repository with that as the default, plus fallbacks for people who can't use uv. If you're running something similar, standardise your environment tooling before anything else.
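One cheap safeguard alongside a pinned environment is a preflight script attendees run before the session. The sketch below is hypothetical and not part of the repository: the function names, the REQUIRED constant, and the package list are all assumptions for illustration. It uses only the standard library.

```python
import importlib.util
import sys

REQUIRED = (3, 12)  # assumption: mirrors the pinned Python 3.12 mentioned above
PACKAGES = ["numpy", "cv2"]  # hypothetical list; use whatever your workshop imports


def python_matches(required=REQUIRED, current=None) -> bool:
    """True if the interpreter's major.minor version equals the required one."""
    if current is None:
        current = sys.version_info
    return tuple(current[:2]) == tuple(required)


def missing_packages(names) -> list:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if not python_matches():
    print(f"Expected Python {REQUIRED[0]}.{REQUIRED[1]}, found {sys.version.split()[0]}")
for name in missing_packages(PACKAGES):
    print(f"Missing package: {name}")
```

If the script prints nothing, the machine is ready. Anything it does print can be sorted out before the session rather than during it.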

What Worked

The webcam exercises landed well. There's something about seeing your own face with a real-time edge detection filter applied that makes the theory click in a way a static image doesn't.
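For a sense of what such a filter involves, here is a minimal sketch under stated assumptions rather than the workshop's actual code. The sobel_edges function spells out a Sobel gradient in plain NumPy so the arithmetic is visible; in practice a single OpenCV call such as cv2.Canny would do the job. The function names and the threshold value are mine. The webcam loop assumes OpenCV is installed and a camera is available at index 0.

```python
import numpy as np


def sobel_edges(gray: np.ndarray, threshold: float = 100.0) -> np.ndarray:
    """Return a binary edge map (0 or 255) from a 2-D grayscale image."""
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    # Horizontal gradient: right-hand column of the 3x3 window minus the left.
    gx = (
        (padded[:-2, 2:] + 2 * padded[1:-1, 2:] + padded[2:, 2:])
        - (padded[:-2, :-2] + 2 * padded[1:-1, :-2] + padded[2:, :-2])
    )
    # Vertical gradient: bottom row of the 3x3 window minus the top.
    gy = (
        (padded[2:, :-2] + 2 * padded[2:, 1:-1] + padded[2:, 2:])
        - (padded[:-2, :-2] + 2 * padded[:-2, 1:-1] + padded[:-2, 2:])
    )
    magnitude = np.hypot(gx, gy)
    return np.where(magnitude > threshold, 255, 0).astype(np.uint8)


def run_webcam_demo(camera_index: int = 0) -> None:
    """Show the edge map of each webcam frame until 'q' is pressed."""
    import cv2  # imported here so sobel_edges works without OpenCV installed

    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            cv2.imshow("edges", sobel_edges(gray))
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()


# Synthetic check: a black-to-white step produces edges only at the boundary.
step = np.zeros((5, 6), dtype=np.uint8)
step[:, 3:] = 255
edges = sobel_edges(step)
print(edges[0])
```

Calling run_webcam_demo() gives the live version. The synthetic step image at the bottom runs without a camera, which is handy for checking the filter before pointing it at a room full of faces.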

The Materials

Everything is public on my GitHub here. If you're running a similar session for a society or class, feel free to use or adapt it.