Architecture Overview

This document describes the architecture and design of the Cat Viewer system.

System Components

The system consists of four supporting modules and a main application that ties them together:

Camera Module (camera.py)

The Camera class wraps the picamera2 library to provide a high-level interface for Raspberry Pi camera modules. Key responsibilities:

Camera initialization and configuration
Automatic full sensor resolution detection
Frame capture at configured FPS
Frame format conversion (RGB/RGBA to BGR for OpenCV)
Camera validation and error handling
Support for IR camera controls (exposure, gain)

The camera automatically detects and uses the full sensor resolution (e.g., 3280x2464 for Camera Module 2) instead of cropping to a smaller resolution, ensuring the complete field of view is captured.
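
One way to picture the full-resolution detection is as a search over the sensor modes the camera reports. The sketch below is illustrative only: it assumes each mode is a dictionary with a `size` entry of (width, height), which is the shape picamera2 uses for its `sensor_modes` list, and the helper name `pick_full_resolution` is hypothetical. The actual camera.py may obtain the resolution differently.

```python
from typing import Dict, List, Tuple


def pick_full_resolution(sensor_modes: List[Dict]) -> Tuple[int, int]:
    """Return the (width, height) of the mode that covers the most pixels."""
    if not sensor_modes:
        raise ValueError("camera reported no sensor modes")
    # Compare modes by pixel area so the largest mode wins.
    best = max(sensor_modes, key=lambda mode: mode["size"][0] * mode["size"][1])
    return tuple(best["size"])


# Illustrative mode list (assumed values, not read from real hardware).
example_modes = [
    {"size": (640, 480)},
    {"size": (1640, 1232)},
    {"size": (3280, 2464)},
]
full_size = pick_full_resolution(example_modes)
```

With the example list above, the helper selects the 3280x2464 mode rather than one of the smaller modes.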

Detection Module (detector.py)

The CatDetector class wraps the Ultralytics YOLOv8 model for cat detection:

Loads pre-trained YOLOv8n (nano) model
Runs inference on video frames
Filters detections to only cats (COCO class ID 15)
Returns Detection objects with bounding boxes and confidence scores

The detector targets CPU inference on Raspberry Pi by using the lightweight nano model variant.
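
The class filtering step can be sketched without the model itself. In the example below, the raw results are assumed to arrive as (class_id, confidence, box) tuples, and the `Detection` field names and the 0.5 threshold are hypothetical stand-ins; only the class ID 15 comes from this document.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

CAT_CLASS_ID = 15  # COCO class index for "cat", as stated above

Box = Tuple[int, int, int, int]  # x1, y1, x2, y2 in pixels


@dataclass(frozen=True)
class Detection:
    """Hypothetical result type: one bounding box with its confidence."""
    box: Box
    confidence: float


def filter_cats(
    raw_results: Iterable[Tuple[int, float, Box]],
    threshold: float = 0.5,  # assumed default, not taken from the project
) -> List[Detection]:
    """Keep only cat detections at or above the confidence threshold."""
    cats = []
    for class_id, confidence, box in raw_results:
        if class_id == CAT_CLASS_ID and confidence >= threshold:
            cats.append(Detection(box=box, confidence=confidence))
    return cats
```

Keeping the filter separate from inference makes it easy to test with hand-written tuples, without loading a model.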

Recording Module (recorder.py)

The VideoRecorder class manages video recording sessions:

Creates timestamped output files
Manages recording state (start/stop)
Controls recording duration
Handles frame writing with automatic resizing
Uses OpenCV VideoWriter with mp4v codec
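
The file naming and duration control described above could look roughly like the following. This is a sketch under assumptions: the `cat_` file name prefix, the `timestamped_path` helper, and the `RecordingTimer` class are hypothetical, and the actual frame writing through OpenCV is left out so the example stays dependency-free.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def timestamped_path(output_dir: Path, now: datetime) -> Path:
    """Build an output file name from the recording start time."""
    return output_dir / f"cat_{now:%Y%m%d_%H%M%S}.mp4"


class RecordingTimer:
    """Tracks whether a recording is active and when it should end."""

    def __init__(self, duration_seconds: float) -> None:
        self.duration_seconds = duration_seconds
        self.started_at: Optional[float] = None

    @property
    def is_recording(self) -> bool:
        return self.started_at is not None

    def start(self, now: float) -> None:
        # Starting while already recording leaves the current session untouched.
        if not self.is_recording:
            self.started_at = now

    def should_stop(self, now: float) -> bool:
        return self.is_recording and now - self.started_at >= self.duration_seconds

    def stop(self) -> None:
        self.started_at = None
```

Passing the current time in as an argument, rather than reading the clock inside the class, keeps the duration logic deterministic in tests.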

Configuration Module (config.py)

The config module handles YAML configuration loading:

Loads settings from config.yaml
Provides getter functions for all settings
Falls back to hardcoded defaults if YAML is missing
Supports path expansion (home directory, absolute paths)
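
The fallback and path expansion behaviour can be illustrated as below. The sketch assumes the YAML file has already been parsed into a dictionary (or `None` if it is missing), and the key names and default values other than the 10 FPS default are hypothetical.

```python
import copy
import os
from typing import Any, Dict, Optional

# Assumed defaults for illustration; only the FPS value appears in this document.
DEFAULTS: Dict[str, Any] = {
    "camera": {"fps": 10},
    "recording": {"duration_seconds": 30, "output_dir": "~/cat_videos"},
}


def merge_with_defaults(
    defaults: Dict[str, Any], loaded: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Overlay loaded settings on the defaults, section by section."""
    merged = copy.deepcopy(defaults)
    for key, value in (loaded or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_with_defaults(merged[key], value)
        else:
            merged[key] = value
    return merged


def expand_path(path: str) -> str:
    """Expand a leading ~ and make the path absolute."""
    return os.path.abspath(os.path.expanduser(path))
```

A missing file then reduces to `merge_with_defaults(DEFAULTS, None)`, which returns a copy of the defaults unchanged.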

Main Application (main.py)

The main module orchestrates the system:

Parses command-line arguments
Initializes camera, detector, and recorder
Runs the main monitoring loop
Handles cat detection and recording triggers
Manages preview window (optional)
Provides test recording mode

Data Flow

Initialization:
- Load configuration from YAML (with CLI overrides)
- Initialize camera, detector, and recorder
- Validate camera can read frames

Monitoring Loop:
- Read frame from camera
- Run detection on frame
- If cat detected and not recording: start recording
- If recording: write frame to video file
- If recording duration elapsed: stop recording
- If preview enabled: draw detections and display

Frame Processing:
- Camera captures frame in RGB/RGBA format
- Frame converted to BGR for OpenCV compatibility
- Frame passed to detector as a BGR array (the YOLOv8 model expects RGB; the Ultralytics wrapper handles the conversion)
- Detections drawn on frame for preview
- Frame written to video file if recording
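
The monitoring loop steps can be sketched with stand-in components. Everything named here (`InMemoryRecorder`, `run_monitoring_loop`, the `detect` and `clock` callables) is hypothetical and exists only to show the order of the checks; the preview step is omitted.

```python
from typing import Any, Callable, Iterable, List, Optional


class InMemoryRecorder:
    """Stand-in recorder that keeps frames in a list instead of a file."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        self.started_at: Optional[float] = None
        self.frames: List[Any] = []
        self.sessions = 0

    @property
    def is_recording(self) -> bool:
        return self.started_at is not None

    def start(self, now: float) -> None:
        self.started_at = now
        self.sessions += 1

    def write(self, frame: Any) -> None:
        self.frames.append(frame)

    def stop(self) -> None:
        self.started_at = None


def run_monitoring_loop(
    frames: Iterable[Any],
    detect: Callable[[Any], list],
    recorder: InMemoryRecorder,
    clock: Callable[[], float],
) -> None:
    for frame in frames:                      # read frame from camera
        detections = detect(frame)            # run detection on frame
        now = clock()
        if detections and not recorder.is_recording:
            recorder.start(now)               # cat seen: start recording
        if recorder.is_recording:
            recorder.write(frame)             # write frame while recording
            if now - recorder.started_at >= recorder.duration:
                recorder.stop()               # duration elapsed: stop
```

In this sketch a detection during an active recording does not extend it; whether the real system extends the session is not stated here.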

Configuration

Configuration is loaded from config.yaml with the following structure:

detection: Threshold and model settings
camera: Device, resolution, FPS, optional IR controls
recording: Duration and output directory
model: Model file name and cat class ID

Command-line arguments override YAML values, allowing runtime customization without editing config files.
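
A minimal sketch of the override rule, using the standard argparse module: flags default to `None`, so only flags the user actually passed replace the YAML values. The flag names `--fps` and `--threshold` and the config key names are assumptions for illustration, not the project's documented CLI.

```python
import argparse
from typing import Any, Dict


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Cat Viewer (illustrative flags)")
    # default=None lets us tell "not given" apart from any real value.
    parser.add_argument("--fps", type=int, default=None)
    parser.add_argument("--threshold", type=float, default=None)
    return parser


def apply_cli_overrides(
    config: Dict[str, Dict[str, Any]], args: argparse.Namespace
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of the config with any CLI-provided values applied."""
    result = {section: dict(values) for section, values in config.items()}
    if args.fps is not None:
        result.setdefault("camera", {})["fps"] = args.fps
    if args.threshold is not None:
        result.setdefault("detection", {})["threshold"] = args.threshold
    return result
```

For example, parsing `["--fps", "5"]` changes only the camera FPS and leaves the detection threshold at its YAML value.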

Error Handling

The system handles the following failure cases:

Camera initialization failures with detailed troubleshooting messages
Camera validation to ensure frames can be read
Frame read failures with throttled logging and recovery
Preview window failures (graceful degradation, continues recording)
Detection errors (logged, system continues)
Recording errors (logged, system continues)

The system is designed to be robust and continue operating even when some components encounter errors (e.g., preview window unavailable over SSH).
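
Throttled logging for repeated frame read failures might be implemented along these lines. The `ThrottledLogger` class and the 5-second interval are assumptions for illustration; the idea is simply to emit at most one message per interval and report how many were skipped.

```python
import logging
import time
from typing import Callable, Optional


class ThrottledLogger:
    """Emit a warning at most once per interval, counting the rest."""

    def __init__(
        self,
        logger: logging.Logger,
        interval_seconds: float = 5.0,  # assumed interval
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._logger = logger
        self._interval = interval_seconds
        self._clock = clock
        self._last_emit: Optional[float] = None
        self._suppressed = 0

    def warning(self, message: str) -> bool:
        """Log the message if the interval has passed; return True if logged."""
        now = self._clock()
        if self._last_emit is None or now - self._last_emit >= self._interval:
            if self._suppressed:
                message = f"{message} ({self._suppressed} similar messages suppressed)"
            self._logger.warning(message)
            self._last_emit = now
            self._suppressed = 0
            return True
        self._suppressed += 1
        return False
```

In a capture loop, a failed read would call `warning("frame read failed")` and continue to the next iteration, so a disconnected camera does not flood the log.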

Performance Considerations

Model Size: Uses YOLOv8n (nano) for CPU-friendly inference
Input Resolution: Configurable model input size (default 640x640)
Frame Rate: Configurable FPS (default 10) balances detection quality and CPU load
Full Sensor Resolution: Automatically detected and used for maximum field of view
Recording Codec: Uses mp4v codec for Raspberry Pi compatibility

The system is optimized for Raspberry Pi 4 with CPU-only inference. GPU acceleration is not currently supported but could be added for improved performance.
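
To give a sense of scale for the resolution figures above, the following sketch computes the size of one uncompressed full-resolution frame and the dimensions it would have if scaled so its longer side matches the 640-pixel model input. The helper names are hypothetical, and the model's own preprocessing may resize and pad differently; this is only the arithmetic.

```python
from typing import Tuple


def frame_bytes(width: int, height: int, channels: int = 3) -> int:
    """Bytes needed for one uncompressed 8-bit frame."""
    return width * height * channels


def fit_to_model_input(width: int, height: int, target: int = 640) -> Tuple[int, int]:
    """Scale so the longer side equals `target`, preserving aspect ratio."""
    scale = target / max(width, height)
    return round(width * scale), round(height * scale)


full_frame = frame_bytes(3280, 2464)          # 24,245,760 bytes per BGR frame
model_size = fit_to_model_input(3280, 2464)   # (640, 481)
```

At the default 10 FPS, full-resolution BGR frames amount to roughly 242 MB of pixel data per second, which is one reason the frame rate and model input size are configurable.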