Architecture Overview
=====================

This document describes the architecture and design of the Cat Viewer system.

System Components
-----------------

The system consists of four modules, coordinated by a main application:

Camera Module (camera.py)
~~~~~~~~~~~~~~~~~~~~~~~~~

The Camera class wraps the picamera2 library to provide a high-level interface for Raspberry Pi camera modules. Key responsibilities:

* Camera initialization and configuration
* Automatic full sensor resolution detection
* Frame capture at configured FPS
* Frame format conversion (RGB/RGBA to BGR for OpenCV)
* Camera validation and error handling
* Support for IR camera controls (exposure, gain)

The camera automatically detects and uses the full sensor resolution (e.g., 3280x2464 for Camera Module 2) instead of cropping to a smaller resolution, ensuring the complete field of view is captured.

Detection Module (detector.py)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The CatDetector class wraps the Ultralytics YOLOv8 model for cat detection:

* Loads pre-trained YOLOv8n (nano) model
* Runs inference on video frames
* Filters detections to only cats (COCO class ID 15)
* Returns Detection objects with bounding boxes and confidence scores

The detector is optimized for CPU inference on Raspberry Pi using the lightweight nano model variant.
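The class-filtering step above can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the ``Detection`` field names, the ``filter_cats`` helper, the raw tuple layout, and the 0.5 default threshold are all assumptions made for illustration. Only the cat class ID (15) comes from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

CAT_CLASS_ID = 15  # COCO class index for "cat", as stated above

# Hypothetical raw prediction layout: (class_id, confidence, x1, y1, x2, y2).
RawPrediction = Tuple[int, float, float, float, float, float]


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for the Detection objects the detector returns."""
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def filter_cats(
    predictions: Iterable[RawPrediction],
    threshold: float = 0.5,  # assumed default; the real value comes from config
    cat_class_id: int = CAT_CLASS_ID,
) -> List[Detection]:
    """Keep only cat predictions at or above the confidence threshold."""
    cats = []
    for class_id, confidence, x1, y1, x2, y2 in predictions:
        if class_id == cat_class_id and confidence >= threshold:
            cats.append(Detection(x1, y1, x2, y2, confidence))
    return cats


if __name__ == "__main__":
    raw = [
        (15, 0.91, 120.0, 80.0, 420.0, 360.0),  # confident cat: kept
        (16, 0.88, 10.0, 10.0, 200.0, 200.0),   # different class: dropped
        (15, 0.30, 500.0, 40.0, 620.0, 150.0),  # cat below threshold: dropped
    ]
    for det in filter_cats(raw):
        print(det)
```

Keeping the filter separate from model inference, as in this sketch, makes it possible to unit-test the class and threshold logic without loading the model.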
Recording Module (recorder.py)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The VideoRecorder class manages video recording sessions:

* Creates timestamped output files
* Manages recording state (start/stop)
* Controls recording duration
* Handles frame writing with automatic resizing
* Uses OpenCV VideoWriter with mp4v codec

Configuration Module (config.py)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The config module handles YAML configuration loading:

* Loads settings from config.yaml
* Provides getter functions for all settings
* Falls back to hardcoded defaults if YAML is missing
* Supports path expansion (home directory, absolute paths)

Main Application (main.py)
~~~~~~~~~~~~~~~~~~~~~~~~~~

The main module orchestrates the system:

* Parses command-line arguments
* Initializes camera, detector, and recorder
* Runs the main monitoring loop
* Handles cat detection and recording triggers
* Manages preview window (optional)
* Provides test recording mode

Data Flow
---------

1. **Initialization**:

   - Load configuration from YAML (with CLI overrides)
   - Initialize camera, detector, and recorder
   - Validate camera can read frames

2. **Monitoring Loop**:

   - Read frame from camera
   - Run detection on frame
   - If cat detected and not recording: start recording
   - If recording: write frame to video file
   - If recording duration elapsed: stop recording
   - If preview enabled: draw detections and display

3. **Frame Processing**:

   - Camera captures frame in RGB/RGBA format
   - Frame converted to BGR for OpenCV compatibility
   - Frame passed to the detector as BGR (Ultralytics converts it to the RGB input YOLOv8 expects)
   - Detections drawn on frame for preview
   - Frame written to video file if recording

Configuration
-------------

Configuration is loaded from config.yaml with the following structure:

* **detection**: Threshold and model settings
* **camera**: Device, resolution, FPS, optional IR controls
* **recording**: Duration and output directory
* **model**: Model file name and cat class ID

Command-line arguments override YAML values, allowing runtime customization without editing config files.

Error Handling
--------------

The system handles the following failure cases:

* Camera initialization failures with detailed troubleshooting messages
* Camera validation to ensure frames can be read
* Frame read failures with throttled logging and recovery
* Preview window failures (graceful degradation, continues recording)
* Detection errors (logged, system continues)
* Recording errors (logged, system continues)

The system is designed to keep operating when individual components encounter errors (e.g., preview window unavailable over SSH).

Performance Considerations
--------------------------

* **Model Size**: Uses YOLOv8n (nano) for CPU-friendly inference
* **Input Resolution**: Configurable model input size (default 640x640)
* **Frame Rate**: Configurable FPS (default 10) balances detection quality and CPU load
* **Full Sensor Resolution**: Automatically detected and used for maximum field of view
* **Recording Codec**: Uses mp4v codec for Raspberry Pi compatibility

The system is optimized for Raspberry Pi 4 with CPU-only inference. GPU acceleration is not currently supported but could be added for improved performance.
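
The recording decisions made in the monitoring loop (see Data Flow) can be summarized as a small state machine. The sketch below is illustrative only: ``RecordingState``, its ``step`` method, and the action strings are hypothetical names, and timestamps are passed in explicitly so the logic can be exercised without a camera, detector, or recorder. It also assumes that a detection during an active recording does not extend the recording, which the description above does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecordingState:
    """Hypothetical tracker for whether a recording is active and when it began."""
    duration_s: float                   # configured recording duration, in seconds
    started_at: Optional[float] = None  # None while no recording is active

    @property
    def recording(self) -> bool:
        return self.started_at is not None

    def step(self, cat_detected: bool, now: float) -> List[str]:
        """Process one frame and return the actions taken, in order."""
        actions: List[str] = []
        # If a cat is detected and we are not recording: start recording.
        if cat_detected and not self.recording:
            self.started_at = now
            actions.append("start")
        if self.recording:
            # While recording: write the current frame to the video file.
            actions.append("write")
            # Once the configured duration has elapsed: stop recording.
            if now - self.started_at >= self.duration_s:
                self.started_at = None
                actions.append("stop")
        return actions


if __name__ == "__main__":
    state = RecordingState(duration_s=2.0)
    # (timestamp in seconds, whether the detector reported a cat)
    frames = [(0.0, False), (1.0, True), (2.0, False), (3.0, False), (3.1, False)]
    for now, cat in frames:
        print(f"t={now:.1f} cat={cat} actions={state.step(cat, now)}")
```

In a real loop, ``now`` would typically come from ``time.monotonic()``, and each action would map to a call on the recorder.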