SegmentationMask

A fixed-size header (64 bytes, zero-copy transport) describing a pixel-level segmentation mask. The mask data follows the header as a raw byte array where each pixel stores a class ID (semantic), instance ID (instance), or both (panoptic). Three static constructors select the segmentation mode.

When to Use

Use SegmentationMask when your robot runs a segmentation model and needs to label every pixel in an image. Common scenarios include driveable surface detection (semantic), grasping individual objects (instance), and full scene understanding (panoptic).

ROS2 Equivalent

No direct ROS2 equivalent. ROS2 typically publishes segmentation as sensor_msgs/Image with class IDs encoded as pixel values. HORUS provides a dedicated type with mode metadata.

Three Segmentation Modes

| Mode     | Value | Pixel Meaning                                      | Use Case                                              |
|----------|-------|----------------------------------------------------|-------------------------------------------------------|
| Semantic | 0     | Class ID (0-255). Each class gets one color.       | "What is this pixel?" -- road, sidewalk, sky          |
| Instance | 1     | Instance ID (0-255). Each object gets a unique ID. | "Which object is this pixel?" -- person #1, person #2 |
| Panoptic | 2     | Both class and instance encoded.                   | "What and which?" -- car #3, tree #7                  |

Rust Example

use horus::prelude::*;

// Semantic segmentation: 80 COCO classes
let mask = SegmentationMask::semantic(640, 480, 80)
    .with_frame_id("camera_front");

// Instance segmentation: no class count needed
let mask = SegmentationMask::instance(640, 480);

// Panoptic segmentation: class + instance
let mask = SegmentationMask::panoptic(640, 480, 80);

// Query mask type
assert!(mask.is_panoptic());
assert!(!mask.is_semantic());

// Calculate data buffer size
let buf_size = mask.data_size(); // width * height bytes (u8 per pixel)

Python Example

from horus import SegmentationMask, Topic

# Semantic segmentation (mask_type=0)
mask = SegmentationMask.semantic(640, 480, 80)

# Instance segmentation (mask_type=1)
mask = SegmentationMask.instance(640, 480)

# Panoptic segmentation (mask_type=2)
mask = SegmentationMask.panoptic(640, 480, 80)

# Or use the constructor with mask_type parameter
mask = SegmentationMask(640, 480, mask_type=0, num_classes=80)

# Check type
print(f"Is semantic: {mask.is_semantic()}")
print(f"Dimensions: {mask.width}x{mask.height}")
print(f"Classes: {mask.num_classes}")

# Publish
topic = Topic(SegmentationMask)
topic.send(mask)
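The header describes only the mask geometry; the pixel data travels as a separate raw buffer. A minimal sketch of preparing such a buffer in plain Python (the buffer-filling code here is illustrative, not part of the HORUS API):

```python
# Allocate a semantic mask buffer: one u8 class ID per pixel, row-major.
width, height = 640, 480
data = bytearray(width * height)  # zero-initialized -> all background

# Label a rectangular region as class 3 ("car" in COCO).
for y in range(100, 200):
    for x in range(300, 400):
        data[y * width + x] = 3

# Buffer length must match the header's data_size(): width * height bytes.
assert len(data) == width * height  # 307200
```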

Fields

| Field        | Type     | Unit | Size | Description                                                |
|--------------|----------|------|------|------------------------------------------------------------|
| width        | u32      | px   | 4 B  | Image width                                                |
| height       | u32      | px   | 4 B  | Image height                                               |
| num_classes  | u32      | --   | 4 B  | Number of classes (semantic/panoptic). 0 for instance mode |
| mask_type    | u32      | --   | 4 B  | 0=semantic, 1=instance, 2=panoptic                         |
| timestamp_ns | u64      | ns   | 8 B  | Timestamp in nanoseconds since epoch                       |
| seq          | u64      | --   | 8 B  | Sequence number                                            |
| frame_id     | [u8; 32] | --   | 32 B | Coordinate frame (e.g., "camera_front")                    |

Total header size: 64 bytes (fixed-size, zero-copy)
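The 64-byte total can be sanity-checked with Python's struct module. The field order follows the table above; the packed little-endian layout (`<`) is an assumption for illustration, not something this document specifies:

```python
import struct

# u32 x4 (width, height, num_classes, mask_type),
# u64 x2 (timestamp_ns, seq), then a 32-byte frame_id.
HEADER_FMT = "<IIIIQQ32s"

# 4*4 + 2*8 + 32 = 64 bytes, matching the documented header size.
assert struct.calcsize(HEADER_FMT) == 64
```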

Methods

| Method                   | Signature                | Description                              |
|--------------------------|--------------------------|------------------------------------------|
| semantic(w, h, classes)  | (u32, u32, u32) -> Self  | Create semantic mask header              |
| instance(w, h)           | (u32, u32) -> Self       | Create instance mask header              |
| panoptic(w, h, classes)  | (u32, u32, u32) -> Self  | Create panoptic mask header              |
| is_semantic()            | -> bool                  | True if mask_type == 0                   |
| is_instance()            | -> bool                  | True if mask_type == 1                   |
| is_panoptic()            | -> bool                  | True if mask_type == 2                   |
| data_size()              | -> usize                 | Buffer size for u8 mask (width * height) |
| data_size_u16()          | -> usize                 | Buffer size for u16 mask (width * height * 2) |
| with_frame_id(id)        | (&str) -> Self           | Set coordinate frame, chainable          |
| with_timestamp(ts)       | (u64) -> Self            | Set timestamp, chainable                 |
| frame_id()               | -> &str                  | Get frame ID as string                   |
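The two buffer-size methods reduce to simple arithmetic. A pure-Python mirror of the documented formulas (the helper names are illustrative, not the HORUS API):

```python
def data_size(width: int, height: int) -> int:
    """u8 mask: one byte per pixel."""
    return width * height

def data_size_u16(width: int, height: int) -> int:
    """u16 mask: two bytes per pixel (used for panoptic encoding)."""
    return width * height * 2

assert data_size(640, 480) == 307_200       # 300 KiB per frame
assert data_size_u16(640, 480) == 614_400   # doubles for u16 pixels
```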

COCO Class Constants

The segmentation::classes module provides standard COCO class IDs:

use horus::prelude::*;
use horus::messages::segmentation::classes;

let is_person = pixel_class == classes::PERSON;    // 1
let is_car = pixel_class == classes::CAR;          // 3
let is_dog = pixel_class == classes::DOG;          // 18
let is_background = pixel_class == classes::BACKGROUND; // 0
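The class constants pair naturally with a per-class pixel histogram. A sketch using collections.Counter; the constant values (BACKGROUND=0, PERSON=1, CAR=3) are taken from the comments above:

```python
from collections import Counter

BACKGROUND, PERSON, CAR = 0, 1, 3  # values from the classes module above

# A toy 3x3 semantic mask, flattened row-major.
mask_data = [0, 0, 1, 1, 1, 3, 3, 0, 0]

histogram = Counter(mask_data)
assert histogram[PERSON] == 3
assert histogram[CAR] == 2
assert histogram[BACKGROUND] == 4
```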

Common Patterns

Segmentation pipeline:

Camera --> Image --> segmentation model --> SegmentationMask
                                        \-> overlay on Image for visualization

Driveable surface detection:

use horus::prelude::*;

fn driveable_area(mask_data: &[u8], road_class: u8) -> f32 {
    if mask_data.is_empty() {
        return 0.0; // avoid division by zero on an empty buffer
    }
    let road_pixels = mask_data.iter().filter(|&&p| p == road_class).count();
    road_pixels as f32 / mask_data.len() as f32
}
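The same ratio computation in Python, handy for quick offline checks on recorded masks (a mirror of the Rust helper, not a HORUS API):

```python
def driveable_area(mask_data: bytes, road_class: int) -> float:
    """Fraction of pixels labeled with road_class."""
    if not mask_data:
        return 0.0  # avoid division by zero on an empty buffer
    road_pixels = sum(1 for p in mask_data if p == road_class)
    return road_pixels / len(mask_data)

# 1 of 4 pixels is road (class 7) -> 25% driveable.
assert driveable_area(bytes([7, 0, 0, 3]), road_class=7) == 0.25
```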

Instance counting:

use horus::prelude::*;

fn count_instances(mask_data: &[u8]) -> usize {
    let mut seen = [false; 256];
    for &id in mask_data {
        if id > 0 { // Skip background
            seen[id as usize] = true;
        }
    }
    seen.iter().filter(|&&v| v).count()
}
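A Python equivalent of the instance counter, using a set instead of a fixed-size seen array (again a mirror for offline analysis, not a HORUS API):

```python
def count_instances(mask_data: bytes) -> int:
    """Number of distinct non-background (non-zero) instance IDs."""
    return len(set(mask_data) - {0})

assert count_instances(bytes([0, 0, 1, 1, 2, 5])) == 3  # IDs 1, 2, 5
assert count_instances(bytes([0, 0])) == 0              # background only
```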

Panoptic encoding: In panoptic mode, use u16 masks (data_size_u16()) to encode both class and instance: encoded = class_id * 256 + instance_id. This supports up to 256 classes with up to 256 instances each.
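The panoptic encoding round-trips with simple integer arithmetic; a sketch of the encode/decode pair (the helper names are illustrative):

```python
def encode_panoptic(class_id: int, instance_id: int) -> int:
    """Pack class and instance into one u16: class in the high byte."""
    assert 0 <= class_id < 256 and 0 <= instance_id < 256
    return class_id * 256 + instance_id

def decode_panoptic(encoded: int) -> tuple[int, int]:
    """Recover (class_id, instance_id) from a u16 pixel."""
    return encoded // 256, encoded % 256

# car (class 3), instance 7
pixel = encode_panoptic(3, 7)
assert pixel == 775
assert decode_panoptic(pixel) == (3, 7)
```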