Scheduler Configuration

You need to configure how your robot's nodes execute: which ones get real-time threads, how to handle deadline misses, and what order nodes tick in. This guide covers the full scheduler and node builder API.

When To Use This

  • You are moving beyond the defaults and need per-node timing, priority, or failure handling
  • You need to assign execution classes (RT, Compute, Event, AsyncIo) to different workloads
  • You are configuring a production system with watchdogs, blackbox, or RT requirements

Use Scheduler Concepts instead if you need to understand how the scheduler works before configuring it.

Creating a Scheduler

Every scheduler starts with Scheduler::new(). From there you can optionally set global parameters with builder methods before adding nodes:

// simplified
use horus::prelude::*;

fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz());     // Global tick rate (default: 100 Hz)

    // ... add nodes ...

    scheduler.run()?;
    Ok(())
}

Builder Methods

Method | Description | Default
.tick_rate(freq) | Global scheduler tick rate | 100 Hz
.deterministic(bool) | Deterministic mode: SimClock, dependency ordering, seeded RNG. See Deterministic Mode | false
.watchdog(Duration) | Frozen node detection; auto-creates a safety monitor | disabled
.blackbox(size_mb) | BlackBox flight recorder (n MB ring buffer) | disabled
.max_deadline_misses(n) | Emergency stop after n deadline misses | 100
.require_rt() | Hard real-time; panics without RT capabilities |
.prefer_rt() | Request RT features (degrades gracefully) |
.cores(&[usize]) | Pin scheduler threads to specific CPU cores | all cores
.verbose(bool) | Enable/disable non-emergency logging | true
.with_recording() | Enable record/replay |
.telemetry(endpoint) | Export telemetry to UDP/file endpoint | disabled

Adding Nodes

Add a node with scheduler.add(node), chain its configuration calls, and finalize with .build()?:

// simplified
use horus::prelude::*;

fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz());

    // Real-time motor control — runs first every tick
    scheduler.add(MotorController::new("arm"))
        .order(0)
        .rate(1000_u64.hz())
        .on_miss(Miss::SafeMode)
        .build()?;

    // Sensor node — high priority, custom rate
    scheduler.add(LidarDriver::new("/dev/lidar0"))
        .order(10)
        .rate(500_u64.hz())
        .build()?;

    // Compute-heavy planning — runs on a worker thread
    scheduler.add(PathPlanner::new())
        .order(50)
        .compute()
        .build()?;

    // Event-driven node — wakes only when the topic has new data
    scheduler.add(CollisionChecker::new())
        .on("lidar.points")
        .build()?;

    // Async I/O — network or disk, never blocks the real-time loop
    scheduler.add(TelemetryUploader::new())
        .order(200)
        .async_io()
        .rate(10_u64.hz())
        .build()?;

    scheduler.run()?;
    Ok(())
}

Execution Classes

Every node belongs to exactly one execution class. Set it in the builder chain:

Method | Class | Description
.compute() | Compute | Offloaded to a worker thread pool. Use for planning, SLAM, or ML inference.
.on(topic) | Event-Driven | Wakes only when the named topic receives new data.
.async_io() | Async I/O | Runs on an async executor. Use for network, disk, or cloud calls.

If no execution class is specified, the node defaults to BestEffort. A node is automatically promoted to the RT class when you set .rate(Frequency) (which auto-derives budget at 80% and deadline at 95% of the period).

When to Use Each Class

  • RT (auto-detected) — Motor controllers, safety monitors, sensor fusion, anything that must run every tick with bounded latency. Triggered by .rate(Frequency) on a BestEffort node.
  • .compute() — Path planning, point cloud processing, ML inference. These can take longer than a single tick without blocking RT nodes.
  • .on(topic) — Collision detection, event handlers, reactive behaviors. Only runs when there is new data, saving CPU when idle.
  • .async_io() — Telemetry upload, log shipping, cloud API calls. Never blocks any real-time or compute work.

What each class means for your robot:

  • RT — Your motor controller sends PWM commands every millisecond. Missing one cycle causes the motor to overshoot. This node needs a dedicated RT thread.
  • Compute — Your SLAM algorithm takes 50ms to process a lidar scan. If it runs on the RT thread, the motor controller misses 50 deadlines. Compute nodes run on a separate thread pool.
  • Event — Your collision detector only needs to run when new lidar data arrives, not every cycle. Event nodes sleep until their topic gets a message.
  • AsyncIo — Your telemetry node uploads data to a cloud server. Network calls can take seconds. AsyncIo nodes run on a tokio thread pool so they never block anything.
  • BestEffort — Your debug logger. Runs on the main thread when there's time, no timing guarantees.
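
The class rules above can be summarized as a small decision function. This is a minimal sketch, not HORUS code: the enum, the resolve_class helper, and its parameters are hypothetical names. It assumes that an explicit class call takes precedence over .rate(), which is how this guide's examples and Common Errors table read.

```rust
/// Execution classes named in this guide.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionClass {
    Rt,
    Compute,
    Event,
    AsyncIo,
    BestEffort,
}

/// Hypothetical helper (not part of HORUS): resolve the class a node ends up in.
/// `explicit` models a call to .compute(), .on(topic), or .async_io();
/// `has_rate` models whether .rate(freq) was set.
pub fn resolve_class(explicit: Option<ExecutionClass>, has_rate: bool) -> ExecutionClass {
    match (explicit, has_rate) {
        // Assumption: an explicit class wins even if a rate is also set.
        (Some(class), _) => class,
        // No explicit class, but a rate: promoted to RT.
        (None, true) => ExecutionClass::Rt,
        // Nothing set: the node stays BestEffort.
        (None, false) => ExecutionClass::BestEffort,
    }
}
```

Under these assumptions, a node with only .rate() resolves to Rt, while a node with both .async_io() and .rate() stays AsyncIo.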

Per-Node Configuration

Ordering and Timing

Method | Description
.order(n) | Execution priority within a tick (lower = runs first)
.rate(Frequency) | Node-specific tick rate; auto-derives budget (80%) and deadline (95%), auto-marks the node as RT
.budget(Duration) | Override auto-derived tick budget (max execution time)
.deadline(Duration) | Override auto-derived absolute deadline
.on_miss(Miss) | What to do on deadline miss (Miss::Warn, Miss::Skip, Miss::SafeMode, Miss::Stop)

RT Configuration

Method | Description
.priority(i32) | OS thread priority (SCHED_FIFO 1-99) for this node's RT thread
.core(usize) | Pin this node's RT thread to a specific CPU core
.watchdog(Duration) | Per-node watchdog timeout (overrides scheduler global)

These are only meaningful for RT nodes (nodes with .rate()). They require Linux with CAP_SYS_NICE and degrade gracefully when RT capabilities are unavailable.

// simplified
// Safety-critical node: highest priority, pinned to core 2, tight watchdog
scheduler.add(EmergencyStop::new())
    .order(0)
    .rate(1000_u64.hz())
    .priority(99)
    .core(2)
    .watchdog(2_u64.ms())
    .on_miss(Miss::Stop)
    .build()?;

// Logger: long watchdog, async I/O
scheduler.add(Logger::new())
    .order(200)
    .async_io()
    .watchdog(5_u64.secs())
    .build()?;
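
To make the watchdog timeout concrete, here is a minimal sketch of frozen-node detection using only the standard library. The Watchdog type and its methods are hypothetical and do not reflect how HORUS implements its safety monitor. The sketch assumes a node counts as frozen once more than the timeout has elapsed since its last completed tick.

```rust
use std::time::{Duration, Instant};

/// Hypothetical watchdog model; the names are illustrative, not HORUS API.
pub struct Watchdog {
    timeout: Duration,
    last_tick: Instant,
}

impl Watchdog {
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Self { timeout, last_tick: now }
    }

    /// Call each time the node finishes a tick.
    pub fn feed(&mut self, now: Instant) {
        self.last_tick = now;
    }

    /// Assumption: frozen means more than `timeout` has passed since the last tick.
    pub fn is_frozen(&self, now: Instant) -> bool {
        now.duration_since(self.last_tick) > self.timeout
    }
}
```

With a 2 ms timeout, a node that last ticked 3 ms ago is reported as frozen until feed is called again.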

Failure Policy

Method | Description
.failure_policy(policy) | Per-node failure handling (see Fault Tolerance)
.build() | Finalize and register the node (returns Result)

Order Guidelines

  • 0-9: Critical real-time (motor control, safety)
  • 10-49: High priority (sensors, fast control loops)
  • 50-99: Normal priority (processing, planning)
  • 100-199: Low priority (logging, diagnostics)
  • 200+: Background (telemetry, non-essential)
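
The effect of these order values on a single tick can be sketched with a stable sort. NodeEntry and tick_sequence are hypothetical names used only for illustration. Keeping registration order for equal values is an assumption of this sketch, not documented HORUS behavior.

```rust
/// Hypothetical node record: a name plus the value passed to .order(n).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NodeEntry {
    pub name: &'static str,
    pub order: u32,
}

/// Return node names in the sequence they would run within one tick.
/// Lower `order` runs first. The sort is stable, so ties keep
/// registration order (an assumption of this sketch).
pub fn tick_sequence(mut nodes: Vec<NodeEntry>) -> Vec<&'static str> {
    nodes.sort_by_key(|n| n.order);
    nodes.into_iter().map(|n| n.name).collect()
}
```

Registering telemetry (200), planner (50), motor (0), and lidar (10) in that sequence yields motor, lidar, planner, telemetry.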

Global Configuration with Composable Builders

Compose the builder methods you need for each deployment stage:

// simplified
use horus::prelude::*;

// Development — lightweight, profiling is always-on
let mut scheduler = Scheduler::new()
    .tick_rate(1000_u64.hz());

// Production — watchdog + blackbox
let mut scheduler = Scheduler::new()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz());

// Hard real-time — panics without RT capabilities
let mut scheduler = Scheduler::new()
    .require_rt()
    .tick_rate(1000_u64.hz());

// Safety-critical — require_rt + blackbox + strict deadline misses
let mut scheduler = Scheduler::new()
    .require_rt()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz())
    .max_deadline_misses(3);
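
As a rough illustration of what a miss limit such as .max_deadline_misses(3) implies, the sketch below counts ticks whose measured time exceeds the deadline. MissCounter is a hypothetical type. Whether HORUS counts misses globally or per node, and whether the stop fires on reaching or exceeding the limit, are assumptions here.

```rust
use std::time::Duration;

/// Hypothetical tracker for a deadline-miss limit; not HORUS API.
pub struct MissCounter {
    max_misses: u32,
    misses: u32,
}

impl MissCounter {
    pub fn new(max_misses: u32) -> Self {
        Self { max_misses, misses: 0 }
    }

    /// Record one tick. Returns true when the emergency stop should fire.
    /// Assumption: the stop triggers once the running total reaches `max_misses`.
    pub fn record(&mut self, elapsed: Duration, deadline: Duration) -> bool {
        if elapsed > deadline {
            self.misses += 1;
        }
        self.misses >= self.max_misses
    }
}
```

With a limit of 3 and a 950 us deadline, the third tick that runs longer than 950 us makes record return true.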

Execution Modes

HORUS supports sequential and parallel execution. You configure this through Scheduler::new() and per-node execution classes.

ℹ️ Quick Answer: Which Mode Should I Use?

Use the default. Scheduler::new() gives you predictable, priority-ordered execution that works for most robots.

Your Situation | Recommended Setup
Learning HORUS | Scheduler::new() with defaults
Prototyping | Scheduler::new()
Need maximum speed | Scheduler::new() with .compute() on heavy nodes
Safety-critical (medical, aerospace) | Scheduler::new().require_rt().tick_rate(1000_u64.hz()) with .rate()

Don't overthink this. Start with Scheduler::new() and configure per-node execution classes as needed.

Sequential Mode (Default)

Nodes execute one by one in priority order, with the same execution order every tick. This mode is predictable and certification-ready.

Metric | Value
Latency | ~100-500ns per node
Predictable | Yes, same order every tick
Multi-core | No (single thread)
Best For | Safety-critical, certification

When to use: Medical/surgical robots, systems needing reproducible behavior, debugging timing issues, formal verification.

use horus::prelude::*;

// Safety-critical robot controller — 1 kHz tick
// rate() auto-marks nodes as RT with derived budget + deadline
let mut scheduler = Scheduler::new()
    .require_rt()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz());
scheduler.add(safety_monitor).order(0).rate(1000_u64.hz()).on_miss(Miss::Stop).build()?;
scheduler.add(controller).order(1).rate(1000_u64.hz()).on_miss(Miss::SafeMode).build()?;
scheduler.run()?;

Parallel Mode

Parallel mode schedules independent nodes on different CPU cores. Nodes at the same order level run concurrently:

Metric | Value
Latency | Variable (depends on workload)
Predictable | Ordering within same priority level varies
Multi-core | Yes
Best For | Multi-sensor fusion, compute-heavy pipelines

When to use: Multi-sensor robots, compute-heavy pipelines, systems with many independent nodes.

use horus::prelude::*;

// Research robot with many sensors — parallel sensor processing
let mut scheduler = Scheduler::new();

// These sensor nodes run in parallel (same order number)
scheduler.add(lidar_node).order(0).build()?;
scheduler.add(camera_node).order(0).build()?;
scheduler.add(imu_node).order(0).build()?;

// Fusion runs after all sensors (higher order number)
scheduler.add(fusion_node).order(1).build()?;
scheduler.run()?;
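
The level-by-level behavior described here can be modeled with scoped threads from the standard library. This is a sketch of the idea only: run_levels is a hypothetical function, and HORUS's actual thread management is not shown in this guide. Each node is reduced to a name so the completion order can be inspected.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;
use std::thread;

/// Run nodes level by level: every node sharing an order value is spawned
/// on its own thread, and the next level starts only after all have joined.
/// Returns the node names in completion order.
pub fn run_levels(nodes: &[(&'static str, u32)]) -> Vec<&'static str> {
    // Group names by order; BTreeMap iterates keys in ascending order.
    let mut levels: BTreeMap<u32, Vec<&'static str>> = BTreeMap::new();
    for &(name, order) in nodes {
        levels.entry(order).or_default().push(name);
    }

    let finished = Mutex::new(Vec::new());
    for names in levels.values() {
        // All threads spawned in a scope are joined when the scope ends.
        thread::scope(|s| {
            for &name in names {
                let finished = &finished;
                s.spawn(move || {
                    // A real node would do its tick work here.
                    finished.lock().unwrap().push(name);
                });
            }
        });
    }
    finished.into_inner().unwrap()
}
```

Running the three sensors at order 0 and fusion at order 1 always places fusion last, while the order among the three sensors can vary from run to run.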

Mode Comparison

Feature | Sequential | Parallel
Predictable Order | Yes | Per-priority level
Multi-core | No | Yes
Best Latency | 87-313ns | Variable
Certification Ready | Yes | No

DurationExt and Frequency

HORUS provides ergonomic extension methods for creating Duration and Frequency values, replacing verbose Duration::from_micros(200) calls:

Duration Helpers

// simplified
use horus::prelude::*;

// Microseconds
let budget = 200_u64.us();     // Duration::from_micros(200)

// Milliseconds
let deadline = 1_u64.ms();     // Duration::from_millis(1)

// Seconds
let timeout = 5_u64.secs();    // Duration::from_secs(5)

These helpers are available on u64 values through the DurationExt trait.
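
Helpers like these are typically provided through the extension trait pattern: a trait with the helper methods, implemented for u64. The sketch below shows that pattern with a hypothetical trait named DurationSketch. It is not the HORUS definition of DurationExt, which may differ.

```rust
use std::time::Duration;

/// Hypothetical stand-in for an extension trait such as DurationExt.
pub trait DurationSketch {
    fn us(self) -> Duration;
    fn ms(self) -> Duration;
    fn secs(self) -> Duration;
}

impl DurationSketch for u64 {
    /// Interpret the value as microseconds.
    fn us(self) -> Duration {
        Duration::from_micros(self)
    }

    /// Interpret the value as milliseconds.
    fn ms(self) -> Duration {
        Duration::from_millis(self)
    }

    /// Interpret the value as seconds.
    fn secs(self) -> Duration {
        Duration::from_secs(self)
    }
}
```

The methods become callable on u64 values wherever the trait is in scope, which is why the examples in this guide import the prelude first.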

Frequency Type

The .hz() method creates a Frequency that auto-derives timing parameters:

// simplified
use horus::prelude::*;

let freq = 100_u64.hz();

let hz = freq.value();                   // 100.0 Hz
let period = freq.period();              // 10ms (1/frequency)
let budget = freq.budget_default();      // 8ms  (80% of period)
let deadline = freq.deadline_default();  // 9.5ms (95% of period)

Use Frequency with the node builder's .rate() method to auto-configure RT timing:

// simplified
// Auto-derives budget (80% period) and deadline (95% period)
// Also auto-marks the node as RT
scheduler.add(motor_ctrl)
    .order(0)
    .rate(500_u64.hz())   // period=2ms, budget=1.6ms, deadline=1.9ms
    .on_miss(Miss::Skip)
    .build()?;

Method | Returns | Description
.us() | Duration | Microseconds
.ms() | Duration | Milliseconds
.secs() | Duration | Seconds
.hz() | Frequency | Frequency in Hz
freq.value() | f64 | Frequency in Hz
freq.period() | Duration | 1/frequency
freq.budget_default() | Duration | 80% of period
freq.deadline_default() | Duration | 95% of period

Design Decisions

Why auto-derive budget and deadline from .rate()?

Most developers think in terms of "this node runs at 1kHz" rather than "this node has an 800us budget and 950us deadline." Auto-derivation (budget = 80% period, deadline = 95% period) provides safe defaults without requiring timing expertise. Override with explicit .budget() and .deadline() when profiling shows different requirements.
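
The arithmetic behind these defaults is simple enough to check by hand. The sketch below reproduces it with std::time::Duration. DerivedTiming and derive_timing are hypothetical names, and taking the rate as a whole number of Hz is a simplification.

```rust
use std::time::Duration;

/// Timing values derived from a rate; field names are illustrative.
#[derive(Debug, PartialEq, Eq)]
pub struct DerivedTiming {
    pub period: Duration,
    pub budget: Duration,
    pub deadline: Duration,
}

/// Derive period, budget (80% of period), and deadline (95% of period)
/// from a rate in Hz. Integer nanosecond arithmetic avoids float rounding.
pub fn derive_timing(rate_hz: u64) -> DerivedTiming {
    assert!(rate_hz > 0, "rate must be positive");
    let period_ns = 1_000_000_000 / rate_hz;
    DerivedTiming {
        period: Duration::from_nanos(period_ns),
        budget: Duration::from_nanos(period_ns * 80 / 100),
        deadline: Duration::from_nanos(period_ns * 95 / 100),
    }
}
```

For 1000 Hz this gives a 1 ms period, an 800 us budget, and a 950 us deadline, matching the figures quoted above.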

Why composable builders instead of presets?

Early versions of HORUS had presets like deploy() and hard_rt(). These were removed because real systems need specific combinations of features. Composable builders let you pick exactly what you need: .watchdog(500_u64.ms()).blackbox(64) is clearer than a preset that might enable features you do not want.

Why .order() instead of automatic dependency ordering?

Explicit ordering is predictable and debuggable. Automatic dependency ordering (available in deterministic mode) requires publishers() and subscribers() metadata on every node. In normal mode, .order() gives you full control without metadata overhead.

Trade-offs

Gain | Cost
Per-node execution classes match workload to executor | More configuration decisions when adding nodes
Auto-derived timing from .rate() reduces configuration | 80%/95% defaults may not match your workload profile
Composable builders allow precise feature selection | No single-line "production mode" shortcut
Explicit .order() is predictable | Must be maintained manually as nodes are added

Common Errors

Symptom | Cause | Fix
Node runs as BestEffort when you expected RT | .rate() not set, or .compute() overrides it | Set .rate(freq) and do not combine with .compute()
"Cannot set SCHED_FIFO" at startup | Missing RT permissions | See RT Setup for limits.conf and setcap
Deadline misses on every tick | Budget too tight for actual computation time | Profile with horus monitor, then increase .budget() or lower .rate()
Node never ticks | .on(topic) set but no publisher on that topic | Verify another node publishes to the same topic name
.build() returns error | Conflicting configuration (e.g., .on() with .budget()) | Event nodes cannot have budgets. Remove timing constraints from event nodes
Nodes execute in wrong order | .order() values not set or identical | Assign distinct .order() values. Lower = runs first

See Also