Scheduler Configuration
You need to configure how your robot's nodes execute: which ones get real-time threads, how to handle deadline misses, and what order nodes tick in. This guide covers the full scheduler and node builder API.
When To Use This
- You are moving beyond the defaults and need per-node timing, priority, or failure handling
- You need to assign execution classes (RT, Compute, Event, AsyncIo) to different workloads
- You are configuring a production system with watchdogs, blackbox, or RT requirements
Use Scheduler Concepts instead if you need to understand how the scheduler works before configuring it.
Prerequisites
- Familiarity with Nodes and Scheduler
- Understanding of Execution Classes
Creating a Scheduler
Every scheduler starts with Scheduler::new(). From there you can optionally set global parameters with builder methods before adding nodes:
```rust
// simplified
use horus::prelude::*;

fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz()); // Global tick rate (default: 100 Hz)

    // ... add nodes ...

    scheduler.run()?;
    Ok(())
}
```
Builder Methods
| Method | Description | Default |
|---|---|---|
| .tick_rate(freq) | Global scheduler tick rate | 100 Hz |
| .deterministic(bool) | Deterministic mode — SimClock, dependency ordering, seeded RNG. See Deterministic Mode | false |
| .watchdog(Duration) | Frozen node detection — auto-creates safety monitor | disabled |
| .blackbox(size_mb) | BlackBox flight recorder (n MB ring buffer) | disabled |
| .max_deadline_misses(n) | Emergency stop after n deadline misses | 100 |
| .require_rt() | Hard real-time — panics without RT capabilities | — |
| .prefer_rt() | Request RT features (degrades gracefully) | — |
| .cores(&[usize]) | Pin scheduler threads to specific CPU cores | all cores |
| .verbose(bool) | Enable/disable non-emergency logging | true |
| .with_recording() | Enable record/replay | — |
| .telemetry(endpoint) | Export telemetry to UDP/file endpoint | disabled |
Adding Nodes
Add nodes with scheduler.add(node), chain configuration calls, and finalize with .build()?:
```rust
// simplified
use horus::prelude::*;

fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz());

    // Real-time motor control — runs first every tick
    scheduler.add(MotorController::new("arm"))
        .order(0)
        .rate(1000_u64.hz())
        .on_miss(Miss::SafeMode)
        .build()?;

    // Sensor node — high priority, custom rate
    scheduler.add(LidarDriver::new("/dev/lidar0"))
        .order(10)
        .rate(500_u64.hz())
        .build()?;

    // Compute-heavy planning — runs on a worker thread
    scheduler.add(PathPlanner::new())
        .order(50)
        .compute()
        .build()?;

    // Event-driven node — wakes only when the topic has new data
    scheduler.add(CollisionChecker::new())
        .on("lidar.points")
        .build()?;

    // Async I/O — network or disk, never blocks the real-time loop
    scheduler.add(TelemetryUploader::new())
        .order(200)
        .async_io()
        .rate(10_u64.hz())
        .build()?;

    scheduler.run()?;
    Ok(())
}
```
Execution Classes
Every node belongs to exactly one execution class. Set it in the builder chain:
| Method | Class | Description |
|---|---|---|
| .compute() | Compute | Offloaded to a worker thread pool. Use for planning, SLAM, or ML inference. |
| .on(topic) | Event-Driven | Wakes only when the named topic receives new data. |
| .async_io() | Async I/O | Runs on an async executor. Use for network, disk, or cloud calls. |
If no execution class is specified, the node defaults to BestEffort. A node is automatically promoted to the RT class when you set .rate(Frequency) (which auto-derives budget at 80% and deadline at 95% of the period).
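The default-and-promotion rule can be sketched with hypothetical types (a conceptual model, not the real HORUS API): an explicit class always wins, otherwise setting a rate promotes the node to RT, otherwise it stays BestEffort.

```rust
// Conceptual model of execution-class selection (hypothetical types).
#[derive(Debug, PartialEq)]
#[allow(dead_code)]
enum ExecClass { BestEffort, Rt, Compute, Event, AsyncIo }

struct NodeConfig {
    rate_hz: Option<u64>,     // set via .rate(freq)
    class: Option<ExecClass>, // set via .compute(), .on(), .async_io()
}

impl NodeConfig {
    fn effective_class(self) -> ExecClass {
        match (self.class, self.rate_hz) {
            (Some(c), _) => c,                // explicit class wins
            (None, Some(_)) => ExecClass::Rt, // .rate() promotes to RT
            (None, None) => ExecClass::BestEffort,
        }
    }
}

fn main() {
    let motor = NodeConfig { rate_hz: Some(1000), class: None };
    assert_eq!(motor.effective_class(), ExecClass::Rt);

    let logger = NodeConfig { rate_hz: None, class: None };
    assert_eq!(logger.effective_class(), ExecClass::BestEffort);
}
```

This also explains the common-errors entry below: combining .compute() with .rate() keeps the node in the Compute class rather than RT.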
When to Use Each Class
- RT (auto-detected) — Motor controllers, safety monitors, sensor fusion, anything that must run every tick with bounded latency. Triggered by .rate(Frequency) on a BestEffort node.
- .compute() — Path planning, point cloud processing, ML inference. These can take longer than a single tick without blocking RT nodes.
- .on(topic) — Collision detection, event handlers, reactive behaviors. Only runs when there is new data, saving CPU when idle.
- .async_io() — Telemetry upload, log shipping, cloud API calls. Never blocks any real-time or compute work.
What each class means for your robot:
- RT — Your motor controller sends PWM commands every millisecond. Missing one cycle causes the motor to overshoot. This node needs a dedicated RT thread.
- Compute — Your SLAM algorithm takes 50ms to process a lidar scan. If it runs on the RT thread, the motor controller misses 50 deadlines. Compute nodes run on a separate thread pool.
- Event — Your collision detector only needs to run when new lidar data arrives, not every cycle. Event nodes sleep until their topic gets a message.
- AsyncIo — Your telemetry node uploads data to a cloud server. Network calls can take seconds. AsyncIo nodes run on a tokio thread pool so they never block anything.
- BestEffort — Your debug logger. Runs on the main thread when there's time, no timing guarantees.
Per-Node Configuration
Ordering and Timing
| Method | Description |
|---|---|
| .order(n) | Execution priority within a tick (lower = runs first) |
| .rate(Frequency) | Node-specific tick rate — auto-derives budget (80%) and deadline (95%), auto-marks as RT |
| .budget(Duration) | Override auto-derived tick budget (max execution time) |
| .deadline(Duration) | Override auto-derived absolute deadline |
| .on_miss(Miss) | What to do on deadline miss (Miss::Warn, Miss::Skip, Miss::SafeMode, Miss::Stop) |
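As a conceptual model (hypothetical types; the real scheduler's internals may differ), the interaction between a node's Miss policy and the scheduler-wide max_deadline_misses limit can be sketched as:

```rust
// Conceptual sketch of deadline-miss handling (hypothetical types).
// The per-node Miss policy decides the normal reaction; the global
// max_deadline_misses limit forces an emergency stop once exceeded.
#[derive(Clone, Copy)]
enum Miss { Warn, Skip, SafeMode, Stop }

#[derive(Debug, PartialEq)]
enum Action { LogWarning, SkipTick, EnterSafeMode, EmergencyStop }

struct MissTracker {
    count: u32, // deadline misses seen so far
    max: u32,   // scheduler-wide max_deadline_misses
}

impl MissTracker {
    fn on_deadline_miss(&mut self, policy: Miss) -> Action {
        self.count += 1;
        if self.count >= self.max {
            return Action::EmergencyStop; // global limit always wins
        }
        match policy {
            Miss::Warn => Action::LogWarning,
            Miss::Skip => Action::SkipTick,
            Miss::SafeMode => Action::EnterSafeMode,
            Miss::Stop => Action::EmergencyStop,
        }
    }
}

fn main() {
    let mut t = MissTracker { count: 0, max: 3 };
    assert_eq!(t.on_deadline_miss(Miss::Warn), Action::LogWarning);
    assert_eq!(t.on_deadline_miss(Miss::Warn), Action::LogWarning);
    // Third miss reaches the global limit: emergency stop regardless of policy.
    assert_eq!(t.on_deadline_miss(Miss::Warn), Action::EmergencyStop);
}
```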
RT Configuration
| Method | Description |
|---|---|
| .priority(i32) | OS thread priority (SCHED_FIFO 1-99) for this node's RT thread |
| .core(usize) | Pin this node's RT thread to a specific CPU core |
| .watchdog(Duration) | Per-node watchdog timeout (overrides scheduler global) |
These are only meaningful for RT nodes (nodes with .rate()). They require Linux with CAP_SYS_NICE and degrade gracefully when RT capabilities are unavailable.
```rust
// simplified
// Safety-critical node: highest priority, pinned to core 2, tight watchdog
scheduler.add(EmergencyStop::new())
    .order(0)
    .rate(1000_u64.hz())
    .priority(99)
    .core(2)
    .watchdog(2_u64.ms())
    .on_miss(Miss::Stop)
    .build()?;

// Logger: long watchdog, async I/O
scheduler.add(Logger::new())
    .order(200)
    .async_io()
    .watchdog(5_u64.secs())
    .build()?;
```
Failure Policy
| Method | Description |
|---|---|
| .failure_policy(policy) | Per-node failure handling (see Fault Tolerance) |
| .build() | Finalize and register the node (returns Result) |
Order Guidelines
- 0-9: Critical real-time (motor control, safety)
- 10-49: High priority (sensors, fast control loops)
- 50-99: Normal priority (processing, planning)
- 100-199: Low priority (logging, diagnostics)
- 200+: Background (telemetry, non-essential)
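To illustrate the bands, here is a self-contained sketch (plain Rust, hypothetical node names) of how the scheduler orders nodes within a tick: lower order values run first.

```rust
// Sort nodes by their .order() value, the way a tick would run them.
// Node names and order values are illustrative, matching the bands above.
fn tick_order(mut nodes: Vec<(&str, u32)>) -> Vec<&str> {
    nodes.sort_by_key(|&(_, order)| order);
    nodes.into_iter().map(|(name, _)| name).collect()
}

fn main() {
    let order = tick_order(vec![
        ("telemetry", 200),     // background
        ("planner", 50),        // normal priority
        ("motor_control", 0),   // critical real-time
        ("lidar", 10),          // high priority
        ("logger", 100),        // low priority
    ]);
    assert_eq!(order, ["motor_control", "lidar", "planner", "logger", "telemetry"]);
}
```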
Global Configuration with Composable Builders
Compose the builder methods you need for each deployment stage:
```rust
// simplified
use horus::prelude::*;

// Development — lightweight, profiling is always-on
let mut scheduler = Scheduler::new()
    .tick_rate(1000_u64.hz());

// Production — watchdog + blackbox
let mut scheduler = Scheduler::new()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz());

// Hard real-time — panics without RT capabilities
let mut scheduler = Scheduler::new()
    .require_rt()
    .tick_rate(1000_u64.hz());

// Safety-critical — require_rt + blackbox + strict deadline misses
let mut scheduler = Scheduler::new()
    .require_rt()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz())
    .max_deadline_misses(3);
```
Execution Modes
HORUS supports sequential and parallel execution. You configure this through Scheduler::new() and per-node execution classes.
Use the default. Scheduler::new() gives you predictable, priority-ordered execution that works for most robots.
| Your Situation | Recommended Setup |
|---|---|
| Learning HORUS | Scheduler::new() with defaults |
| Prototyping | Scheduler::new() |
| Need maximum speed | Scheduler::new() with .compute() on heavy nodes |
| Safety-critical (medical, aerospace) | Scheduler::new().require_rt().tick_rate(1000_u64.hz()) with .rate() |
Don't overthink this. Start with Scheduler::new() and configure per-node execution classes as needed.
Sequential Mode (Default)
Nodes execute one-by-one in priority order — same execution order every tick. Predictable and certification-ready.
| Metric | Value |
|---|---|
| Latency | ~100-500ns per node |
| Predictable | Yes — same order every tick |
| Multi-core | No (single thread) |
| Best For | Safety-critical, certification |
When to use: Medical/surgical robots, systems needing reproducible behavior, debugging timing issues, formal verification.
```rust
use horus::prelude::*;

// Safety-critical robot controller — 1 kHz tick
// rate() auto-marks nodes as RT with derived budget + deadline
let mut scheduler = Scheduler::new()
    .require_rt()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz());

scheduler.add(safety_monitor).order(0).rate(1000_u64.hz()).on_miss(Miss::Stop).build()?;
scheduler.add(controller).order(1).rate(1000_u64.hz()).on_miss(Miss::SafeMode).build()?;

scheduler.run()?;
```
Parallel Mode
Schedules independent nodes on different CPU cores. Nodes at the same order level run concurrently:
| Metric | Value |
|---|---|
| Latency | Variable (depends on workload) |
| Predictable | Ordering within same priority level varies |
| Multi-core | Yes |
| Best For | Multi-sensor fusion, compute-heavy pipelines |
When to use: Multi-sensor robots, compute-heavy pipelines, systems with many independent nodes.
```rust
use horus::prelude::*;

// Research robot with many sensors — parallel sensor processing
let mut scheduler = Scheduler::new();

// These sensor nodes run in parallel (same order number)
scheduler.add(lidar_node).order(0).build()?;
scheduler.add(camera_node).order(0).build()?;
scheduler.add(imu_node).order(0).build()?;

// Fusion runs after all sensors (higher order number)
scheduler.add(fusion_node).order(1).build()?;

scheduler.run()?;
```
Mode Comparison
| Feature | Sequential | Parallel |
|---|---|---|
| Predictable Order | Yes | Per-priority level |
| Multi-core | No | Yes |
| Best Latency | 87-313ns | Variable |
| Certification Ready | Yes | No |
DurationExt and Frequency
HORUS provides ergonomic extension methods for creating Duration and Frequency values, replacing verbose Duration::from_micros(200) calls:
Duration Helpers
```rust
// simplified
use horus::prelude::*;

// Microseconds
let budget = 200_u64.us(); // Duration::from_micros(200)

// Milliseconds
let deadline = 1_u64.ms(); // Duration::from_millis(1)

// Seconds
let timeout = 5_u64.secs(); // Duration::from_secs(5)
```
Works on u64 literals via the DurationExt trait.
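A sketch of how such an extension trait can be defined in plain Rust (the actual HORUS definition of DurationExt may differ):

```rust
use std::time::Duration;

// Hypothetical sketch of a DurationExt-style extension trait:
// adds .us()/.ms()/.secs() methods to u64 literals.
trait DurationExtSketch {
    fn us(self) -> Duration;
    fn ms(self) -> Duration;
    fn secs(self) -> Duration;
}

impl DurationExtSketch for u64 {
    fn us(self) -> Duration { Duration::from_micros(self) }
    fn ms(self) -> Duration { Duration::from_millis(self) }
    fn secs(self) -> Duration { Duration::from_secs(self) }
}

fn main() {
    // The trait must be in scope for the methods to resolve.
    assert_eq!(200_u64.us(), Duration::from_micros(200));
    assert_eq!(1_u64.ms(), Duration::from_millis(1));
    assert_eq!(5_u64.secs(), Duration::from_secs(5));
}
```

This extension-trait pattern is why `use horus::prelude::*;` is needed: the methods only resolve when the trait is in scope.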
Frequency Type
The .hz() method creates a Frequency that auto-derives timing parameters:
```rust
// simplified
use horus::prelude::*;

let freq = 100_u64.hz();
freq.value();            // 100.0 Hz
freq.period();           // 10 ms (1/frequency)
freq.budget_default();   // 8 ms (80% of period)
freq.deadline_default(); // 9.5 ms (95% of period)
```
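The 80%/95% derivation can be checked with plain std::time::Duration arithmetic (derive_timing is a hypothetical helper mirroring the documented defaults, not the real Frequency implementation):

```rust
use std::time::Duration;

// period = 1/f; budget = 80% of period; deadline = 95% of period.
fn derive_timing(hz: u64) -> (Duration, Duration, Duration) {
    let period_us = 1_000_000 / hz; // period in microseconds
    let budget = Duration::from_micros(period_us * 80 / 100);
    let deadline = Duration::from_micros(period_us * 95 / 100);
    (Duration::from_micros(period_us), budget, deadline)
}

fn main() {
    // 100 Hz -> 10 ms period, 8 ms budget, 9.5 ms deadline
    let (period, budget, deadline) = derive_timing(100);
    assert_eq!(period, Duration::from_millis(10));
    assert_eq!(budget, Duration::from_millis(8));
    assert_eq!(deadline, Duration::from_micros(9_500));
}
```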
Use Frequency with the node builder's .rate() method to auto-configure RT timing:
// simplified
// Auto-derives budget (80% period) and deadline (95% period)
// Also auto-marks the node as RT
scheduler.add(motor_ctrl)
.order(0)
.rate(500_u64.hz()) // period=2ms, budget=1.6ms, deadline=1.9ms
.on_miss(Miss::Skip)
.build()?;
| Method | Returns | Description |
|---|---|---|
| .us() | Duration | Microseconds |
| .ms() | Duration | Milliseconds |
| .secs() | Duration | Seconds |
| .hz() | Frequency | Frequency in Hz |
| freq.value() | f64 | Frequency in Hz |
| freq.period() | Duration | 1/frequency |
| freq.budget_default() | Duration | 80% of period |
| freq.deadline_default() | Duration | 95% of period |
Design Decisions
Why auto-derive budget and deadline from .rate()?
Most developers think in terms of "this node runs at 1kHz" rather than "this node has an 800us budget and 950us deadline." Auto-derivation (budget = 80% period, deadline = 95% period) provides safe defaults without requiring timing expertise. Override with explicit .budget() and .deadline() when profiling shows different requirements.
Why composable builders instead of presets?
Early versions of HORUS had presets like deploy() and hard_rt(). These were removed because real systems need specific combinations of features. Composable builders let you pick exactly what you need: .watchdog(500_u64.ms()).blackbox(64) is clearer than a preset that might enable features you do not want.
Why .order() instead of automatic dependency ordering?
Explicit ordering is predictable and debuggable. Automatic dependency ordering (available in deterministic mode) requires publishers() and subscribers() metadata on every node. In normal mode, .order() gives you full control without metadata overhead.
Trade-offs
| Gain | Cost |
|---|---|
| Per-node execution classes match workload to executor | More configuration decisions when adding nodes |
| Auto-derived timing from .rate() reduces configuration | 80%/95% defaults may not match your workload profile |
| Composable builders allow precise feature selection | No single-line "production mode" shortcut |
| Explicit .order() is predictable | Must be maintained manually as nodes are added |
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| Node runs as BestEffort when you expected RT | .rate() not set, or .compute() overrides it | Set .rate(freq) and do not combine with .compute() |
| "Cannot set SCHED_FIFO" at startup | Missing RT permissions | See RT Setup for limits.conf and setcap |
| Deadline misses on every tick | Budget too tight for actual computation time | Profile with horus monitor, then increase .budget() or lower .rate() |
| Node never ticks | .on(topic) set but no publisher on that topic | Verify another node publishes to the same topic name |
| .build() returns error | Conflicting configuration (e.g., .on() with .budget()) | Event nodes cannot have budgets; remove timing constraints from event nodes |
| Nodes execute in wrong order | .order() values not set or identical | Assign distinct .order() values. Lower = runs first |
See Also
- Scheduler Concepts — How the scheduler works
- Execution Classes — The 5 execution classes and when to use each
- Safety Monitor — Watchdog and deadline enforcement
- Fault Tolerance — Failure policies and recovery
- RT Setup — Linux real-time kernel configuration