Scheduler: Running Your Nodes

A robot has a camera reading frames, a controller computing motor commands, a safety system watching for collisions, and a logger recording everything. Each of these is a separate node. But who decides which one runs first? Who makes sure the safety check happens before the motor command? Who handles it when the camera takes too long? And who stops all the motors cleanly when you press Ctrl+C?

You could write all of this coordination yourself — loops, threads, timers, signal handlers — but you'd be writing a scheduler. Every robotics team eventually builds one, and most get it wrong. Race conditions, priority inversions, missed deadlines, and unclean shutdowns are the norm.

HORUS gives you a scheduler that handles all of this. You tell it which nodes to run, in what order, and at what speed. It handles the rest: timing, ordering, monitoring, and shutdown.

For the full reference with real-time configuration, watchdog, deadline monitoring, composable builders, and deterministic mode, see Scheduler — Full Reference.

How It Works

What Is the Scheduler?

The scheduler is the engine that runs your nodes. It does three things:

  1. Calls init() on every node — once, at startup. This is where nodes connect to hardware, open files, or set up state.
  2. Calls tick() on every node — repeatedly, in order, at a configurable speed. This is your main logic.
  3. Calls shutdown() on every node — once, when the program exits. This is where nodes stop motors, close connections, and clean up.

You don't write loops or manage threads. You add nodes, configure their order and timing, and let the scheduler handle everything.

The scheduler lifecycle: initialize once, tick repeatedly, shut down once

Basic Usage

// simplified
use horus::prelude::*;

fn main() -> Result<()> {
    let mut scheduler = Scheduler::new();

    // Add nodes with execution order
    scheduler.add(SensorNode::new()?)
        .order(0)       // Runs first every tick
        .build()?;

    scheduler.add(ControlNode::new()?)
        .order(1)       // Runs second
        .build()?;

    scheduler.add(LoggerNode::new()?)
        .order(2)       // Runs third
        .build()?;

    // Run until Ctrl+C
    scheduler.run()?;
    Ok(())
}

Execution Order

Nodes execute in order every tick, then the cycle repeats

Lower order number = runs first. This is how you ensure data flows correctly through your system:

| Order | Node | Why This Order |
|-------|------|----------------|
| 0 | SensorNode | Reads data from hardware first |
| 1 | ControlNode | Processes sensor data and computes commands |
| 2 | LoggerNode | Records everything that happened this tick |

If two nodes have the same order, they run in the order they were added to the scheduler. The scheduler provides order-range guidelines:

| Range | Category | Examples |
|-------|----------|----------|
| 0–9 | Critical | Emergency stop, safety monitor |
| 10–49 | High priority | Sensor readers, actuator controllers |
| 50–99 | Normal | Processing, planning, fusion |
| 100–199 | Low priority | Logging, telemetry, visualization |
| 200+ | Background | Diagnostics, statistics |
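The tie-break rule (equal order falls back to add order) is exactly what a stable sort provides. The following is a standalone sketch of that idea, with NodeEntry and its fields invented for illustration, not taken from HORUS internals:

```rust
// Illustrative sketch, not HORUS internals: a stable sort reproduces
// the documented tie-break (equal order => insertion order wins).
struct NodeEntry {
    name: &'static str,
    order: u32, // value passed to .order()
}

fn ordered_names(mut nodes: Vec<NodeEntry>) -> Vec<&'static str> {
    // sort_by_key is stable: entries with equal keys keep their
    // relative (insertion) order
    nodes.sort_by_key(|n| n.order);
    nodes.iter().map(|n| n.name).collect()
}

fn main() {
    let nodes = vec![
        NodeEntry { name: "Safety", order: 0 },  // added first
        NodeEntry { name: "Sensor", order: 0 },  // same order, added second
        NodeEntry { name: "Control", order: 1 },
    ];
    // Safety runs before Sensor: same order, added earlier
    println!("{:?}", ordered_names(nodes));
}
```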

Setting the Tick Rate

The tick rate is how many times per second the scheduler runs through all nodes. It's measured in Hertz (Hz) — a unit that means "times per second."

What is a Hertz? 1 Hz means once per second. 100 Hz means 100 times per second. 1000 Hz (also written as 1 kHz) means 1000 times per second. A motor controller typically runs at 100–1000 Hz because it needs to adjust the motor hundreds of times per second for smooth motion. A logger might only need 1–10 Hz.

The default tick rate is 100 Hz (100 times per second). Change it with .tick_rate():

// simplified
use horus::prelude::*;

let mut scheduler = Scheduler::new()
    .tick_rate(100_u64.hz());  // 100 times per second

The .hz() syntax is HORUS's DurationExt trait — it converts a number into a frequency. Similarly, .ms() creates milliseconds and .us() creates microseconds:

// simplified
100_u64.hz()    // 100 Hz (frequency)
5_u64.ms()      // 5 milliseconds (duration)
200_u64.us()    // 200 microseconds (duration)
1_u64.secs()    // 1 second (duration)
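For intuition, here is how such an extension trait could be built on std::time::Duration. This is a hand-written sketch, not HORUS's DurationExt; in particular, HORUS's .hz() yields a frequency value, so the sketch uses a differently named hz_period() that returns the equivalent tick period:

```rust
use std::time::Duration;

// Hand-written sketch in the spirit of DurationExt; not HORUS's code.
trait DurationExtSketch {
    fn ms(self) -> Duration;
    fn us(self) -> Duration;
    fn secs(self) -> Duration;
    fn hz_period(self) -> Duration;
}

impl DurationExtSketch for u64 {
    fn ms(self) -> Duration { Duration::from_millis(self) }
    fn us(self) -> Duration { Duration::from_micros(self) }
    fn secs(self) -> Duration { Duration::from_secs(self) }
    // Period of an N Hz loop: one second divided by N
    fn hz_period(self) -> Duration { Duration::from_secs(1) / self as u32 }
}

fn main() {
    assert_eq!(5_u64.ms(), Duration::from_millis(5));
    assert_eq!(200_u64.us(), Duration::from_micros(200));
    // 100 Hz means a 10 ms period between ticks
    assert_eq!(100_u64.hz_period(), Duration::from_millis(10));
    println!("100 Hz period: {:?}", 100_u64.hz_period());
}
```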

Per-Node Rates

Not every node needs to run at the same speed. A fast sensor might need 1000 Hz while a logger only needs 10 Hz. Set per-node rates:

// simplified
scheduler.add(FastSensor::new()?)
    .order(0)
    .rate(1000_u64.hz())  // This node ticks at 1 kHz
    .build()?;

scheduler.add(SlowLogger::new()?)
    .order(1)
    .rate(10_u64.hz())    // This node ticks at 10 Hz
    .build()?;

The scheduler automatically skips ticks for slower nodes: at a 1 kHz global rate, a 10 Hz node like SlowLogger has its tick() called only every 100th cycle.
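The skip arithmetic can be pictured as a tick divisor. This standalone sketch shows the mechanism under that assumption; it is my illustration of how such skipping can work, not verified HORUS internals:

```rust
// Sketch: deriving a tick divisor from global and per-node rates.
fn tick_divisor(global_hz: u64, node_hz: u64) -> u64 {
    // A 10 Hz node under a 1 kHz scheduler runs every 100th cycle
    global_hz / node_hz
}

fn should_tick(global_tick: u64, divisor: u64) -> bool {
    global_tick % divisor == 0
}

fn main() {
    let div = tick_divisor(1000, 10);
    assert_eq!(div, 100);
    // Over one second of 1 kHz ticks, the node fires 10 times
    let fired = (0..1000u64).filter(|&t| should_tick(t, div)).count();
    println!("divisor {}, fired {} times", div, fired);
}
```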

Setting .rate() on a node has a side effect: it automatically marks the node as real-time and derives a budget (80% of the period) and deadline (95% of the period). A 1000 Hz node gets a 0.8 ms budget and a 0.95 ms deadline. You can override these with .budget() and .deadline().

Graceful Shutdown

When you press Ctrl+C, the scheduler doesn't just kill everything. It:

  1. Stops calling tick() on all nodes
  2. Calls shutdown() on every node in reverse order — the last-added node shuts down first
  3. Exits cleanly

This reverse order is critical for safety. Consider: your motor controller (order 1) depends on sensor data from the sensor node (order 0). During shutdown, you want the motor controller to stop the motors before the sensor node disconnects — otherwise the motor controller loses its data source while motors are still spinning.

// simplified
impl Node for MotorController {
    fn name(&self) -> &str { "Motor" }

    fn tick(&mut self) {
        if let Some(cmd) = self.commands.recv() {
            self.motor.set_velocity(cmd.linear);
        }
    }

    // SAFETY: always stop motors in shutdown — a spinning motor
    // with no controller is a safety hazard
    fn shutdown(&mut self) -> Result<()> {
        self.motor.set_velocity(0.0);
        println!("Motor safely stopped");
        Ok(())
    }
}

Always implement shutdown() for nodes that control physical hardware. Without it, motors keep spinning, grippers stay clamped, and heaters stay on when your program exits. The scheduler guarantees shutdown() is called even on Ctrl+C — but only if you implement it.
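The reverse-order rule itself is simple to picture: iterate the add-ordered node list backwards. A minimal standalone sketch, with node names invented for illustration:

```rust
// Minimal sketch of reverse-order shutdown; names are illustrative.
fn shutdown_sequence(added: &[&'static str]) -> Vec<&'static str> {
    // Last-added node shuts down first
    added.iter().rev().copied().collect()
}

fn main() {
    let added = ["Sensor", "MotorController", "Logger"];
    // MotorController stops its motors while Sensor is still connected
    println!("{:?}", shutdown_sequence(&added));
}
```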

Timing: Budgets, Deadlines, and Misses

Robots operate in the real world. A motor controller that takes 5 ms instead of 1 ms doesn't just "slow down" — it causes the robot to overshoot its target, potentially colliding with objects or people. The scheduler monitors timing and takes action when nodes run too long.

What Is a Budget?

A budget is the maximum time a node's tick() should take. If you set a budget of 800 µs (microseconds), the scheduler expects tick() to finish within 800 µs. If it takes longer, that's a deadline miss.

// simplified
scheduler.add(MotorController::new()?)
    .order(1)
    .rate(1000_u64.hz())        // 1 kHz → 1 ms per tick
    .budget(800_u64.us())       // tick() should finish in 800 µs
    .build()?;

If you set .rate() without an explicit .budget(), HORUS auto-derives the budget as 80% of the period. A 1000 Hz node (1 ms period) gets a 0.8 ms (800 µs) budget. This leaves 20% headroom for scheduling overhead.

What Is a Deadline?

A deadline is the absolute latest a tick() can finish before the scheduler considers it a critical problem. The budget is a soft target; the deadline is a hard wall.

// simplified
scheduler.add(MotorController::new()?)
    .order(1)
    .rate(1000_u64.hz())
    .budget(800_u64.us())       // Soft target: finish in 800 µs
    .deadline(950_u64.us())     // Hard wall: must finish by 950 µs
    .build()?;

If you set .rate() without an explicit .deadline(), HORUS auto-derives it as 95% of the period.
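The 80% and 95% derivations are plain percentage-of-period arithmetic. A standalone sketch of the math (the percentages come from the text above; the integer-nanosecond implementation is my own, not HORUS's actual code):

```rust
use std::time::Duration;

// Sketch: auto-deriving budget (80% of period) and deadline (95%)
// from a node's rate, per the rules described in the text.
fn derive_timing(rate_hz: u64) -> (Duration, Duration) {
    let period_ns = 1_000_000_000 / rate_hz;
    let budget = Duration::from_nanos(period_ns * 80 / 100);
    let deadline = Duration::from_nanos(period_ns * 95 / 100);
    (budget, deadline)
}

fn main() {
    // 1 kHz -> 1 ms period -> 800 us budget, 950 us deadline
    let (budget, deadline) = derive_timing(1000);
    assert_eq!(budget, Duration::from_micros(800));
    assert_eq!(deadline, Duration::from_micros(950));
    println!("budget {:?}, deadline {:?}", budget, deadline);
}
```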

What Happens When a Node Takes Too Long?

When a node exceeds its deadline, the scheduler reacts according to the miss policy you set with .on_miss():

| Policy | What happens | Use when |
|--------|--------------|----------|
| Miss::Warn | Log a warning, continue normally | Non-critical nodes (logging, display) |
| Miss::Skip | Skip this node's next tick to recover | High-frequency nodes that can afford to skip one cycle |
| Miss::SafeMode | Call enter_safe_state() on the node | Safety-critical nodes (motors slow to safe speed) |
| Miss::Stop | Stop the entire scheduler | Last resort — whole system must halt |

// simplified
scheduler.add(MotorController::new()?)
    .order(1)
    .rate(1000_u64.hz())
    .budget(800_u64.us())
    .on_miss(Miss::SafeMode)  // If tick takes too long, enter safe state
    .build()?;

When Miss::SafeMode triggers, the scheduler calls enter_safe_state() on your node — a method you implement to bring the node to a known-safe condition:

// simplified
impl Node for MotorController {
    fn name(&self) -> &str { "Motor" }

    fn tick(&mut self) {
        if let Some(cmd) = self.commands.recv() {
            self.motor.set_velocity(cmd.linear);
        }
    }

    // Called by scheduler when a deadline miss triggers SafeMode
    fn enter_safe_state(&mut self) {
        // SAFETY: command zero velocity to stop the motor, a known-safe
        // state that is_safe_state() below can verify
        self.motor.set_velocity(0.0);
    }

    fn is_safe_state(&self) -> bool {
        // Tell the scheduler whether we've reached safe state
        self.motor.velocity().abs() < 0.01
    }
}

If you don't set .on_miss(), the default is Miss::Warn — the scheduler logs a warning but takes no action. For nodes that control physical hardware, always set an explicit miss policy.

Execution Classes (How Nodes Run)

Not all nodes have the same workload. A motor controller needs microsecond-precise timing. A path planner needs heavy CPU computation. A cloud uploader needs network I/O. Running them all the same way wastes resources and creates bottlenecks.

You don't need to memorize this section to get started. Just use .order() and optionally .rate() — the scheduler picks the right execution strategy automatically. Come back here when you have nodes with very different workload patterns.

The scheduler assigns each node an execution class based on how you configure it. You describe what you need, the scheduler figures out how to run it:

| What you configure | What the scheduler does | Example use case |
|--------------------|-------------------------|------------------|
| Nothing special | Runs sequentially in main loop | Logging, telemetry |
| .rate(1000.hz()) | Gives a dedicated real-time thread | Motor control, sensor fusion |
| .compute() | Offloads to a CPU thread pool | Path planning, SLAM |
| .on("topic") | Only wakes when that topic has new data | Emergency stop handler |
| .async_io() | Runs on an async (Tokio) executor | Cloud upload, HTTP, database |

// simplified
// Just order — runs in the main loop (simplest)
scheduler.add(logger).order(2).build()?;

// Add rate — scheduler auto-creates a dedicated RT thread
scheduler.add(motor).order(1).rate(1000_u64.hz()).build()?;

// Heavy CPU work — scheduler sends it to the thread pool
scheduler.add(planner).order(1).compute().build()?;

// Only run when emergency data arrives
scheduler.add(estop).order(0).on("emergency.stop").build()?;

The key insight: .rate() is the trigger for real-time behavior. When you say "this node needs to run at 1000 Hz," the scheduler knows it needs a dedicated thread with timing guarantees — you don't have to request that explicitly. For deeper coverage of all five classes, see Execution Classes.
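One way to picture the inference is a priority match over the configured options. This sketch is purely illustrative: the enum, field names, and precedence order are my assumptions, not HORUS's actual types or rules:

```rust
// Purely illustrative mapping of configuration intent to an
// execution class; not HORUS's real types.
#[derive(Debug, PartialEq)]
enum ExecClass {
    MainLoop,    // nothing special configured
    RealTime,    // .rate(...) was set
    Compute,     // .compute()
    EventDriven, // .on("topic")
    AsyncIo,     // .async_io()
}

#[derive(Default)]
struct NodeConfig {
    rate_hz: Option<u64>,
    compute: bool,
    topic_trigger: Option<&'static str>,
    async_io: bool,
}

fn infer_class(cfg: &NodeConfig) -> ExecClass {
    // Precedence here is an assumption for the sketch
    if cfg.async_io {
        ExecClass::AsyncIo
    } else if cfg.compute {
        ExecClass::Compute
    } else if cfg.topic_trigger.is_some() {
        ExecClass::EventDriven
    } else if cfg.rate_hz.is_some() {
        ExecClass::RealTime
    } else {
        ExecClass::MainLoop
    }
}

fn main() {
    let motor = NodeConfig { rate_hz: Some(1000), ..Default::default() };
    let logger = NodeConfig::default();
    println!("{:?} {:?}", infer_class(&motor), infer_class(&logger));
}
```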

Common Pitfalls

Setting .rate() auto-enables real-time. If you set .rate(100_u64.hz()), the scheduler gives the node a dedicated thread, an auto-derived budget (80% of the period), and an auto-derived deadline (95%). This is usually what you want, but if you just wanted a slower tick without RT overhead, set the tick rate on the scheduler instead: Scheduler::new().tick_rate(100_u64.hz()).

Shutdown order is reversed. Nodes shut down in reverse add-order. If your motor controller (added second) depends on sensor data (added first), the motor controller shuts down first — so it can stop motors while sensor data is still available. If you add them in the wrong order, the sensor disconnects while motors are still running.

tick() must return quickly. Never sleep(), loop forever, or do blocking I/O inside tick(). The scheduler calls tick() thousands of times per second — if it blocks, every other node in the system stalls. Use .async_io() for I/O or .compute() for heavy CPU work.

Topic names must use dots, not slashes. "sensors.camera" works on all platforms. "sensors/camera" fails on macOS (where shm_open doesn't support slashes). See Topic Naming.

A Complete Example

A temperature monitor with sensor, threshold checker, and logger — all coordinated by the scheduler:

// simplified
use horus::prelude::*;

struct Sensor {
    pub_temp: Topic<f32>,
    value: f32,
}

impl Sensor {
    fn new() -> Result<Self> {
        Ok(Self { pub_temp: Topic::new("sensor.temperature")?, value: 20.0 })
    }
}

impl Node for Sensor {
    fn name(&self) -> &str { "Sensor" }

    fn init(&mut self) -> Result<()> {
        println!("Sensor initialized");
        Ok(())
    }

    fn tick(&mut self) {
        self.value += 0.1;
        self.pub_temp.send(self.value);
    }

    fn shutdown(&mut self) -> Result<()> {
        println!("Sensor shut down");
        Ok(())
    }
}

struct Monitor {
    sub_temp: Topic<f32>,
    threshold: f32,
}

impl Monitor {
    fn new(threshold: f32) -> Result<Self> {
        Ok(Self { sub_temp: Topic::new("sensor.temperature")?, threshold })
    }
}

impl Node for Monitor {
    fn name(&self) -> &str { "Monitor" }

    fn tick(&mut self) {
        if let Some(temp) = self.sub_temp.recv() {
            if temp > self.threshold {
                println!("ALERT: {:.1}°C exceeds {:.1}°C threshold!", temp, self.threshold);
            } else {
                println!("OK: {:.1}°C", temp);
            }
        }
    }
}

fn main() -> Result<()> {
    let mut sched = Scheduler::new()
        .tick_rate(1_u64.hz());  // 1 Hz — slow enough to observe

    sched.add(Sensor::new()?)
        .order(0)     // Reads data first
        .build()?;

    sched.add(Monitor::new(25.0)?)
        .order(1)     // Processes data second
        .build()?;

    // Runs until Ctrl+C, then calls shutdown() on each node
    sched.run()
}

Design Decisions

Why a scheduler instead of writing your own loop? A while loop that calls each node is simple — until you need timing, ordering, monitoring, and shutdown. A bare loop doesn't enforce execution order, doesn't measure how long each node takes, doesn't recover from deadline misses, and doesn't guarantee motors stop when the program exits. Every robotics team eventually builds these features. The scheduler gives them to you out of the box and has been tested across thousands of configurations.

Why tick() instead of run()? A run() method gives each node full control — it can loop forever, block on I/O, or ignore shutdown signals. A tick() method gives the scheduler full control: it decides when to call each node, how long to allow, and when to force shutdown. This enables deterministic execution (same order every cycle), deadline monitoring (detect when a node takes too long), and coordinated shutdown (all nodes stop together, in the right order).

Why automatic execution class detection? Most developers don't think in terms of "execution classes." They think "this node needs to run at 1 kHz" or "this node does heavy computation." The scheduler infers the right class from .rate(), .compute(), .on(), and .async_io(), mapping developer intent to the right executor. If you set .rate(1000_u64.hz()), the scheduler knows you need a dedicated real-time thread — you don't have to explicitly request one.

Why reverse-order shutdown? Nodes are typically added in dependency order: sensors before controllers before loggers. Shutting down in reverse means controllers stop motors before sensors disconnect, and loggers record the shutdown events before they themselves stop. This prevents the dangerous situation where a sensor disconnects while a motor controller is still running (the controller would have no data and might hold the last velocity forever).

Trade-offs

| Gain | Cost |
|------|------|
| Deterministic ordering — nodes always run in the same sequence | Must manually specify .order() for each node |
| Automatic timing enforcement — budget/deadline/miss monitoring | Adds ~1 µs of overhead per tick per monitored node |
| Coordinated shutdown — all nodes stop cleanly on Ctrl+C | Nodes must implement shutdown() for hardware cleanup |
| Auto-detected execution classes — right executor for each workload | Less explicit control (use .compute() or .on() to override) |
| tick_rate + per-node rates — flexible frequency management | Nodes must finish tick() within their budget |

See Also