Execution Classes
A motor controller that misses its deadline by even a millisecond can cause a robot arm to overshoot and collide with a person. A path planner that takes 50 ms of CPU time is completely normal — but if it runs on the same thread as the motor controller, it blocks 50 ticks. A logging node that takes an extra 10 ms is harmless. A cloud uploader that blocks on a network request shouldn't hold up anything.
These are fundamentally different workloads. Running them all the same way — in a single sequential loop — forces every node to compromise. The fast ones wait for the slow ones. The critical ones share a thread with the optional ones. A single slow node can cascade timing failures across the entire system.
HORUS solves this with execution classes: six different executors, each optimized for a specific workload type. The scheduler automatically selects the right class based on how you configure the node — you describe what your node needs, and the scheduler figures out how to run it.
The Six Classes
BestEffort (Default)
Nodes tick sequentially in the main loop, ordered by .order(). This is the simplest and lowest-overhead class.
// simplified
sched.add(display_node)
.order(100)
.build()?; // No .rate(), .compute(), .on(), or .async_io() → BestEffort
How it works: The main scheduler thread calls tick() on each BestEffort node in sequence, once per scheduler cycle. No threads are spawned. No synchronization is needed.
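Conceptually, the dispatch is just a loop over the registered BestEffort nodes. The following is a simplified sketch of that idea only; the Node trait and the names here are hypothetical, not the HORUS API.
// illustrative sketch; Node, tick, and run_cycle are hypothetical names, not the HORUS API
trait Node {
    fn tick(&mut self);
}
fn run_cycle(best_effort_nodes: &mut [Box<dyn Node>]) {
    // nodes were sorted by .order() at build time; lower order ticks first
    for node in best_effort_nodes.iter_mut() {
        node.tick(); // one call per node, once per scheduler cycle
    }
}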
Use for: Logging, telemetry, display, diagnostic reporting — anything without timing requirements or heavy computation.
Characteristics:
- Runs at the scheduler's global tick_rate()
- Deterministic ordering (same sequence every cycle)
- Lowest overhead — no thread spawn, no atomics, no synchronization
Rt (Real-Time)
Each RT node gets a dedicated thread with optional OS-level priority scheduling. The scheduler enforces timing budgets and deadlines, and takes action when a node runs too long.
// simplified
// Auto-derived: budget = 80% of period, deadline = 95% of period
sched.add(motor_ctrl)
.order(0)
.rate(1000_u64.hz()) // 1 kHz → budget=800µs, deadline=950µs
.on_miss(Miss::SafeMode) // Enter safe state on deadline miss
.build()?;
// Explicit budget and deadline
sched.add(safety_monitor)
.order(0)
.budget(100_u64.us())
.deadline(200_u64.us())
.on_miss(Miss::Stop)
.build()?;
There is no .rt() method. RT is always auto-detected from timing constraints. Setting .rate(), .budget(), or .deadline() — any of them — automatically assigns the Rt class. This maps developer intent ("this node needs to run at 1 kHz") directly to the right executor.
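As a sketch of the derivation shown in the first example above (budget at 80% of the period, deadline at 95%); the helper below is illustrative, not a HORUS function.
// illustrative helper, not part of HORUS; shows the stated 80% / 95% rule
use std::time::Duration;
fn derived_budget_deadline(rate_hz: u64) -> (Duration, Duration) {
    let period = Duration::from_secs_f64(1.0 / rate_hz as f64);
    (period.mul_f64(0.80), period.mul_f64(0.95)) // 1000 Hz => 800 µs budget, 950 µs deadline
}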
How it works: Each RT node runs on its own dedicated thread. If .require_rt() or .prefer_rt() is set on the scheduler and the OS supports it, the thread gets SCHED_FIFO real-time priority. The scheduler measures every tick() call and applies the Miss policy when budget or deadline is exceeded.
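The measurement step can be pictured roughly as below. This is a simplified sketch, not the executor's actual code; the Miss handling is reduced to a comment.
// simplified sketch of per-tick enforcement; not the HORUS internals
use std::time::{Duration, Instant};
fn measured_tick(budget: Duration, deadline: Duration, mut tick: impl FnMut()) {
    let start = Instant::now();
    tick(); // the node's tick() runs on its dedicated RT thread
    let elapsed = start.elapsed();
    if elapsed > budget || elapsed > deadline {
        // over budget or past the deadline: apply the configured Miss policy
        // (Skip, Stop, SafeMode, ...)
    }
}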
Additional RT configuration:
// simplified
sched.add(critical_node)
.order(0)
.rate(1000_u64.hz())
.budget(300_u64.us())
.deadline(900_u64.us())
.on_miss(Miss::Skip)
.priority(90) // OS-level thread priority (1-99, higher = more urgent)
.core(0) // Pin to CPU core 0
.watchdog(500_u64.ms()) // Per-node freeze detection
.build()?;
Use for: Motor control, safety monitoring, sensor fusion — anything where missing a deadline has physical consequences.
Compute
For CPU-heavy work that benefits from parallelism. Multiple Compute nodes run simultaneously on a shared thread pool.
// simplified
sched.add(path_planner)
.order(5)
.compute()
.rate(10_u64.hz()) // Optional: limit how often the node ticks
.build()?;
How it works: Compute nodes are dispatched to a thread pool (similar to rayon). Multiple compute nodes can run in parallel on different CPU cores. They don't block the main tick loop or RT threads.
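A rough picture of that dispatch, using scoped threads as a stand-in for the pool; a sketch of the idea only, not the executor itself.
// illustrative sketch; the real executor dispatches to a rayon-like pool instead of spawning per cycle
fn dispatch_compute(ticks: Vec<Box<dyn FnOnce() + Send>>) {
    std::thread::scope(|s| {
        for tick in ticks {
            s.spawn(move || tick()); // compute ticks can overlap on different cores
        }
    }); // the scope waits for every tick to finish before returning
}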
Use for: Path planning, SLAM, point cloud processing, ML inference on CPU, image processing — any CPU-bound work that takes more than ~1 ms per tick.
.rate() on a Compute node does not make it RT — it only limits how often the node ticks. A .rate(10_u64.hz()).compute() node ticks at most 10 times per second but has no budget or deadline enforcement.
Event
Nodes that sleep until a specific topic receives new data. Zero CPU usage when idle.
// simplified
sched.add(estop_handler)
.order(0)
.on("emergency.stop")
.build()?;
How it works: The node's thread sleeps. When any publisher calls send() on the named topic, the Event node wakes and tick() is called. If multiple messages arrive between wakes, the node ticks once — call recv() in a loop inside tick() to drain all pending messages.
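A minimal sketch of that drain pattern, assuming a non-blocking receive that returns Option; the message type and the closure shape are illustrative, not the HORUS API.
// hypothetical sketch; StopCmd and the recv closure are illustrative, not the HORUS API
struct StopCmd;
fn handle_estop_tick(mut recv: impl FnMut() -> Option<StopCmd>) {
    // one wake may cover several published messages, so drain until empty
    while let Some(cmd) = recv() {
        apply_stop(cmd);
    }
}
fn apply_stop(_cmd: StopCmd) {
    // cut motor power, latch the e-stop state, ...
}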
Use for: Emergency stop handlers, command receivers, sparse event processors — anything where the node should be completely idle until something specific happens.
Characteristics:
- Zero CPU when no messages arrive
- Wake latency: ~microseconds from send() to tick()
- .order() still applies: if two event nodes wake simultaneously, lower order runs first
- The topic name in .on("name") must match a Topic::new("name") in another node
AsyncIo
For network or file I/O operations that would block a real-time thread. Runs on a tokio runtime.
// simplified
sched.add(cloud_uploader)
.order(50)
.async_io()
.rate(1_u64.hz()) // Upload once per second
.build()?;
How it works: The node's tick() runs via tokio::task::spawn_blocking on a tokio-managed thread pool. The node can safely block on network requests, file I/O, or database queries without affecting any other node.
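The hand-off can be sketched as follows; a simplified picture of the dispatch rather than the HORUS implementation, and it assumes a running tokio runtime.
// simplified sketch; not the HORUS internals (requires the tokio crate)
async fn dispatch_async_io(tick: impl FnOnce() + Send + 'static) {
    // blocking work moves to tokio's blocking pool, so no tick loop or RT thread stalls
    tokio::task::spawn_blocking(tick)
        .await
        .expect("async-io tick panicked");
}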
Use for: HTTP/REST API calls, database writes, file logging, cloud telemetry.
Gpu
For nodes that launch CUDA kernels. The scheduler manages a dedicated GPU thread with one CUDA stream per node. Kernels launched in tick() execute asynchronously on the GPU; the executor synchronizes after all GPU nodes have launched.
// simplified
sched.add(preprocess_node)
.order(2)
.gpu()
.rate(30_u64.hz())
.build()?;
How it works: The GpuExecutor runs on a dedicated thread with a CUDA context. Each GPU node gets its own CUDA stream. Per tick cycle: (1) all ready nodes call tick() which launches GPU kernels non-blocking, (2) all streams synchronize, (3) GPU-side timing is recorded via CUDA events. Inside tick(), access the scheduler-managed stream via horus::gpu_stream().
Use for: Image preprocessing (resize, normalize, color convert), neural network inference, point cloud filtering, any CUDA kernel work.
.gpu() takes precedence over .rate() for execution class: a GPU node with .rate(30_u64.hz()).gpu() runs on the GPU executor at 30 Hz, not the RT executor.
Graceful degradation: If CUDA is not available, the GpuExecutor logs a warning and GPU nodes fall back to BestEffort execution on the main thread.
How Classes Are Selected
The scheduler selects the execution class based on which builder methods you call:
| Configuration | Resulting Class |
|---|---|
| (nothing special) | BestEffort |
| .rate() | Rt (auto-derived budget/deadline) |
| .budget() | Rt |
| .deadline() | Rt |
| .rate() + .budget() + .deadline() | Rt (explicit overrides) |
| .compute() | Compute |
| .compute().rate() | Compute (rate-limited, not RT) |
| .on("topic") | Event |
| .async_io() | AsyncIo |
| .async_io().rate() | AsyncIo (rate-limited, not RT) |
| .gpu() | Gpu |
| .gpu().rate() | Gpu (rate-limited, CUDA stream managed) |
Key rule: .rate() only auto-enables RT when no explicit execution class (.compute(), .on(), .async_io(), .gpu()) is set. When combined with an explicit class, .rate() just limits tick frequency.
Deferred Finalization
Class selection happens at .build() time, not when individual methods are called. This means:
- .rate(100_u64.hz()).compute() → Compute (.compute() overrides auto-RT)
- .compute().rate(100_u64.hz()) → Compute (same result regardless of order)
- .compute().async_io() → AsyncIo (last explicit class wins, warning logged)
Decision Guide
| Your node does... | Use | Builder |
|---|---|---|
| Motor control at 500+ Hz | Rt | .rate(500_u64.hz()) |
| Safety monitoring with deadlines | Rt | .rate().budget().deadline().on_miss() |
| Path planning (takes 10-50 ms) | Compute | .compute() |
| ML inference on CPU | Compute | .compute() |
| React to emergency stop | Event | .on("emergency.stop") |
| Process commands as they arrive | Event | .on("command") |
| Upload telemetry to cloud | AsyncIo | .async_io() |
| Write logs to disk | AsyncIo | .async_io() |
| Display dashboard updates | BestEffort | (default) |
| Simple sensor reading | BestEffort or Rt | .rate() if timing matters |
Validation and Common Mistakes
.build() validates your configuration and catches mistakes:
What's Rejected
| Configuration | Error |
|---|---|
| .compute().budget() | Budget only meaningful for RT nodes |
| .on("topic").deadline() | Deadline only meaningful for RT nodes |
| .async_io().budget() | Budget only meaningful for RT nodes |
| .budget(Duration::ZERO) | Budget must be > 0 |
| .on("") | Empty topic — node can never trigger |
What's Warned
| Configuration | Warning |
|---|---|
| .compute().async_io() | Last class wins (AsyncIo), first silently overridden |
| .compute().priority(99) | Priority ignored on non-RT nodes |
| .on_miss(Miss::Stop) without deadline | No deadline to miss — policy has no effect |
Common Mistakes
1. Thinking .priority() works on Compute nodes
// simplified
// Priority is silently ignored — only RT nodes get SCHED_FIFO threads
sched.add(planner).compute().priority(99).build()?;
// Make it RT if you need OS-level priority
sched.add(planner).rate(100_u64.hz()).priority(99).build()?;
2. Setting .on_miss() without a deadline
// simplified
// No deadline means Miss::Stop can never trigger
sched.add(ctrl).compute().on_miss(Miss::Stop).build()?;
// Add .rate() so a deadline exists to miss
sched.add(ctrl).rate(100_u64.hz()).on_miss(Miss::Stop).build()?;
3. Chaining multiple execution classes
// simplified
// Only the LAST class applies — compute() is silently overridden
sched.add(node).compute().async_io().build()?; // → AsyncIo, NOT Compute
// Pick one
sched.add(node).async_io().build()?;
Complete Example: Mixed Execution Classes
// simplified
use horus::prelude::*;
fn main() -> Result<()> {
let mut sched = Scheduler::new()
.tick_rate(100_u64.hz())
.prefer_rt();
// Rt — 1 kHz motor control with strict timing
sched.add(MotorController::new()?)
.order(0)
.rate(1000_u64.hz())
.budget(300_u64.us())
.on_miss(Miss::Skip)
.build()?;
// Event — only runs when emergency_stop topic updates
sched.add(EmergencyHandler::new()?)
.order(0)
.on("emergency.stop")
.build()?;
// Rt — 100 Hz sensor reading
sched.add(ImuReader::new()?)
.order(1)
.rate(100_u64.hz())
.build()?;
// Compute — path planning in parallel
sched.add(PathPlanner::new()?)
.order(5)
.compute()
.rate(10_u64.hz())
.build()?;
// AsyncIo — telemetry upload every 5 seconds
sched.add(TelemetryUploader::new()?)
.order(50)
.async_io()
.rate(0.2_f64.hz())
.build()?;
// BestEffort — display node in main loop
sched.add(Dashboard::new()?)
.order(100)
.build()?;
sched.run()
}
Design Decisions
Why six classes instead of just RT and non-RT? A thread-per-node model (RT) is wasteful for logging nodes — dedicating OS threads and SCHED_FIFO slots to telemetry is overkill. A single-threaded model (BestEffort) can't handle 50 ms path planning without stalling the control loop. A two-class split (RT vs non-RT) doesn't distinguish between CPU-bound work (Compute), event-driven reactions (Event), I/O-bound operations (AsyncIo), and GPU kernel work (Gpu) — each of which has a fundamentally different optimal executor. The six-class model matches the six common robotics workload patterns.
Why auto-detection instead of explicit class selection?
Most developers don't think in terms of "execution classes" — they think "this node needs to run at 1 kHz" or "this node does heavy computation." Auto-detection from .rate(), .compute(), .on(), .async_io(), and .gpu() maps intent to the right executor without requiring framework knowledge. If you set .rate(1000_u64.hz()), the scheduler knows you need a dedicated real-time thread. You don't have to explicitly request one.
Why does .rate() + .compute() not become RT?
Because rate-limiting and real-time are different things. A path planner at 10 Hz means "tick at most 10 times per second" — not "this node has a 100 ms deadline that must be enforced." Mixing the two concepts would force compute nodes to pay RT overhead (dedicated threads, timing measurement) for no benefit. The rule is clear: .rate() only triggers RT when no explicit class is set.
Trade-offs
| Gain | Cost |
|---|---|
| Right executor per workload — each node runs optimally | Must understand which class fits your node |
| Auto-detection — .rate() infers RT without explicit configuration | Less explicit — must know the .rate() + .compute() interaction |
| RT isolation — a slow Compute node can't block an RT motor controller | RT nodes consume one OS thread each |
| Event nodes — zero CPU when idle | Must match .on("topic") name exactly to a Topic::new("topic") |
| AsyncIo — I/O never blocks the tick loop | tokio runtime overhead for simple file writes |
See Also
- Builder Composition Guide — How builder methods interact, override, and compose
- Scheduler — Full Reference — How the scheduler manages all execution classes
- Scheduler API — Complete builder method reference
- Scheduler Configuration — Advanced tuning and RT setup
- Choosing Configuration — Progressive complexity guide
- Real-Time Control Tutorial — Hands-on RT tutorial