Advanced Topics
These guides cover the production-readiness features of HORUS: real-time scheduling, safety monitoring, failure recovery, and data recording. Each one addresses a specific problem that comes up when moving from prototype to deployment.
These topics assume familiarity with Core Concepts — especially Nodes, Scheduler, and Execution Classes.
Quick Reference
| I need to... | Read this |
|---|---|
| Tune tick rates, thread pools, and execution classes | Scheduler Configuration |
| Get reproducible execution for simulation or testing | Deterministic Mode |
| Configure Linux for hard real-time scheduling | RT Setup |
| Monitor node health and enforce timing deadlines | Safety Monitor |
| Handle node failures without crashing the system | Fault Tolerance |
| Record events for crash forensics | BlackBox Recorder |
| Record and replay full sessions for debugging | Record & Replay |
| Understand communication backends and latency | Network Backends |
Scheduling & Timing
- Scheduler Configuration — Tick rates, execution classes, per-node timing, and priority ordering
- Deterministic Mode — Reproducible execution with SimClock, dependency ordering, and seeded RNG
- RT Setup — Linux real-time kernel, SCHED_FIFO, CPU isolation, and PREEMPT_RT
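
To make the idea behind Deterministic Mode concrete, here is a minimal, standalone sketch of why a simulated clock plus a seeded RNG gives reproducible runs. It does not use the HORUS API: `SimClock` here is a hypothetical stand-in that only shares its name with the HORUS feature, and `SeededRng` and `run_trace` are invented for illustration. The generator is SplitMix64, chosen only because it is short; the RNG HORUS actually uses is not stated on this page.

```rust
/// Hypothetical simulated clock: time advances only when `tick` is called,
/// so it never depends on wall-clock time or scheduling jitter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SimClock {
    now_ns: u64,
    step_ns: u64,
}

impl SimClock {
    pub fn new(step_ns: u64) -> Self {
        Self { now_ns: 0, step_ns }
    }

    /// Advance by exactly one fixed step and return the new time.
    pub fn tick(&mut self) -> u64 {
        self.now_ns += self.step_ns;
        self.now_ns
    }
}

/// Hypothetical seeded generator (SplitMix64). The same seed always
/// produces the same sequence of values.
pub struct SeededRng {
    state: u64,
}

impl SeededRng {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Run `ticks` simulated steps and record (time, random sample) pairs.
/// The trace is a pure function of the inputs, which is what makes it replayable.
pub fn run_trace(seed: u64, step_ns: u64, ticks: usize) -> Vec<(u64, u64)> {
    let mut clock = SimClock::new(step_ns);
    let mut rng = SeededRng::new(seed);
    (0..ticks).map(|_| (clock.tick(), rng.next_u64())).collect()
}
```

Calling `run_trace` twice with the same seed and step yields identical traces, while changing the seed changes the samples but not the timestamps. That property is what simulation and regression tests rely on.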
Safety & Reliability
- Safety Monitor — Watchdog timers, budget enforcement, deadline miss policies, and graduated degradation
- Fault Tolerance — Per-node failure policies (Fatal, Restart, Skip, Ignore) for preventing cascading failures
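
As a rough illustration of per-node failure policies, the sketch below maps each of the four policy names listed above to a supervisor action. Only the names Fatal, Restart, Skip, and Ignore come from this page. The `FailurePolicy` and `Action` types, the `resolve` and `system_survives` functions, and the exact meaning given to each policy are assumptions made for the example, not the HORUS API; the Fault Tolerance guide has the real semantics.

```rust
/// Policy names as listed on this page; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailurePolicy {
    Fatal,
    Restart,
    Skip,
    Ignore,
}

/// Hypothetical actions a supervisor could take after a node's tick.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Stop the whole system (assumed meaning of Fatal).
    StopSystem,
    /// Re-initialize the failed node (assumed meaning of Restart).
    RestartNode,
    /// Leave the node out of scheduling (assumed meaning of Skip).
    SkipNode,
    /// Carry on as if nothing happened (success, or assumed meaning of Ignore).
    Continue,
}

/// Decide what to do with one node's tick result under its policy.
pub fn resolve(policy: FailurePolicy, tick_result: Result<(), String>) -> Action {
    match (tick_result, policy) {
        (Ok(()), _) => Action::Continue,
        (Err(_), FailurePolicy::Fatal) => Action::StopSystem,
        (Err(_), FailurePolicy::Restart) => Action::RestartNode,
        (Err(_), FailurePolicy::Skip) => Action::SkipNode,
        (Err(_), FailurePolicy::Ignore) => Action::Continue,
    }
}

/// True when no node outcome escalates to a system-wide stop.
pub fn system_survives(outcomes: &[(FailurePolicy, Result<(), String>)]) -> bool {
    outcomes
        .iter()
        .all(|(policy, result)| resolve(*policy, result.clone()) != Action::StopSystem)
}
```

Under these assumptions, a failing camera node marked `Skip` does not take down a healthy motor controller marked `Fatal`. Only a failure in a `Fatal` node stops the system, which is how per-node policies keep one fault from cascading.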
Data Recording
- BlackBox Recorder — Flight recorder for post-crash analysis with ring buffer and CLI tools
- Record & Replay — Session recording for debugging, regression testing, and mixed replay
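
The ring buffer behind a flight recorder can be sketched in a few lines. This is a generic illustration built on the standard library's `VecDeque`, not the HORUS BlackBox implementation: the `BlackBox` struct shown here and its `new`, `record`, and `dump` methods are hypothetical, and events are stored as plain strings for simplicity.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity event recorder. Once full, each new event
/// evicts the oldest one, so memory use stays bounded during long runs.
pub struct BlackBox {
    capacity: usize,
    events: VecDeque<String>,
}

impl BlackBox {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            events: VecDeque::with_capacity(capacity),
        }
    }

    /// Append an event, dropping the oldest entry if the buffer is full.
    pub fn record(&mut self, event: impl Into<String>) {
        if self.events.len() == self.capacity {
            self.events.pop_front();
        }
        self.events.push_back(event.into());
    }

    /// Return the retained events, oldest first, for post-crash inspection.
    pub fn dump(&self) -> Vec<String> {
        self.events.iter().cloned().collect()
    }
}
```

With a capacity of 3, recording five events leaves only the last three in the buffer. The trade-off is deliberate: a flight recorder keeps the most recent history leading up to a crash, whereas full session recording keeps everything at a higher storage cost.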
Infrastructure
- Network Backends — Automatic backend selection, shared memory IPC, and planned network transport
See Also
- Core Concepts — Prerequisite knowledge for all advanced topics
- Recipes — Practical code patterns that use these features
- Rust API Reference — Exact method signatures and parameters
- Performance — Optimization and benchmarks