Advanced Topics

Production-readiness features for HORUS: real-time scheduling, safety monitoring, failure recovery, and data recording. These guides solve specific problems you encounter when moving from prototype to deployment.

These topics assume familiarity with Core Concepts — especially Nodes, Scheduler, and Execution Classes.

Quick Reference

I need to... | Read this
Tune tick rates, thread pools, and execution classes | Scheduler Configuration
Get reproducible execution for simulation or testing | Deterministic Mode
Configure Linux for hard real-time scheduling | RT Setup
Monitor node health and enforce timing deadlines | Safety Monitor
Handle node failures without crashing the system | Fault Tolerance
Record events for crash forensics | BlackBox Recorder
Record and replay full sessions for debugging | Record & Replay
Understand communication backends and latency | Network Backends

Scheduling & Timing

  • Scheduler Configuration — Tick rates, execution classes, per-node timing, and priority ordering
  • Deterministic Mode — Reproducible execution with SimClock, dependency ordering, and seeded RNG
  • RT Setup — Linux real-time kernel, SCHED_FIFO, CPU isolation, and PREEMPT_RT

Safety & Reliability

  • Safety Monitor — Watchdog timers, budget enforcement, deadline miss policies, and graduated degradation
  • Fault Tolerance — Per-node failure policies (Fatal, Restart, Skip, Ignore) for preventing cascading failures

Data Recording

  • BlackBox Recorder — Flight recorder for post-crash analysis with ring buffer and CLI tools
  • Record & Replay — Session recording for debugging, regression testing, and mixed replay
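
To make the ring-buffer idea behind a flight recorder concrete, here is a minimal generic sketch in Rust. It is not the HORUS BlackBox API: the `RingLog` type and its `record` and `snapshot` methods are hypothetical names chosen for illustration, and the real recorder may store and expose events quite differently. The sketch only shows the general technique of keeping the most recent N events in bounded memory, so that the events leading up to a crash are still available afterwards.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity event log (illustration only, not a HORUS type).
/// Once full, each new event evicts the oldest one.
pub struct RingLog<T> {
    capacity: usize,
    events: VecDeque<T>,
}

impl<T> RingLog<T> {
    /// Creates an empty log that retains at most `capacity` events.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            events: VecDeque::with_capacity(capacity),
        }
    }

    /// Appends an event, dropping the oldest one if the log is already full.
    pub fn record(&mut self, event: T) {
        if self.events.len() == self.capacity {
            self.events.pop_front();
        }
        self.events.push_back(event);
    }

    /// Returns the retained events, oldest first.
    pub fn snapshot(&self) -> Vec<&T> {
        self.events.iter().collect()
    }
}
```

With a capacity of 3, recording the values 1 through 5 leaves only 3, 4, and 5 in the snapshot. The trade-off is the usual one for flight recorders: memory use stays constant no matter how long the system runs, at the cost of losing older history.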

Infrastructure

  • Network Backends — Automatic backend selection, shared memory IPC, and planned network transport

See Also