Multi-Process Architecture

HORUS topics work transparently across process boundaries. Two nodes in separate processes communicate the same way as two nodes in the same process — through shared memory. No broker, no serialization layer, no configuration.

To orchestrate multiple processes with session discovery, control routing, and e-stop propagation, see Launch System.

// simplified
// Process 1: sensor.rs
let topic: Topic<Imu> = Topic::new("imu")?;
topic.send(imu_reading);

// Process 2: controller.rs
let topic: Topic<Imu> = Topic::new("imu")?;  // same name = same topic
if let Some(reading) = topic.recv() {
    // Got it — zero-config, sub-microsecond
}

How It Works

When you call Topic::new("imu"), HORUS creates (or opens) a shared memory region. Any process on the same machine that calls Topic::new("imu") with the same type connects to the same underlying ring buffer. The shared memory backend is managed by horus_sys — you never configure paths manually.
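Conceptually, the name-to-region mapping is a pure function from topic name to a stable path, so independent processes converge on the same region without any coordination. A minimal sketch of that idea (the path scheme and hashing here are illustrative assumptions, not the real horus_sys layout):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative only: derive a stable shared-memory path from a topic name.
/// Because the mapping is deterministic, any process that computes it for the
/// same name arrives at the same region, so no registration step is needed.
/// (A real scheme would use a hash that is stable across builds.)
fn shm_path_for(topic: &str) -> String {
    let mut h = DefaultHasher::new();
    topic.hash(&mut h);
    format!("/dev/shm/horus_topic_{:016x}", h.finish())
}

fn main() {
    // Same name, same region; different names, different regions.
    assert_eq!(shm_path_for("imu"), shm_path_for("imu"));
    assert_ne!(shm_path_for("imu"), shm_path_for("odom"));
}
```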

HORUS auto-detects whether a topic is same-process or cross-process and picks the fastest path:

| Scenario | Latency | How It Works |
|---|---|---|
| Same thread | ~3ns | Direct pointer handoff |
| Same process, 1:1 | ~18ns | Lock-free single-producer/single-consumer ring buffer |
| Same process, 1:N | ~24ns | Broadcast to multiple in-process subscribers |
| Same process, N:1 | ~26ns | Multiple in-process publishers, one subscriber |
| Same process, N:N | ~36ns | Full many-to-many in-process |
| Cross-process, POD type | ~50ns | Zero-copy shared memory (no serialization) |
| Cross-process, N:1 | ~65ns | Shared memory, multiple publishers |
| Cross-process, 1:N | ~70ns | Shared memory, multiple subscribers |
| Cross-process, 1:1 | ~85ns | Shared memory, serialized type |
| Cross-process, N:N | ~91ns | Shared memory, contention-free fan-out |

Cross-process adds roughly 30-70ns over the equivalent in-process topology, still well under a microsecond. You don't configure any of this: the backend is selected automatically from the topology and upgrades transparently as participants join or leave.
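The selection itself can be thought of as a pure function of the observed topology, re-evaluated whenever a participant joins or leaves. A hypothetical sketch (the enum names and decision function are illustrative, not HORUS internals):

```rust
/// Illustrative transport tiers, mirroring the table above.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Backend {
    SpscRing,      // same process, 1 publisher, 1 subscriber
    InProcFanout,  // same process, multiple participants
    ShmZeroCopy,   // cross-process, POD message type
    ShmSerialized, // cross-process, non-POD message type
}

/// Hypothetical dispatch: re-run on every join/leave, which is what lets
/// the backend "upgrade" transparently as the topology changes.
fn select_backend(same_process: bool, pubs: usize, subs: usize, is_pod: bool) -> Backend {
    match (same_process, pubs, subs, is_pod) {
        (true, 1, 1, _) => Backend::SpscRing,
        (true, _, _, _) => Backend::InProcFanout,
        (false, _, _, true) => Backend::ShmZeroCopy,
        (false, _, _, false) => Backend::ShmSerialized,
    }
}

fn main() {
    // A 1:1 in-process topic starts on the SPSC ring...
    assert_eq!(select_backend(true, 1, 1, true), Backend::SpscRing);
    // ...and migrates to fan-out when a second subscriber joins.
    assert_eq!(select_backend(true, 1, 2, true), Backend::InProcFanout);
    assert_eq!(select_backend(false, 1, 1, true), Backend::ShmZeroCopy);
}
```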


Running Multiple Processes

Option 1: horus run with Multiple Files

# Builds and runs both files as separate processes
horus run sensor.rs controller.rs

# Mixed languages work too
horus run sensor.py controller.rs

# With release optimizations
horus run -r sensor.rs controller.rs

horus run compiles each file, then launches all processes and manages their lifecycle (SIGTERM on Ctrl+C, etc.).

Option 2: Separate Terminals

Run each node in its own terminal:

# Terminal 1
horus run sensor.rs

# Terminal 2
horus run controller.rs

Topics auto-discover via shared memory. No coordination needed.

Option 3: horus launch (YAML)

For production, declare your multi-process layout in a launch file:

# launch.yaml
nodes:
  - name: sensor
    cmd: horus run sensor.rs

  - name: controller
    cmd: horus run controller.rs

  - name: monitor
    cmd: horus run monitor.py

horus launch launch.yaml

Example: Two-Process Sensor Pipeline

Process 1 (sensor.rs):

// simplified
use horus::prelude::*;

message! {
    WheelEncoder {
        left_ticks: i64,
        right_ticks: i64,
        timestamp_ns: u64,
    }
}

struct EncoderNode {
    publisher: Topic<WheelEncoder>,
    ticks: i64,
}

impl EncoderNode {
    fn new() -> Result<Self> {
        Ok(Self {
            publisher: Topic::new("wheel.encoders")?,
            ticks: 0,
        })
    }
}

impl Node for EncoderNode {
    fn name(&self) -> &str { "Encoder" }

    fn tick(&mut self) {
        self.ticks += 10;
        self.publisher.send(WheelEncoder {
            left_ticks: self.ticks,
            right_ticks: self.ticks + 2,
            timestamp_ns: horus::now_ns(),
        });
    }
}

fn main() -> Result<()> {
    let mut sched = Scheduler::new().tick_rate(100_u64.hz());
    sched.add(EncoderNode::new()?).order(0).build()?;
    sched.run()?;
    Ok(())
}

Process 2 (odometry.rs):

// simplified
use horus::prelude::*;

message! {
    WheelEncoder {
        left_ticks: i64,
        right_ticks: i64,
        timestamp_ns: u64,
    }
}

struct OdometryNode {
    encoder_sub: Topic<WheelEncoder>,
    odom_pub: Topic<Odometry>,
    last_left: i64,
    last_right: i64,
}

impl OdometryNode {
    fn new() -> Result<Self> {
        Ok(Self {
            encoder_sub: Topic::new("wheel.encoders")?,
            odom_pub: Topic::new("odom")?,
            last_left: 0,
            last_right: 0,
        })
    }
}

impl Node for OdometryNode {
    fn name(&self) -> &str { "Odometry" }

    fn tick(&mut self) {
        if let Some(enc) = self.encoder_sub.recv() {
            let dl = enc.left_ticks - self.last_left;
            let dr = enc.right_ticks - self.last_right;
            self.last_left = enc.left_ticks;
            self.last_right = enc.right_ticks;

            println!("[Odom] delta L={} R={}", dl, dr);
        }
    }
}

fn main() -> Result<()> {
    let mut sched = Scheduler::new().tick_rate(100_u64.hz());
    sched.add(OdometryNode::new()?).order(0).build()?;
    sched.run()?;
    Ok(())
}

Run them:

# Terminal 1
horus run sensor.rs

# Terminal 2
horus run odometry.rs

The WheelEncoder messages flow through shared memory at ~50ns latency, with zero configuration.


When to Use Multi-Process

| Factor | Single Process | Multi-Process |
|---|---|---|
| Latency | ~3-36ns (intra-process) | ~50-91ns (cross-process) |
| Determinism | Full control via scheduler ordering | Each process has its own scheduler |
| Isolation | A crash takes down everything | A crash is contained to one process |
| Languages | Single language per binary | Mix Rust + Python freely |
| Restart | Must restart everything | Restart one process independently |
| Debugging | Single debugger session | Attach debugger to one process |
| Deployment | One binary to deploy | Multiple binaries |
| Complexity | Simpler | More moving parts |

Use single-process when:

  • All nodes are the same language
  • You need deterministic ordering between nodes (e.g., sensor → controller → actuator)
  • Latency matters at the nanosecond level
  • Simpler deployment is preferred

Use multi-process when:

  • Mixing Rust and Python (e.g., Rust motor control + Python ML inference)
  • Process isolation is needed (safety-critical separation)
  • Independent restart required (update one node without stopping others)
  • Different update rates or lifecycle requirements

Introspection

HORUS CLI tools work across processes automatically:

# See all topics (from any process)
horus topic list

# Monitor a topic published by another process
horus topic echo wheel.encoders

# See all running nodes across processes
horus node list

# Check bandwidth across processes
horus topic bw wheel.encoders

Cleaning Up

Shared memory files persist after processes exit. Clean them with:

horus clean --shm    # Remove stale shared memory regions

In practice, you rarely need this — HORUS automatically cleans stale SHM on every horus CLI command and every Scheduler::new() call. The manual command is an escape hatch for debugging.


What Happens When a Process Crashes

When a process dies (even via SIGKILL or power loss):

  1. SHM files persist — the kernel closes the file descriptor and releases flock locks, but the mmap'd file stays on disk
  2. Other processes continue — subscribers see dropped_count() increase if the publisher was mid-write, but they don't crash
  3. Backend auto-migrates — when the crashed process restarts and reconnects, the topic detects the new participant and migrates the backend (e.g., from 1:1 to 1:N) within ~10μs
  4. Automatic cleanup — the next horus CLI command or Scheduler::new() call auto-cleans stale namespaces (<1ms). No manual intervention needed.

# Process 1 crashes
# Process 2 keeps running, reading stale data from the ring buffer
# Process 1 restarts
# Process 2 sees fresh data again — no reconfiguration needed
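Stale-region detection can be approximated with a liveness check on the creator's PID. HORUS itself relies on flock ownership (which the kernel releases on any kind of process death, per step 1 above), but a Linux-only sketch conveys the same idea:

```rust
use std::path::Path;

/// Illustrative Linux-only liveness check: a region stamped with its
/// creator's PID is stale once that process no longer exists.
/// (HORUS uses flock-based ownership instead, which the kernel releases
/// automatically even after SIGKILL.)
fn owner_alive(pid: u32) -> bool {
    Path::new(&format!("/proc/{}", pid)).exists()
}

fn main() {
    // Our own PID is alive; a PID above the kernel's pid_max cannot be.
    assert!(owner_alive(std::process::id()));
    assert!(!owner_alive(u32::MAX));
}
```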

Type mismatches: If a restarted process changes its message type (e.g., from CmdVel to Twist), the join fails with an error. Both processes must use the same message type for the same topic name.
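One common way to implement such a guard is a type fingerprint stored in the region header and compared on join. A hypothetical sketch (the fingerprint scheme is an assumption, not the actual horus_sys check):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative fingerprint: hash the type's name and size. The creating
/// process stores this in the region header; a joining process compares
/// its own fingerprint against it and fails fast on mismatch.
fn type_fingerprint<T>() -> u64 {
    let mut h = DefaultHasher::new();
    std::any::type_name::<T>().hash(&mut h);
    std::mem::size_of::<T>().hash(&mut h);
    h.finish()
}

#[allow(dead_code)]
struct CmdVel { linear: f64, angular: f64 }
#[allow(dead_code)]
struct Twist { linear: [f64; 3], angular: [f64; 3] }

fn main() {
    // Same type always agrees; CmdVel vs Twist is rejected at join time.
    assert_eq!(type_fingerprint::<CmdVel>(), type_fingerprint::<CmdVel>());
    assert_ne!(type_fingerprint::<CmdVel>(), type_fingerprint::<Twist>());
}
```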


Mixed-Language Multi-Process

The most common multi-process pattern: Rust for control loops, Python for ML inference.

Rust sensor node (sensor.rs):

// simplified
use horus::prelude::*;

struct CameraNode {
    pub_img: Topic<Image>,
}

impl CameraNode {
    fn new() -> Result<Self> {
        Ok(Self { pub_img: Topic::new("camera.rgb")? })
    }
}

impl Node for CameraNode {
    fn name(&self) -> &str { "camera" }
    fn tick(&mut self) {
        let mut img = Image::new(640, 480, "rgb8");
        // ... capture from hardware ...
        self.pub_img.send(img);
    }
}

fn main() -> Result<()> {
    let mut sched = Scheduler::new().tick_rate(30_u64.hz());
    sched.add(CameraNode::new()?).order(0).build()?;
    sched.run()
}

Run both:

# Terminal 1 (Rust, 30 FPS camera)
horus run sensor.rs

# Terminal 2 (Python, ML inference)
horus run detector.py

The Image flows through shared memory pool-backed transport — the Python node gets a zero-copy view of the pixels the Rust node wrote. No serialization, no copying.


Debugging Multi-Process Systems

Identify which process owns what

horus topic list --verbose
# Shows publisher/subscriber PIDs per topic

horus node list
# Shows all running nodes across all processes with PID, rate, CPU, memory

Watch cross-process data flow

# Monitor messages from the Rust sensor in the Python process's terminal
horus topic echo camera.rgb

# Measure the actual publishing rate
horus topic hz camera.rgb

# Measure bandwidth
horus topic bw camera.rgb

Debug one process at a time

# Start the sensor normally
horus run sensor.rs

# Start the detector with verbose logging
RUST_LOG=debug horus run detector.py

Use the monitor for a system-wide view

horus monitor
# Web UI at http://localhost:3000 shows ALL nodes from ALL processes
# Topic graph view shows cross-process message flow

Common debugging workflow

  1. horus topic list — verify both processes see the same topics
  2. horus topic hz <topic> — verify the publisher is sending at expected rate
  3. horus topic echo <topic> — verify message content is correct
  4. horus node list — verify both nodes are Running (not Error or Crashed)
  5. horus bb --anomalies — check for deadline misses or errors

Common Errors

| Error | Cause | Fix |
|---|---|---|
| Topics not visible across processes | Different SHM namespaces | Set HORUS_NAMESPACE=shared in both terminals, or use horus launch |
| Type mismatch on topic join | Process A uses CmdVel, Process B uses a different type for the same name | Ensure both processes use the exact same message type |
| Stale data after crash | SHM files persist after process death | Usually auto-cleaned on the next horus run. Manual: horus clean --shm |
| Topic not found in CLI | CLI uses a different namespace than the running app | Run the CLI in the same terminal or set a matching HORUS_NAMESPACE |
| High dropped_count | Subscriber process is slower than the publisher | Increase subscriber rate, reduce publisher rate, or increase topic capacity |
| Permission denied on SHM | Different users running the processes | Run both as the same user, or check /dev/shm permissions |

Design Decisions

Why Auto-Discovery via Shared Memory Names

When you call Topic::new("imu") in two separate processes, both connect to the same shared memory region because the topic name deterministically maps to a shared memory path (managed by horus_sys). There is no registration step, no discovery protocol, and no configuration file listing topic endpoints. This works because shared memory is a kernel-level namespace — any process on the same machine that opens the same named region gets the same memory. Auto-discovery eliminates an entire class of misconfiguration bugs ("I forgot to register my topic") and means processes can start and stop in any order.

Why No Broker Process

Brokered middlewares like MQTT route every message through a central process, and even brokerless network stacks like DDS in ROS 2 add discovery, serialization, and a network hop between every publisher and subscriber. The extra hop adds latency, and a broker is a single point of failure: if it crashes, all communication stops. HORUS uses direct shared memory: publishers write to a ring buffer, subscribers read from it, and no intermediary process routes messages. This gives sub-microsecond latency and means there is no central process that can fail. The cost is that HORUS topics only work on a single machine (cross-machine communication requires an explicit bridge).
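The core data structure is small. Below is a self-contained sketch of a lock-free single-producer/single-consumer ring over u64 payloads; in real HORUS the slots live in a shared-memory region and hold typed messages, so this in-process version only illustrates the protocol, not the actual implementation:

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

const CAP: usize = 8;

/// Illustrative SPSC ring. The producer owns `head`, the consumer owns
/// `tail`; neither ever locks. In HORUS the equivalent structure sits in
/// shared memory, so producer and consumer can be different processes.
struct Ring {
    slots: [AtomicU64; CAP],
    head: AtomicUsize, // next write index (producer-owned)
    tail: AtomicUsize, // next read index (consumer-owned)
}

impl Ring {
    fn new() -> Self {
        Self {
            slots: std::array::from_fn(|_| AtomicU64::new(0)),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Returns false when the ring is full (the subscriber is lagging).
    fn send(&self, v: u64) -> bool {
        let head = self.head.load(Ordering::Relaxed);
        if head - self.tail.load(Ordering::Acquire) == CAP {
            return false;
        }
        self.slots[head % CAP].store(v, Ordering::Relaxed);
        self.head.store(head + 1, Ordering::Release); // publish the slot
        true
    }

    fn recv(&self) -> Option<u64> {
        let tail = self.tail.load(Ordering::Relaxed);
        if tail == self.head.load(Ordering::Acquire) {
            return None; // empty
        }
        let v = self.slots[tail % CAP].load(Ordering::Relaxed);
        self.tail.store(tail + 1, Ordering::Release); // free the slot
        Some(v)
    }
}

fn main() {
    let ring = Arc::new(Ring::new());
    let tx = Arc::clone(&ring);
    let producer = thread::spawn(move || {
        for i in 1..=5 {
            while !tx.send(i) {} // spin if full
        }
    });
    producer.join().unwrap();

    let mut got = Vec::new();
    while let Some(v) = ring.recv() {
        got.push(v);
    }
    assert_eq!(got, vec![1, 2, 3, 4, 5]);
}
```

The acquire/release pairing on `head` and `tail` is what makes this safe without locks: the consumer's acquire load of `head` observes the slot write that preceded the producer's release store.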

Why Transparent Same-Process vs Cross-Process Selection

HORUS automatically detects whether a publisher and subscriber are in the same process or different processes and selects the fastest transport: direct pointer handoff (~3ns) for same-thread, lock-free ring buffer (~18ns) for same-process, or shared memory (~50ns) for cross-process. Users write the same Topic::new("name") call regardless. This means code that works in a single-process prototype deploys to a multi-process production system with zero changes. The transport upgrades and downgrades transparently as participants join and leave — splitting a monolith into separate processes does not require code changes.

Trade-offs

| Area | Benefit | Cost |
|---|---|---|
| Auto-discovery | Zero configuration; processes connect by topic name alone; start/stop in any order | No explicit topology — harder to audit which processes are connected without horus topic list |
| No broker | Sub-microsecond latency; no single point of failure; no extra process to deploy | Single-machine only — cross-machine communication requires an explicit network bridge |
| Transparent transport | Same code works in single-process and multi-process; zero migration cost | Users cannot force a specific transport backend; automatic selection may surprise during debugging |
| Process isolation | One crash does not take down the system; independent restart and upgrade | Higher baseline latency (~50ns cross-process vs ~3ns same-thread); shared memory files persist after exit and need cleanup |
| Shared memory persistence | Fast reconnection — no handshake needed when a process restarts | Stale files from crashes are auto-cleaned on next startup; horus clean --shm for manual override |
| Independent schedulers | Each process can run at its own tick rate with its own ordering | No cross-process deterministic ordering — sensor-to-actuator chains across processes depend on timing, not scheduler order |

See Also