LNet Multi-Rail Failover Examples
LNet Multi-Rail (MR) enables failover across multiple network interfaces or networks, providing high availability and bandwidth aggregation in Lustre. This guide provides failover examples for Lustre 2.17.0 (as of January 2026). Failover occurs automatically via health monitoring (introduced in 2.12, enhanced in 2.15+). The configurations apply similarly to clients and servers; servers typically add shared storage for OST/MDT failover. Based on the Lustre Operations Manual (updated 2025) and community resources such as OpenFabrics presentations (2020) and troubleshooting guides (2024). For full details, see the Lustre Operations Manual.
Failover Mechanisms
- LNet-Level: Traffic shifts to healthy interfaces on failure (e.g., link down, timeout). Health scores (0-1000) degrade on errors; recovery via pings.
- Filesystem-Level: Clients retry I/O (default) or failout with EIO. Servers use service node failover with shared storage.
- Dynamic Discovery (2.11+): Peers exchange NIDs automatically.
- Asymmetrical Routes (2.13+): Different send/receive paths for flexibility.
- Health Monitoring: Tune sensitivity and intervals for proactive failover.
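The health mechanism above can be pictured with a toy calculation. This bash sketch is an illustration only, not LNet's actual implementation; the 1000-point scale and the health_sensitivity decrement come from the text, everything else is assumed:

```bash
# Toy model of LNet health bookkeeping: each NI starts at the maximum
# health value of 1000 and loses health_sensitivity points per transmit
# error; selection prefers the healthiest interface.
MAX_HEALTH=1000
health_sensitivity=100    # same meaning as `lnetctl set health_sensitivity`

eth0_health=$MAX_HEALTH
eth1_health=$MAX_HEALTH

# Simulate three consecutive send failures on eth0.
for _ in 1 2 3; do
    eth0_health=$(( eth0_health - health_sensitivity ))
done

# Traffic shifts to the healthier interface.
if [ "$eth1_health" -gt "$eth0_health" ]; then
    echo "failover to eth1 (health $eth1_health vs $eth0_health)"
fi
```

Recovery pings then raise a degraded interface's value back toward the maximum, at which point it rejoins selection.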
Configuration Examples
Basic Multi-Rail Setup with Failover (Single Network, Multiple Interfaces)
Node with eth0 and eth1 on tcp0; failover between interfaces.
# Add network with multiple interfaces
lnetctl net add --net tcp0 --if eth0,eth1 --peer_timeout 180 --peer_credits 8
# Verify
lnetctl net show --net tcp0 --verbose
Multi-Network Multi-Rail with Failover
Node with tcp0 (eth0) and tcp1 (eth1); additional peer NIDs are given as a comma-separated list.
# Add networks
lnetctl net add --net tcp0 --if eth0
lnetctl net add --net tcp1 --if eth1
# Add peer with failover NIDs
lnetctl peer add --prim_nid 10.10.10.2@tcp0 --nid 10.10.3.3@tcp1,10.10.4.4@tcp1
# Show peers
lnetctl peer show -v
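Peer state can be checked by parsing the peer list. The sketch below is hedged: the sample only approximates lnetctl's YAML output, whose field names vary across Lustre versions, so verify against your own `lnetctl peer show` first:

```bash
# Extract peer NIDs from (simulated) `lnetctl peer show` output.
# On a live node you would pipe the real command instead:
#   lnetctl peer show | awk -F': ' '/- nid:/ {print $2}'
sample='peer:
    - primary nid: 10.10.10.2@tcp0
      Multi-Rail: True
      peer ni:
        - nid: 10.10.3.3@tcp1
        - nid: 10.10.4.4@tcp1'

nids=$(printf '%s\n' "$sample" | awk -F': ' '/- nid:/ {print $2}')
echo "$nids"
```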
OST/MDT Service Node Failover (Server-Specific)
Configure primary and failover nodes with shared storage.
# On MDS (MDT/MGS)
mkfs.lustre --fsname=testfs --mdt --mgs --servicenode=192.168.10.2@tcp0 --servicenode=192.168.10.1@tcp0 /dev/sda1
# On OSS (OST)
mkfs.lustre --fsname=testfs --servicenode=192.168.10.20@tcp0 --servicenode=192.168.10.21@tcp0 --ost --index=0 --mgsnode=192.168.10.1@tcp0 --mgsnode=192.168.10.2@tcp0 /dev/sdb
# Client mount with failover
mount -t lustre 192.168.10.1@tcp0:192.168.10.2@tcp0:/testfs /mnt/testfs
# Replace NIDs if server addresses change (run on the MGS, with the targets stopped)
lctl replace_nids testfs-OST0000 192.168.10.20@tcp0,192.168.10.21@tcp0
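The colon-separated mount source used above can be assembled from a list of service-node NIDs. A minimal bash sketch (variable names are illustrative):

```bash
# Build the client mount source: colons separate failover service nodes,
# while commas would separate multiple NIDs of one node.
mgs_nodes=(192.168.10.1@tcp0 192.168.10.2@tcp0)
fsname=testfs

IFS=':'
mount_src="${mgs_nodes[*]}:/${fsname}"    # join array with ':'
unset IFS

echo "$mount_src"
# Then:  mount -t lustre "$mount_src" /mnt/testfs
```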
Router Resiliency with Multiple Gateways
Multiple routes for failover between networks.
# Add routes with priorities
lnetctl route add --net tcp2 --gateway 192.168.205.130@tcp1 --hop 2 --prio 1
lnetctl route add --net tcp2 --gateway 192.168.205.131@tcp1 --hop 2 --prio 2
# Enable asymmetrical routes (2.13+)
lnetctl set drop_asym_route 0
# Show routes
lnetctl route show --verbose
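The priority values above decide which gateway carries traffic first. This plain-bash sketch mimics that ordering (lower priority value wins; it is not LNet's actual selection code):

```bash
# Pick the preferred gateway among same-hop routes to tcp2:
# the route with the lowest priority value carries traffic,
# the other is held as failover.
declare -A route_prio=(
    [192.168.205.130@tcp1]=1
    [192.168.205.131@tcp1]=2
)

best=""
best_prio=9999
for gw in "${!route_prio[@]}"; do
    if [ "${route_prio[$gw]}" -lt "$best_prio" ]; then
        best=$gw
        best_prio=${route_prio[$gw]}
    fi
done
echo "preferred gateway: $best"
```

If the preferred gateway fails its health checks, LNet falls back to the remaining route.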
Health Monitoring for Proactive Failover
# Tune health
lnetctl set health_sensitivity 100 # Higher = more sensitive to failures
lnetctl set recovery_interval 1 # Ping interval in seconds
lnetctl set retry_count 3 # Retries before marking down
# Enable router checks
lnetctl set check_routers_before_use 1
lnetctl set alive_router_check_interval 60
# Monitor
lnetctl net show -v 3
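The same health tunables can be kept in a YAML file and applied with lnetctl import. This is a sketch: the key names follow lnetctl's export format and may differ between versions, so check `lnetctl export` on your release first.

```yaml
# health_tunables.yaml (sketch; verify key names with `lnetctl export`)
global:
    health_sensitivity: 100
    recovery_interval: 1
    retry_count: 3
```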
YAML Configuration Examples
Multi-Rail Network with Failover Interfaces
# net_failover.yaml
net:
    - net: tcp0
      interfaces:
          0: eth0
          1: eth1
      tunables:
          peer_timeout: 180
          peer_credits: 8
Apply: lnetctl import net_failover.yaml
Peer with Failover NIDs
# peer_failover.yaml
peer:
    - primary_nid: 10.10.10.2@tcp0
      Multi-Rail: True
      peer_ni:
          - nid: 10.10.3.3@tcp1
          - nid: 10.10.4.4@tcp1
Route with Multiple Gateways for Failover
# route_failover.yaml
route:
    - net: tcp2
      gateway: 192.168.205.130@tcp1
      hop: 2
      priority: 1
    - net: tcp2
      gateway: 192.168.205.131@tcp1
      hop: 2
      priority: 2
Best Practices
- Use ip2nets in module parameters for initial setup; use lnetctl for dynamic changes.
- Enable discovery: lnetctl set discovery 1.
- Test failover: unplug an interface, then check lctl get_param nis or lnetctl peer show.
- For servers: use --servicenode in mkfs.lustre; shared storage is essential.
- Monitor: categorize failures (local vs. remote) via statistics; adjust sensitivities accordingly.
- Client vs. server: clients rely on I/O retry; servers on active/passive HA.
For troubleshooting, see recent community guides (e.g., OpenSFS 2024 material on remote vs. local failures), and check the Lustre JIRA for updates (e.g., LU-19763 on TCP improvements).