LNet Multi-Rail Failover Examples
LNet Multi-Rail (MR) enables failover across multiple network interfaces or networks, providing high availability and bandwidth aggregation in Lustre. This guide provides failover examples for Lustre 2.17.0 (as of January 2026). Failover occurs automatically via health monitoring (introduced in 2.12, enhanced in 2.15+). The configurations apply similarly to clients and servers; servers typically add shared storage for OST/MDT failover. Based on the Lustre Operations Manual (updated 2025) and community resources such as OpenFabrics presentations (2020) and troubleshooting guides (2024). For full details, see the Lustre Operations Manual.
Failover Mechanisms
- LNet-Level: Traffic shifts to healthy interfaces on failure (e.g., link down, timeout). Health scores (0-1000) degrade on errors; recovery via pings.
- Filesystem-Level: Clients retry I/O (default) or failout with EIO. Servers use service node failover with shared storage.
- Dynamic Discovery (2.11+): Peers exchange NIDs automatically.
- Asymmetrical Routes (2.13+): Different send/receive paths for flexibility.
- Health Monitoring: Tune sensitivity and intervals for proactive failover.
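The health mechanism above can be pictured with a toy calculation. This bash sketch is an illustration only, not LNet's actual implementation; the 1000-point scale and the health_sensitivity decrement come from the text, everything else is assumed:

```bash
# Toy model of LNet health bookkeeping: each NI starts at the maximum
# health value of 1000 and loses health_sensitivity points per transmit
# error; selection prefers the healthiest interface.
MAX_HEALTH=1000
health_sensitivity=100    # same meaning as `lnetctl set health_sensitivity`

eth0_health=$MAX_HEALTH
eth1_health=$MAX_HEALTH

# Simulate three consecutive send failures on eth0.
for _ in 1 2 3; do
    eth0_health=$(( eth0_health - health_sensitivity ))
done

# Traffic shifts to the healthier interface.
if [ "$eth1_health" -gt "$eth0_health" ]; then
    echo "failover to eth1 (health $eth1_health vs $eth0_health)"
fi
```

Recovery pings then raise a degraded interface's value back toward the maximum, at which point it rejoins selection.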
Configuration Examples
Basic Multi-Rail Setup with Failover (Single Network, Multiple Interfaces)
Node with eth0 and eth1 on tcp0; failover between interfaces.
# Add network with multiple interfaces
lnetctl net add --net tcp0 --if eth0,eth1 --peer_timeout 180 --peer_credits 8
# Verify
lnetctl net show --net tcp0 --verbose
Multi-Network Multi-Rail with Failover
Node with tcp0 (eth0) and tcp1 (eth1); additional peer NIDs are given as a comma-separated list.
# Add networks
lnetctl net add --net tcp0 --if eth0
lnetctl net add --net tcp1 --if eth1
# Add peer with failover NIDs
lnetctl peer add --prim_nid 10.10.10.2@tcp0 --nid 10.10.3.3@tcp1,10.10.4.4@tcp1
# Show peers
lnetctl peer show -v
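Peer state can be checked by parsing the peer list. The sketch below is hedged: the sample only approximates lnetctl's YAML output, whose field names vary across Lustre versions, so verify against your own `lnetctl peer show` first:

```bash
# Extract peer NIDs from (simulated) `lnetctl peer show` output.
# On a live node you would pipe the real command instead:
#   lnetctl peer show | awk -F': ' '/- nid:/ {print $2}'
sample='peer:
    - primary nid: 10.10.10.2@tcp0
      Multi-Rail: True
      peer ni:
        - nid: 10.10.3.3@tcp1
        - nid: 10.10.4.4@tcp1'

nids=$(printf '%s\n' "$sample" | awk -F': ' '/- nid:/ {print $2}')
echo "$nids"
```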
OST/MDT Service Node Failover (Server-Specific)
Configure primary and failover nodes with shared storage.
# On MDS (MDT/MGS)
mkfs.lustre --fsname=testfs --mdt --mgs --servicenode=192.168.10.2@tcp0 --servicenode=192.168.10.1@tcp0 /dev/sda1
# On OSS (OST)
mkfs.lustre --fsname=testfs --servicenode=192.168.10.20@tcp0 --servicenode=192.168.10.21@tcp0 --ost --index=0 --mgsnode=192.168.10.1@tcp0 --mgsnode=192.168.10.2@tcp0 /dev/sdb
# Client mount with failover
mount -t lustre 192.168.10.1@tcp0:192.168.10.2@tcp0:/testfs /mnt/testfs
# Replace NIDs if server addresses change (run on the MGS, with the targets stopped)
lctl replace_nids testfs-OST0000 192.168.10.20@tcp0,192.168.10.21@tcp0
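The colon-separated mount source used above can be assembled from a list of service-node NIDs. A minimal bash sketch (variable names are illustrative):

```bash
# Build the client mount source: colons separate failover service nodes,
# while commas would separate multiple NIDs of one node.
mgs_nodes=(192.168.10.1@tcp0 192.168.10.2@tcp0)
fsname=testfs

IFS=':'
mount_src="${mgs_nodes[*]}:/${fsname}"    # join array with ':'
unset IFS

echo "$mount_src"
# Then:  mount -t lustre "$mount_src" /mnt/testfs
```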
Router Resiliency with Multiple Gateways
Multiple routes for failover between networks.
# Add routes with priorities
lnetctl route add --net tcp2 --gateway 192.168.205.130@tcp1 --hop 2 --prio 1
lnetctl route add --net tcp2 --gateway 192.168.205.131@tcp1 --hop 2 --prio 2
# Enable asymmetrical routes (2.13+)
lnetctl set drop_asym_route 0
# Show routes
lnetctl route show --verbose
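The priority values above decide which gateway carries traffic first. This plain-bash sketch mimics that ordering (lower priority value wins; it is not LNet's actual selection code):

```bash
# Pick the preferred gateway among same-hop routes to tcp2:
# the route with the lowest priority value carries traffic,
# the other is held as failover.
declare -A route_prio=(
    [192.168.205.130@tcp1]=1
    [192.168.205.131@tcp1]=2
)

best=""
best_prio=9999
for gw in "${!route_prio[@]}"; do
    if [ "${route_prio[$gw]}" -lt "$best_prio" ]; then
        best=$gw
        best_prio=${route_prio[$gw]}
    fi
done
echo "preferred gateway: $best"
```

If the preferred gateway fails its health checks, LNet falls back to the remaining route.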
Health Monitoring for Proactive Failover
# Tune health
lnetctl set health_sensitivity 100 # Higher = more sensitive to failures
lnetctl set recovery_interval 1 # Ping interval in seconds
lnetctl set retry_count 3 # Retries before marking down
# Enable router checks
lnetctl set check_routers_before_use 1
lnetctl set alive_router_check_interval 60
# Monitor
lnetctl net show -v 3
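The same health tunables can be kept in a YAML file and applied with lnetctl import. This is a sketch: the key names follow lnetctl's export format and may differ between versions, so check `lnetctl export` on your release first.

```yaml
# health_tunables.yaml (sketch; verify key names with `lnetctl export`)
global:
    health_sensitivity: 100
    recovery_interval: 1
    retry_count: 3
```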
YAML Configuration Examples
Multi-Rail Network with Failover Interfaces
# net_failover.yaml
net:
    - net: tcp0
      interfaces:
          0: eth0
          1: eth1
      tunables:
          peer_timeout: 180
          peer_credits: 8
Apply: lnetctl import net_failover.yaml
Peer with Failover NIDs
# peer_failover.yaml
peer:
    - primary_nid: 10.10.10.2@tcp0
      Multi-Rail: True
      peer_ni:
          - nid: 10.10.3.3@tcp1
          - nid: 10.10.4.4@tcp1
Route with Multiple Gateways for Failover
# route_failover.yaml
route:
    - net: tcp2
      gateway: 192.168.205.130@tcp1
      hop: 2
      priority: 1
    - net: tcp2
      gateway: 192.168.205.131@tcp1
      hop: 2
      priority: 2
Best Practices
- Use ip2nets in module parameters for initial setup; use lnetctl for dynamic changes.
- Enable discovery: lnetctl set discovery 1.
- Test failover: unplug an interface, then check lctl get_param nis or lnetctl peer show.
- For servers: use --servicenode in mkfs.lustre; shared storage is essential.
- Monitor: categorize failures (local vs. remote) via statistics; adjust sensitivities accordingly.
- Client vs. server: clients rely on I/O retry; servers on active/passive HA.
For troubleshooting, see recent community guides (e.g., OpenSFS 2024 material on remote vs. local failures), and check the Lustre JIRA for updates (e.g., LU-19763 on TCP improvements).