Unified Fabric Manager (UFM)
NVIDIA Unified Fabric Manager (UFM) is the management platform for InfiniBand (IB) switch fabrics. Bridge integrates with UFM to automate IB fabric discovery, topology validation, and per-tenant Partition Key (PKEY) provisioning.
UFM Onboarding
During Day 0 network setup, the NCP Admin onboards the UFM appliance into Bridge as part of the network discovery workflow:
- Bridge connects to the UFM appliance over the OOB network.
- UFM discovers the entire InfiniBand fabric — all switches, compute nodes, and inter-switch links.
- Bridge imports the discovered topology and registers the IB fabric in the infrastructure inventory.
- Optionally, the NCP Admin triggers topology validation to confirm that the discovered fabric matches the intended design.
The UFM appliance must be reachable from the Bridge management server over the OOB network before onboarding.
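The reachability requirement can be checked before onboarding begins. The following is a minimal sketch, not part of Bridge: the function name `is_reachable` is hypothetical, and port 443 is an assumption (an HTTPS management endpoint), since the port UFM listens on is not stated here.

```python
import socket


def is_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical pre-flight helper; the default port is an assumption.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from the Bridge management server, a call such as `is_reachable("<ufm-oob-address>")` returning `False` would indicate that the OOB path to the appliance should be fixed first. A successful TCP connection only shows that the port is open; it does not confirm that UFM itself is healthy.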
Fabric Discovery
UFM uses its InfiniBand subnet manager (SM), which sweeps the fabric with subnet management packets, to map the InfiniBand topology:
- Identifies all IB switches and their inter-switch links.
- Discovers compute node IB interfaces and their Globally Unique Identifiers (GUIDs).
- Builds a complete topology map of the Clos fabric, including the spine and leaf layers.
Bridge stores the discovered topology and uses it for PKEY assignment and compute allocation decisions.
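To make the stored topology concrete, here is a minimal in-memory sketch of the three things discovery produces: switches with their layer, inter-switch links, and compute node GUIDs. The class `FabricTopology`, its fields, and the sample names and GUID are all hypothetical; Bridge's actual inventory schema is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class FabricTopology:
    """Illustrative record of a discovered IB fabric (names are assumptions)."""
    switches: dict = field(default_factory=dict)    # switch name -> "spine" or "leaf"
    links: set = field(default_factory=set)         # inter-switch links as frozenset({a, b})
    node_guids: dict = field(default_factory=dict)  # compute node name -> list of port GUIDs

    def add_switch(self, name: str, role: str) -> None:
        if role not in ("spine", "leaf"):
            raise ValueError(f"unknown switch role: {role!r}")
        self.switches[name] = role

    def add_link(self, a: str, b: str) -> None:
        for end in (a, b):
            if end not in self.switches:
                raise KeyError(f"unknown switch: {end}")
        # A frozenset makes the link direction-independent: (a, b) == (b, a).
        self.links.add(frozenset((a, b)))

    def neighbors(self, switch: str) -> set:
        """Return the switches directly linked to the given switch."""
        return {peer for link in self.links if switch in link
                for peer in link if peer != switch}


topo = FabricTopology()
topo.add_switch("spine-1", "spine")
topo.add_switch("leaf-1", "leaf")
topo.add_switch("leaf-2", "leaf")
topo.add_link("spine-1", "leaf-1")
topo.add_link("leaf-2", "spine-1")
topo.node_guids["node-a"] = ["0x0000000000000a01"]  # placeholder GUID
```

Storing links as unordered pairs means that rediscovering the same cable from the other end does not create a duplicate entry.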
Partition Key (PKEY) Management
Tenant PKEY Creation
When the NCP Admin creates a new tenant in Bridge, UFM automatically provisions a unique Partition Key (PKEY) for that tenant. PKEYs enforce isolated virtual networks within the IB fabric: only ports that are members of the same partition can communicate with each other.
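The sketch below illustrates what "a unique PKEY per tenant" involves. It relies on the InfiniBand P_Key layout: a 16-bit value whose low 15 bits are the key base and whose top bit marks full membership, with base 0x0000 invalid and base 0x7FFF used by the default partition. The class `PkeyAllocator` and its sequential allocation policy are assumptions for illustration; how UFM actually chooses values is not described here.

```python
class PkeyAllocator:
    """Hands out one unique 15-bit PKEY base value per tenant (illustrative)."""

    FIRST = 0x0001            # base value 0x0000 is invalid
    DEFAULT_BASE = 0x7FFF     # used by the default partition
    FULL_MEMBER_BIT = 0x8000  # top bit set means full membership

    def __init__(self) -> None:
        self._by_tenant = {}
        self._next = self.FIRST

    def allocate(self, tenant: str) -> int:
        # Idempotent: the same tenant always gets the same PKEY back.
        if tenant in self._by_tenant:
            return self._by_tenant[tenant]
        if self._next >= self.DEFAULT_BASE:
            raise RuntimeError("no free PKEY base values left")
        pkey = self._next
        self._next += 1
        self._by_tenant[tenant] = pkey
        return pkey

    @classmethod
    def as_full_member(cls, pkey_base: int) -> int:
        """Set the membership bit, marking a full (not limited) member."""
        return pkey_base | cls.FULL_MEMBER_BIT


alloc = PkeyAllocator()
tenant_a = alloc.allocate("tenant-a")
tenant_b = alloc.allocate("tenant-b")
print(hex(tenant_a), hex(tenant_b), hex(PkeyAllocator.as_full_member(tenant_a)))
# prints: 0x1 0x2 0x8001
```

Because the default partition's base value is never handed out, tenant partitions stay separate from it in this sketch.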
Compute Allocation and GUID Registration
When a compute instance (bare metal or virtual machine) is allocated to a tenant, Bridge registers the compute node's IB interface GUIDs to the tenant's PKEY:
| Event | Action |
|---|---|
| Tenant created | UFM provisions a unique PKEY |
| Compute allocated | Bridge registers compute node GUIDs to tenant PKEY |
| Compute deallocated | Bridge removes GUIDs from tenant PKEY |
This lifecycle ensures that allocated compute resources can communicate within the tenant's isolated IB network, and that deallocation immediately revokes IB network membership.
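The allocation lifecycle in the table can be sketched as a small membership registry. This is an in-memory model under stated assumptions, not the Bridge or UFM API: `PartitionRegistry`, its method names, and the GUID strings are hypothetical.

```python
class PartitionRegistry:
    """Tracks which GUIDs belong to which PKEY (illustrative model only)."""

    def __init__(self) -> None:
        self._members = {}  # pkey -> set of GUIDs

    def create_partition(self, pkey: int) -> None:
        # Tenant created: an empty partition now exists.
        self._members.setdefault(pkey, set())

    def register(self, pkey: int, guids) -> None:
        # Compute allocated: raises KeyError if the partition does not exist.
        self._members[pkey].update(guids)

    def remove(self, pkey: int, guids) -> None:
        # Compute deallocated: membership is revoked.
        self._members[pkey].difference_update(guids)

    def can_communicate(self, guid_a: str, guid_b: str) -> bool:
        """True only if both GUIDs are members of at least one common partition."""
        return any(guid_a in m and guid_b in m for m in self._members.values())


reg = PartitionRegistry()
reg.create_partition(0x0001)                  # tenant A
reg.create_partition(0x0002)                  # tenant B
reg.register(0x0001, ["guid-a1", "guid-a2"])  # two nodes allocated to tenant A
reg.register(0x0002, ["guid-b1"])             # one node allocated to tenant B
```

With this state, `reg.can_communicate("guid-a1", "guid-a2")` is true and `reg.can_communicate("guid-a1", "guid-b1")` is false. After `reg.remove(0x0001, ["guid-a2"])`, the first check becomes false as well, which mirrors the deallocation row of the table.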
PKEY to Converged Network Mapping
Each tenant's PKEY is mapped to a corresponding Converged VLAN ID or VRF, enabling connectivity between the InfiniBand compute fabric and the Ethernet Converged Network (storage, in-band management, and external access).
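One way to picture this mapping is as a per-tenant record keyed by PKEY. The sketch below is an assumption about shape only: `ConvergedMapping`, its fields, and the sample values are hypothetical. The VLAN range check reflects IEEE 802.1Q, where IDs 0 and 4095 are reserved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConvergedMapping:
    """Links a tenant PKEY to its Converged Network VLAN ID or VRF (illustrative)."""
    pkey: int
    vlan_id: Optional[int] = None
    vrf: Optional[str] = None

    def __post_init__(self) -> None:
        if self.vlan_id is None and self.vrf is None:
            raise ValueError("a mapping needs a VLAN ID or a VRF")
        if self.vlan_id is not None and not 1 <= self.vlan_id <= 4094:
            raise ValueError("VLAN ID must be in the range 1..4094")


# Index the mappings by PKEY so a tenant's Ethernet side can be looked up.
mappings = {
    m.pkey: m
    for m in (
        ConvergedMapping(pkey=0x0001, vlan_id=101),
        ConvergedMapping(pkey=0x0002, vrf="tenant-b"),
    )
}
```

Validating at construction time means that an out-of-range VLAN ID is rejected before it can be attached to a tenant.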
Topology Validation
Bridge exposes UFM topology validation to the NCP Admin. When triggered, UFM verifies that:
- The physical IB fabric matches the expected topology design.
- All inter-switch links are active and correctly connected.
- No unexpected topology deviations are present.
Topology validation can identify cabling errors, failed links, and configuration drift before they affect tenant workloads.
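The core of such a check is a comparison between the intended link set and the discovered one. The following sketch shows that comparison as a plain set difference; it is not UFM's validation logic, and the names `link`, `validate_topology`, and the `switch:port` endpoint strings are hypothetical.

```python
def link(a: str, b: str) -> frozenset:
    """Represent a cable as an unordered pair of 'switch:port' endpoints."""
    return frozenset((a, b))


def validate_topology(expected: set, discovered: set) -> dict:
    """Compare the intended design with the discovered fabric."""
    missing = expected - discovered      # designed links that are absent or down
    unexpected = discovered - expected   # links that the design does not contain
    return {
        "passed": not missing and not unexpected,
        "missing": sorted(sorted(l) for l in missing),
        "unexpected": sorted(sorted(l) for l in unexpected),
    }


expected = {
    link("spine-1:p1", "leaf-1:p1"),
    link("spine-1:p2", "leaf-2:p1"),
}
discovered = {
    link("spine-1:p1", "leaf-1:p1"),
    link("spine-1:p3", "leaf-2:p1"),  # leaf-2 cabled to the wrong spine port
}
report = validate_topology(expected, discovered)
```

In this example `report["passed"]` is false. A miscabled link appears once under `missing` and once under `unexpected`, whereas a failed link appears only under `missing`.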
Related Pages
- InfiniBand Overview — IB network isolation architecture and UFM integration overview
- Networking Overview — Full tenant network isolation model including Ethernet and InfiniBand