Bridge Tenant Guide
Purpose
Bridge provides self-service, on-demand access to secure, high-performance GPU infrastructure and AI platforms — without requiring tenants to manage hardware, networking, or cloud operations.
Overview
This guide explains how tenants manage and use resources on the Bridge platform.
Key Capabilities
As a Tenant Admin or Tenant User (both roles are explained in the following sections), you can:
- Allocate Resources - Allocate bare metal servers and virtual machines
- Create Clusters - Set up Slurm, JupyterHub, and Kubernetes clusters
- Deploy Models - Deploy ML/AI models for inference
- Configure GPU - Set up NVIDIA Multi-Instance GPU (MIG) profiles
- Manage Endpoints - Create endpoints for services
- Access Jupyter - Run interactive notebooks with GPU access
- Deploy Applications - Deploy custom workloads and applications
- Scale Infrastructure - Scale clusters up or down based on demand
Guide Structure
This guide is organized into the following sections:
- Dashboard - Overview of tenant resources
- Resource Allocation - Allocate servers and infrastructure
- Cluster Management - Create and manage different cluster types
- GPU Configuration - Configure MIG profiles
- Endpoints & Services - Create service endpoints
- Model Deployment - Deploy ML models
- Jupyter Access - Use Jupyter notebooks
- Workload Management - Deploy and manage workloads
- Application Management - Deploy applications
- Cleanup - Delete resources
Bridge — Tenant Overview
Tenants consume GPUs as a service with:
- Clear boundaries
- Predictable performance
- Full usage visibility
Tenant Roles
Bridge supports two tenant personas:
- Tenant Admin - Manages users, quotas, and services
- Tenant User - Consumes compute, platforms, and AI services
What Tenants Get
1. Isolated GPU Infrastructure
Each tenant receives infrastructure that is:
- Logically isolated with hard boundaries
- Equivalent to a private GPU cloud
- Consistent across bare metal, VMs, and Kubernetes
Tenants do not share any infrastructure resources (GPUs, networking, storage, etc.).
2. Multiple Consumption Models
Bare Metal GPU Instances as a Service (BMaaS)
- On-demand dedicated GPU servers
- Ideal for large training or regulated workloads
- Can form clusters or supercomputers
- Fully isolated from other Tenants
- Part of Tenant VPC and Subnet(s)
Virtual Machines with GPUs as a Service (VMaaS)
- On-demand GPU VMs
- GPU passthrough and fractional GPUs (MIG)
- Suitable for development and inference
- Fully isolated from other Tenants
- Part of Tenant VPC and Subnet(s)
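MIG profile names follow NVIDIA's convention of compute slices and memory size, for example 1g.5gb or 3g.20gb on a 40 GB A100. The sketch below is illustrative only and is not part of Bridge: the helper names (parse_mig_profile, within_compute_budget) are hypothetical, and the default budget of 7 compute slices is an assumption that matches A100 and H100 class GPUs. The budget check is a necessary condition only; real MIG placement has additional rules that this sketch does not model.

```python
import re
from typing import Iterable, NamedTuple


class MigProfile(NamedTuple):
    compute_slices: int  # the "Ng" part of the profile name
    memory_gb: int       # the "Mgb" part of the profile name


# Matches basic profile names such as "1g.5gb" or "3g.20gb".
_PROFILE_RE = re.compile(r"^(\d+)g\.(\d+)gb$")


def parse_mig_profile(name: str) -> MigProfile:
    """Split a MIG profile name into compute slices and memory (GB)."""
    match = _PROFILE_RE.match(name.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized MIG profile name: {name!r}")
    return MigProfile(int(match.group(1)), int(match.group(2)))


def within_compute_budget(names: Iterable[str], total_slices: int = 7) -> bool:
    """Rough check: do the requested profiles fit the GPU's compute slices?

    Assumption: 7 compute slices per GPU. Passing this check does not
    guarantee that the combination is a valid MIG placement.
    """
    used = sum(parse_mig_profile(n).compute_slices for n in names)
    return used <= total_slices
```

A request for one 3g.20gb and one 4g.20gb instance uses all 7 slices and passes the check, while two 4g.20gb instances would need 8 and fail it.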
Platform-as-a-Service (PaaS)
- Managed Kubernetes clusters
- Autoscaling based on GPU utilization
- Deploy applications from a catalog (Marketplace)
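This guide does not specify how Bridge computes scaling decisions. As a rough illustration of utilization-based autoscaling in general, the sketch below uses the proportional rule from the Kubernetes Horizontal Pod Autoscaler: scale the current size by the ratio of observed to target utilization, then clamp to configured bounds. The function name, parameters, and default bounds are assumptions made for this example.

```python
import math


def desired_nodes(
    current_nodes: int,
    gpu_util_pct: int,
    target_util_pct: int,
    min_nodes: int = 1,
    max_nodes: int = 8,
) -> int:
    """Proportional scaling rule (hypothetical helper, not a Bridge API).

    desired = ceil(current * observed_utilization / target_utilization),
    clamped to the range [min_nodes, max_nodes].
    """
    if target_util_pct <= 0:
        raise ValueError("target_util_pct must be positive")
    raw = math.ceil(current_nodes * gpu_util_pct / target_util_pct)
    return max(min_nodes, min(max_nodes, raw))
```

With 4 nodes at 90% average GPU utilization and a 60% target, the rule asks for 6 nodes; at 30% utilization it asks for 2.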
3. AI & Model Services (Self-Service)
Tenants can:
- Select models from curated catalogs (Hugging Face, NVIDIA NIM, private repositories)
- Deploy inference endpoints
- Run fine-tuning and batch jobs
- Deploy Jupyter Notebooks on KAI
All services are exposed through simple UI workflows and APIs.
Tenant Admin Capabilities
Tenant Admins manage:
- Tenant users and roles
- Usage limits and quotas
- Compute, clusters, and storage provisioning
- Model catalogs and integrations
- Monitoring, usage, performance, and cost tracking
- Alerts and billing controls
Physical infrastructure and shared fabric remain abstracted.
Tenant End User Experience
Tenant Users can:
- Provision GPU instances or platforms
- Submit training, inference, or HPC jobs
- Access Jupyter Notebooks or LLM endpoints
- Monitor job status and resource usage
- Access logs and performance metrics
Users focus on workloads — not infrastructure.
Tenant Value Proposition
With Bridge, tenants gain:
- Predictable, isolated GPU access without CapEx
- Flexible infrastructure, platform, and AI services
- Faster time-to-model and time-to-inference
- Enterprise-grade security and observability
- A consistent experience across development, training, and production
Getting Started
Start with the Dashboard Overview to see your tenant resources, then proceed to allocate infrastructure and create your first cluster.