Kubeflow
Overview
The Kubeflow template provisions a complete machine learning operations (MLOps) platform on your Kubernetes cluster. The deployment includes the full Kubeflow ecosystem: Central Dashboard, Notebooks, Pipelines, and Katib.
Accessing the cluster
Bridge provides two ways to work with your cluster after it is created:
- Download kubeconfig — Download the cluster kubeconfig file from the cluster menu. Use it to access the cluster from your local machine or from external tools by setting the KUBECONFIG environment variable.
- Kubectl Terminal — Interact directly with the Kubernetes cluster from the Bridge UI. Use this to:
  - Run kubectl commands directly from the UI to apply any additional Kubeflow configuration
  - Monitor the 50+ microservices that make up the Kubeflow stack
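As a minimal sketch of the kubeconfig option, the following assumes the file was saved to a hypothetical path under ~/Downloads (the actual file name and location depend on your browser and cluster) and that kubectl is installed on your local machine.

```shell
# Hypothetical download location; override KUBECONFIG_FILE to match where you saved the file.
KUBECONFIG_FILE="${KUBECONFIG_FILE:-$HOME/Downloads/kubeflow-kubeconfig.yaml}"

if [ -f "$KUBECONFIG_FILE" ]; then
  # Point kubectl at the downloaded cluster credentials for this shell session.
  export KUBECONFIG="$KUBECONFIG_FILE"
  kubectl get nodes -o wide   # confirm the API server is reachable
  kubectl get namespaces      # list the namespaces created by the template
else
  echo "kubeconfig not found: $KUBECONFIG_FILE" >&2
fi
```

The export only affects the current shell session, so other clusters configured on the same machine are left untouched.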
Prerequisites
- Tenant Admin access — Log in as a Tenant Admin to create clusters.
- Compute resources — A minimum of 16 GB RAM and 4 CPUs per node is recommended for Kubeflow.
- Port-forward on Bridge node — For cluster creation and initial access to succeed, the IngressGateway must be reachable. Run the following port-forward on the Bridge node:
nohup kubectl -n istio-system port-forward --address 0.0.0.0 svc/istio-ingressgateway 443:443 > pf.log 2>&1 &
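To check that the port-forward actually started, one option is to look in pf.log for the "Forwarding from" line that kubectl port-forward writes once it is listening. The helper name pf_is_forwarding is hypothetical, and the sketch assumes it is run from the directory that contains pf.log.

```shell
# Return 0 if the given log file shows that kubectl began forwarding.
pf_is_forwarding() {
  grep -q "Forwarding from" "$1" 2>/dev/null
}

if pf_is_forwarding pf.log; then
  echo "port-forward is active"
else
  echo "port-forward has not started; inspect pf.log for errors" >&2
fi
```

If the log shows a permission error instead, note that binding to port 443 usually requires elevated privileges on Linux.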
Create a Kubeflow Cluster
Step 1: Start Cluster Creation
- Log in to Armada Bridge as a Tenant Admin.
- In the left sidebar, click Compute → Cluster.
- Click Create Cluster.
Step 2: Configure Cluster Details
- Select the Kubernetes version (v1.25 or higher is recommended).
- Select the CNI plugin (Cilium is preferred for Istio compatibility).
- (Optional) Enable the Install NVIDIA GPU tools option to support GPU-accelerated Notebooks and training.
- Click Next.
Step 3: Select Cluster Template
- Choose Kubeflow Template.
- Click Next.
Step 4: Select Nodes and Create
- Select the cluster node(s) (Bare Metal or Virtual Machine).
- Click Create to start cluster creation.
Step 5: Monitor Cluster Creation
Wait until the status is Running.
Note: Kubeflow deployment is resource-intensive. Even after the cluster status shows Running, it may take an additional 5-10 minutes for all Kubeflow components to initialize.
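Rather than waiting a fixed time, the initialization period can be polled. The sketch below is one possible approach, not part of the Bridge tooling: the function names count_unsettled_pods and wait_for_pods are hypothetical, and it assumes the components run in the kubeflow namespace and that kubectl already points at the cluster.

```shell
# Count pods that are neither Running nor Completed.
# Expects `kubectl get pods --no-headers` output on stdin (STATUS is column 3).
count_unsettled_pods() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Poll a namespace until every pod has settled, or give up after `tries` attempts.
# Defaults: 60 attempts, 10 seconds apart (about 10 minutes).
wait_for_pods() {
  ns="${1:-kubeflow}"; tries="${2:-60}"; delay="${3:-10}"
  while [ "$tries" -gt 0 ]; do
    pods=$(kubectl get pods -n "$ns" --no-headers 2>/dev/null)
    # An empty listing means the namespace is not populated yet, so keep waiting.
    if [ -n "$pods" ] && [ "$(printf '%s\n' "$pods" | count_unsettled_pods)" -eq 0 ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait_for_pods kubeflow && echo "Kubeflow pods have settled"
```

A Running status does not guarantee that every container in a pod is ready, so treat a successful return as a signal to try the dashboard rather than as proof that every component is healthy.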
Post-Deployment Configuration
Step 6: Map to Hostname
If kubeflow.armada.ai is not resolvable through your corporate or public DNS, point your local machine to the cluster Ingress manually by adding an entry to your /etc/hosts file:
<VM_PUBLIC_IP> kubeflow.armada.ai
Replace <VM_PUBLIC_IP> with the public IP of the node or the LoadBalancer IP provided in the Cluster Overview.
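To avoid adding the same hosts entry twice, the mapping can be generated only when the hostname is not already present. This is a sketch under assumptions: hosts_line_if_missing is a hypothetical helper, and 203.0.113.10 is a placeholder documentation address standing in for <VM_PUBLIC_IP>.

```shell
# Print "<ip> <hostname>" only if the hostname is not already mapped in the hosts file.
# Arguments: ip, hostname, optional hosts file (defaults to /etc/hosts).
hosts_line_if_missing() {
  ip="$1"; host="$2"; file="${3:-/etc/hosts}"
  if ! awk -v h="$host" '
      $1 !~ /^#/ {
        for (i = 2; i <= NF; i++) {
          if ($i ~ /^#/) break        # ignore trailing comments
          if ($i == h) found = 1      # exact hostname match
        }
      }
      END { exit !found }' "$file"; then
    printf '%s %s\n' "$ip" "$host"
  fi
}

# Usage (placeholder IP; substitute your own):
#   hosts_line_if_missing 203.0.113.10 kubeflow.armada.ai | sudo tee -a /etc/hosts
```

Because the function only prints the line, the write to /etc/hosts stays an explicit, privileged step in the usage command.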
Before opening the dashboard, verify that the Kubeflow pods are running:
kubectl get pods -n kubeflow
Step 7: Access the Dashboard
- Open your web browser and navigate to https://kubeflow.armada.ai.
- Log in using your tenant credentials (Dex/OIDC). Contact your administrator for login credentials.
- Upon successful authentication, you will be redirected to the Kubeflow Central Dashboard.
To learn more about using Kubeflow for MLOps, refer to the official Kubeflow documentation.