
Cluster Scaling

Overview

Cluster scaling in Armada Bridge lets you increase or decrease the number of worker nodes in an existing cluster without recreating it.

  • Scale up adds worker nodes to increase capacity for higher workloads.
  • Scale down removes worker nodes to reduce resource usage when demand decreases.

Scaling changes only the worker node count. Existing cluster configuration, networking, and workloads remain intact.

Scaling Strategies

Horizontal Scaling

Add or remove entire nodes from your cluster:

  • Best for handling changes in overall workload volume
  • Distributes load across more (or fewer) nodes
  • Preserves per-node performance characteristics

Vertical Scaling

Adjust resources within existing nodes:

  • Modify memory allocation
  • Change CPU allocation
  • Adjust GPU assignment

Scale Cluster Up

Prerequisites

  • Tenant Admin access — Log in as a Tenant Admin to scale clusters.
  • Running cluster — The cluster must be in a healthy running state.
  • Available resources for scale up — Ensure sufficient Bare Metal/VM resources are allocated to the tenant before increasing worker count.
  • Workload readiness for scale down — Verify workloads can tolerate reduced worker capacity before decreasing node count.
  • Monitoring data — Review utilization trends to inform the scaling decision.

Steps to Scale Up

Step 1: Update Worker Count

  1. Click the cluster name or the ellipsis (three-dot) menu, then click Scale Cluster.

    Scale Up Option

  2. Set a worker node count that is greater than the current count.

  3. Choose Node Type (Bare Metal or Virtual Machine), then click Save Changes.

note

Worker node count must not exceed your allocated Bare Metal/Virtual Machine resources.

Scale Up Count

Step 2: Confirm and Monitor Scale Up

  1. Click Confirm to start scale up.

    Scale Up Confirmation

  2. Monitor cluster status. During scale up, status shows Scaling ↑.

    Scale Up State

  3. After scale up completes, confirm worker node count has increased (for example, from 1 to 2).

    Scale Up Node

Step 3: Verify Cluster Health

  1. Open Kubectl Terminal.

  2. Verify the cluster is healthy and ready for workloads.

    Scale Up K8s Terminal
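From the Kubectl Terminal, health can be checked with standard kubectl commands (output varies by cluster):

```shell
# List nodes and confirm all show STATUS "Ready",
# including the newly added worker(s)
kubectl get nodes

# Confirm system pods are running across all nodes
kubectl get pods -n kube-system -o wide
```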

New nodes will be:

  1. Initialized
  2. Joined to the cluster
  3. Configured with cluster software
  4. Made available for workloads
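This lifecycle can be observed live from the Kubectl Terminal using a standard kubectl command (node names vary per cluster):

```shell
# Watch new workers appear and transition from NotReady to Ready
kubectl get nodes --watch
```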

Scale Cluster Down

Considerations Before Scaling Down

  • Ensure workloads can migrate safely
  • Drain jobs from nodes to be removed
  • Verify no persistent data will be lost
  • Plan for temporary service interruption
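Draining can be done with standard kubectl commands before starting the scale down; the node name below is illustrative:

```shell
# Stop new pods from scheduling onto the node being removed
kubectl cordon <worker-node-name>

# Evict existing pods, ignoring DaemonSet-managed pods and
# deleting pods that use only emptyDir storage
kubectl drain <worker-node-name> --ignore-daemonsets --delete-emptydir-data
```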

Step 1: Update Worker Count

  1. Click the cluster name or the ellipsis (three-dot) menu, then click Scale Cluster.

    Scale Down Option

  2. Set a worker node count that is less than the current count.

  3. Click Save Changes.

    Scale Down Pop Up

Step 2: Confirm and Monitor Scale Down

  1. Click Confirm on the confirmation pop-up.

    Scale Down Confirmation

  2. Monitor cluster status. During scale down, status shows Scaling ↓.

    Scale Down State

  3. After scale down completes, confirm worker node count is reduced (for example, from 2 to 1).

    Scale Down Worker

Step 3: Verify Cluster Health

  1. Open Kubectl Terminal.

  2. Confirm the cluster is healthy after node removal.

    Scale Down K8s Terminal
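One way to confirm health after node removal, using standard kubectl commands (output varies by cluster):

```shell
# Confirm the removed worker is gone and remaining nodes are Ready
kubectl get nodes

# Check that evicted pods were rescheduled and are Running
kubectl get pods -A -o wide
```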

Auto-Scaling (If Available)

Configure Auto-Scaling

Some clusters support automatic scaling:

  1. Navigate to Cluster Settings
  2. Select Auto-Scaling
  3. Configure parameters:

Auto-Scaling Config

  • Minimum nodes
  • Maximum nodes
  • Scale-up threshold
  • Scale-down threshold
  • Cooldown period
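As a sketch, the parameters above might map to a configuration like the following; the field names are illustrative, not the actual Armada Bridge schema:

```yaml
# Hypothetical auto-scaling configuration (illustrative field names)
autoscaling:
  minNodes: 2            # never scale below this count
  maxNodes: 10           # never scale above this count
  scaleUpThreshold: 80   # add a node when utilization exceeds 80%
  scaleDownThreshold: 30 # remove a node when utilization drops below 30%
  cooldownSeconds: 300   # wait 5 minutes between scaling actions
```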

How Auto-Scaling Works

  • Monitors resource utilization
  • Automatically adds nodes when threshold exceeded
  • Automatically removes nodes when underutilized
  • Respects min/max limits

Best Practices

Scaling Decisions

  • Monitor trends before scaling
  • Scale gradually in response to demand
  • Keep buffer capacity for spikes
  • Document scaling decisions

Timing

  • Scale during low-usage windows for scale-down
  • Plan scale-up ahead of known demand
  • Avoid scaling during critical operations
  • Allow time for node initialization

Cost Optimization

  • Scale down during off-peak hours
  • Use right-sized node types
  • Monitor actual vs allocated resources
  • Adjust quotas based on patterns

Kubernetes Specific

HPA (Horizontal Pod Autoscaler)

For pod-level scaling in Kubernetes:

kubectl autoscale deployment <deployment> --cpu-percent=80 --min=2 --max=10
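The same autoscaler can also be declared as a manifest using the standard Kubernetes autoscaling/v2 API (the deployment name and CPU target below are examples):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app        # example deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale when average CPU exceeds 80%
```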

VPA (Vertical Pod Autoscaler)

For resource recommendations:

kubectl apply -f vpa-config.yaml
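A minimal vpa-config.yaml might look like the following; the VPA requires the Vertical Pod Autoscaler components to be installed in the cluster, and the deployment name is an example:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app          # example deployment name
  updatePolicy:
    updateMode: "Off"     # "Off" = recommendations only, no automatic updates
```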

Next Steps