Intermediate
How to set up horizontal pod scaling on DigitalOcean
Quick Answer
Horizontal Pod Autoscaler (HPA) on DigitalOcean Kubernetes automatically scales your application pods based on CPU, memory, or custom metrics. Enable the metrics server, create an HPA resource with target metrics, and Kubernetes will automatically add or remove pods to meet demand.
Prerequisites
- Active DigitalOcean Kubernetes cluster
- kubectl configured for your DOKS cluster
- Basic understanding of Kubernetes pods and deployments
- Application already deployed to the cluster
Step 1: Verify the metrics server is enabled
Connect to your DigitalOcean Kubernetes cluster and check if the metrics server is running:
```
kubectl get deployment metrics-server -n kube-system
```

If it is not found, enable the Metrics Server add-on in the DigitalOcean control panel under Kubernetes > Your Cluster > Settings > Add-ons.
Tip
The metrics server is required for HPA to collect CPU and memory metrics from your pods.
Step 2: Configure resource requests for your deployment
Edit your deployment to include CPU and memory resource requests, which are required for HPA:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
```

Apply the changes:

```
kubectl apply -f deployment.yaml
```

Tip
Resource requests are mandatory for CPU-based scaling and recommended for memory-based scaling.
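Note that a `Utilization` target is a percentage of the pod's resource *request*, not its limit or the node's capacity. A quick sanity check in Python (an illustrative helper, not part of Kubernetes):

```python
def utilization_target_millicores(request_millicores, target_pct):
    """Convert an HPA Utilization target (a percentage of the pod's
    resource request) into an absolute per-pod value in millicores."""
    return request_millicores * target_pct / 100

# With the 100m CPU request above, a 70% target means the HPA reacts
# when average per-pod usage crosses roughly 70m of CPU.
print(utilization_target_millicores(100, 70))  # 70.0
```

This is why generous requests make an HPA slower to react: the same absolute usage represents a lower utilization percentage.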
Step 3: Create the Horizontal Pod Autoscaler
Create an HPA resource file named hpa.yaml:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Apply the HPA:

```
kubectl apply -f hpa.yaml
```

Tip
Start with conservative scaling limits and adjust based on your application's performance and cost requirements.
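For intuition, the controller's core calculation is roughly desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to the configured bounds. A rough Python model of that logic (a sketch of the documented algorithm, not the actual controller code; the real controller also applies a tolerance band and stabilization timing):

```python
from math import ceil

def desired_replicas(current, current_util, target_util,
                     min_replicas=2, max_replicas=10):
    """Sketch of the HPA calculation: scale the current replica count
    by the ratio of observed to target utilization, round up, and
    clamp to the minReplicas/maxReplicas bounds."""
    desired = ceil(current * current_util / target_util)
    return max(min_replicas, min(desired, max_replicas))

# 2 pods averaging 140% CPU against a 70% target -> 4 pods
print(desired_replicas(2, 140, 70))  # 4
# Demand would call for 15 pods, but maxReplicas caps it at 10
print(desired_replicas(5, 210, 70))  # 10
```

The clamping is why a too-low `maxReplicas` silently caps throughput during spikes: the math can call for more pods than the HPA is allowed to create.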
Step 4: Verify HPA status and configuration
Check that your HPA is active and collecting metrics:
```
kubectl get hpa my-app-hpa
```

View detailed HPA information:

```
kubectl describe hpa my-app-hpa
```

You should see current CPU utilization, target utilization, and current replica count. It may take a few minutes for metrics to appear.
Step 5: Configure memory-based scaling (optional)
Add memory metrics to your HPA for more comprehensive scaling:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
```

Apply the updated configuration:

```
kubectl apply -f hpa.yaml
```

Tip
Memory-based scaling helps prevent out-of-memory errors during traffic spikes.
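With multiple metrics configured, the HPA evaluates each metric independently and acts on the largest resulting replica count. A small illustration of that rule (the numbers here are hypothetical):

```python
from math import ceil

def desired_from_metrics(current, metrics):
    """Each (current_util, target_util) pair yields its own desired
    replica count; the HPA picks the maximum across all metrics."""
    return max(ceil(current * cur / tgt) for cur, tgt in metrics)

# CPU is comfortably under its 70% target, but memory at 120% of its
# 80% target drives the deployment from 2 to 3 replicas.
print(desired_from_metrics(2, [(50, 70), (120, 80)]))  # 3
```

In practice this means adding a memory metric can only make the HPA scale up more eagerly, never less.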
Step 6: Test the autoscaling behavior
Generate load on your application to test scaling. Use a load testing tool or create a temporary pod:
```
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh
```

Inside the pod, generate requests:

```
while true; do wget -q -O- http://my-app-service/; done
```

Monitor the HPA response:

```
kubectl get hpa my-app-hpa --watch
```

Tip
Scaling decisions typically take 15-30 seconds for scale-up and 5 minutes for scale-down to prevent flapping.
Step 7: Monitor and adjust scaling parameters
Use DigitalOcean's monitoring dashboard to track your cluster performance under Kubernetes > Your Cluster > Insights. Adjust HPA parameters based on observed behavior:
- Lower `averageUtilization` for more aggressive scaling
- Increase `maxReplicas` if you hit the limit during peak traffic
- Adjust `minReplicas` based on baseline traffic requirements

Update your HPA configuration and reapply:

```
kubectl apply -f hpa.yaml
```

Tip
Review scaling events regularly using kubectl describe hpa to optimize your thresholds.
Troubleshooting
HPA shows 'unknown' for current CPU utilization
Ensure the metrics server is running (`kubectl get pods -n kube-system | grep metrics-server`) and your deployment has CPU resource requests defined. Wait 2-3 minutes after deployment for metrics to populate.

Pods are not scaling up despite high CPU usage

Check if you've reached the `maxReplicas` limit and verify your DigitalOcean node pool has sufficient capacity. Use `kubectl describe nodes` to check resource availability and consider adding more nodes if needed.

Scaling happens too frequently (flapping)
Add stabilization windows to your HPA configuration:

```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 60
  scaleDown:
    stabilizationWindowSeconds: 300
```

This prevents rapid scaling changes.
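Beyond stabilization windows, `autoscaling/v2` also supports rate-limiting policies under the same `behavior` field. A sketch that removes at most one pod per minute on scale-down (the values here are illustrative, not recommendations):

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Pods        # remove at most...
      value: 1          # ...one pod...
      periodSeconds: 60 # ...per 60-second window
    selectPolicy: Min   # pick the policy allowing the smallest change
```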
HPA cannot scale below minimum replicas during low traffic
This is expected behavior. If you want fewer replicas during off-peak hours, adjust `minReplicas` in your HPA configuration for different time periods, or consider Vertical Pod Autoscaler (VPA) alongside HPA (avoid having both act on the same CPU or memory metrics).

Ready to get started with DigitalOcean?
Put this tutorial into practice. Visit DigitalOcean and follow the steps above.
Visit DigitalOcean →