# Kubernetes Auto-Scaling using External Custom Metrics
The Kubernetes Horizontal Pod Autoscaler (HPA) traditionally scales workloads based on CPU and memory usage. However, depending on their purpose, some applications are not well suited to scaling purely on those resources. This guide demonstrates how to scale your Kubernetes workloads based on custom metrics such as requests per second (RPS).
## Why Custom Metrics?
While CPU and memory metrics work well for many applications, some workloads benefit from scaling on application-specific metrics:
- Web applications: Scale on requests per second or response time
- Message queue consumers: Scale on queue depth
- API services: Scale on active connections or request latency
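As a preview of what such a configuration looks like, here is a hedged sketch of the `External` metric block an HPA might use to scale a queue consumer on backlog depth. It uses the Cloud Pub/Sub `num_undelivered_messages` metric; the subscription name is a placeholder:

```yaml
# Illustrative External metric block for a queue-consumer HPA.
# The subscription_id value is a placeholder for your own subscription.
- type: External
  external:
    metric:
      name: pubsub.googleapis.com|subscription|num_undelivered_messages
      selector:
        matchLabels:
          resource.labels.subscription_id: your-subscription
    target:
      type: AverageValue
      averageValue: "100"   # aim for ~100 undelivered messages per consumer pod
```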
## Architecture Overview
In this guide, we'll use:
- Nginx Ingress Controller to expose metrics
- Prometheus-to-sd sidecar to export metrics to Stackdriver
- Stackdriver adapter to make metrics available to Kubernetes HPA
- Horizontal Pod Autoscaler to scale based on the custom metric
## Prerequisites

Before starting, ensure you have:

- A Kubernetes cluster (GKE recommended)
- The `kubectl` command-line tool configured
- Helm installed (the commands below use Helm 3 syntax)
- The Stackdriver adapter deployed (covered in Step 1)
## Step 1: Deploy Stackdriver Adapter

The Stackdriver adapter enables Kubernetes to use custom metrics from Google Cloud Monitoring (formerly Stackdriver):

```shell
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml
```

Verify the adapter is running:

```shell
kubectl get pods -n custom-metrics
```
## Step 2: Configure Nginx Ingress Controller

Create a values file for the Nginx Ingress Controller that includes the Prometheus-to-sd sidecar container. This sidecar exports metrics from Nginx to Stackdriver:

```yaml
# nginx-values.yaml
controller:
  extraContainers:
    - name: prometheus-to-sd
      image: gcr.io/google-containers/prometheus-to-sd:v0.9.0
      command:
        - /monitor
        - --stackdriver-prefix=custom.googleapis.com
        - --source=nginx-ingress-controller:http://localhost:10254/metrics
        - --pod-id=$(POD_NAME)
        - --namespace-id=$(POD_NAMESPACE)
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
  metrics:
    enabled: true
    service:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "10254"
```
## Step 3: Deploy Nginx Ingress Controller

Install the Nginx Ingress Controller using Helm with the custom values:

```shell
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install nginx-ingress ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --values nginx-values.yaml
```

Verify the deployment:

```shell
kubectl get pods -n ingress-nginx
kubectl get svc -n ingress-nginx
```
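The RPS metric only carries data once traffic actually flows through the ingress controller to your application. As a hedged illustration (the Service name `your-app-service`, port, and host are placeholders for your own app), an Ingress routing traffic through the controller might look like:

```yaml
# example-ingress.yaml -- illustrative sketch; service name, port, and host
# are placeholders and must match your own application's Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: your-app-ingress
  namespace: default
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: your-app-service
                port:
                  number: 80
```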
## Step 4: Configure Horizontal Pod Autoscaler

Now create an HPA that scales based on requests per second. This example adds replicas when the average request rate per pod exceeds 1000:

```yaml
# hpa-custom-metrics.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: custom.googleapis.com|nginx-ingress-controller|nginx_ingress_controller_requests
          selector:
            matchLabels:
              resource.type: k8s_pod
              resource.labels.namespace_name: ingress-nginx
        target:
          type: AverageValue
          averageValue: "1000"
```
Apply the HPA:

```shell
kubectl apply -f hpa-custom-metrics.yaml
```

## Step 5: Verify the Setup

Check that the HPA can read the custom metrics:

```shell
kubectl get hpa app-hpa
kubectl describe hpa app-hpa
```

You should see output showing the current metric value and target:

```
NAME      REFERENCE             TARGETS    MINPODS   MAXPODS   REPLICAS
app-hpa   Deployment/your-app   850/1000   2         10        2
```
## Monitoring and Troubleshooting

### View Available Custom Metrics

List all custom metrics available in your cluster:

```shell
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq .
```
### Check Nginx Metrics

Access the metrics endpoint of the Nginx Ingress Controller (with the Helm release name used above, the controller deployment is named `nginx-ingress-ingress-nginx-controller`):

```shell
kubectl port-forward -n ingress-nginx deployment/nginx-ingress-ingress-nginx-controller 10254:10254
curl http://localhost:10254/metrics
```
### Debug HPA Issues

If the HPA shows "unknown" for metrics:

- Verify the Stackdriver adapter is running
- Check that metrics are being exported to Stackdriver
- Ensure the metric name matches exactly in your HPA configuration
- Because the HPA above uses an `External` metric, query `/apis/external.metrics.k8s.io/v1beta1` rather than the custom metrics endpoint when checking metric availability
- Review the adapter logs:

```shell
kubectl logs -n custom-metrics deployment/custom-metrics-stackdriver-adapter
```
## Best Practices

- Set appropriate thresholds: Test your application to determine realistic scaling thresholds
- Use stabilization windows: Configure `behavior` in the HPA to prevent flapping
- Monitor costs: Scaling on custom metrics can increase GCP monitoring costs
- Combine metrics: Consider using multiple metrics (CPU + RPS) for more robust scaling
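The last two recommendations can be expressed in the HPA spec itself. Here is a hedged sketch (metric names and thresholds are illustrative) of an `autoscaling/v2` spec fragment that combines a CPU target with the RPS metric and adds a scale-down stabilization window; the HPA scales on whichever metric demands more replicas:

```yaml
# Illustrative fragment of an autoscaling/v2 HPA spec; thresholds are examples.
spec:
  metrics:
    # The HPA computes a desired replica count per metric and uses the highest.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: External
      external:
        metric:
          name: custom.googleapis.com|nginx-ingress-controller|nginx_ingress_controller_requests
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    scaleDown:
      # Require 5 minutes of sustained low load before removing replicas,
      # which dampens flapping when traffic oscillates around the threshold.
      stabilizationWindowSeconds: 300
```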
## Conclusion
Custom metrics-based auto-scaling provides more precise control over your Kubernetes deployments. By scaling on application-specific metrics like requests per second, you can ensure your services maintain performance under varying loads while optimizing resource usage.
This approach is particularly valuable for:
- High-traffic web applications
- API services with variable load patterns
- Applications where CPU/memory doesn't correlate with actual load
Remember to monitor your applications and adjust thresholds as needed to find the optimal scaling configuration for your workload.