Kubernetes Horizontal Pod Autoscaler and resource management
# HPA v2 with multiple metrics
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: 1000
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
        - type: Pods
          value: 4
          periodSeconds: 60
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
---
# Pod Disruption Budget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web-app
---
# Resource Quota for namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
    services: "20"
    persistentvolumeclaims: "30"
---
# LimitRange for defaults
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 256Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: "2"
        memory: 2Gi
      min:
        cpu: 50m
        memory: 64Mi
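The manifests above cover the HPA, PDB, quota, and LimitRange, but not the Vertical Pod Autoscaler. A minimal sketch of a VPA object for the same Deployment might look like the following. It assumes the VPA add-on (its CRDs and controllers) is installed, since VPA is not part of core Kubernetes. The name web-app-vpa and the minAllowed/maxAllowed bounds are illustrative and simply mirror the LimitRange.

```yaml
# Illustrative only: requires the separately installed VPA add-on.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa          # hypothetical name
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Off"        # compute recommendations only; never evict Pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"   # apply to every container in the Pod
        minAllowed:          # assumed bounds, copied from the LimitRange
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: "2"
          memory: 2Gi
```

The "Off" mode is chosen here because the HPA already scales on CPU and memory utilization, and the VPA project advises against letting both autoscalers act on the same resource metric. In this mode the VPA only publishes recommendations, which can be read from the object's status.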
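The Horizontal Pod Autoscaler (HPA) automatically scales the number of Pod replicas based on observed metrics. CPU and memory usage come from the resource metrics API, which is typically served by metrics-server. Custom per-Pod metrics such as http_requests_per_second come from the custom metrics API, which needs a separate adapter (for example, the Prometheus Adapter). The minReplicas and maxReplicas fields set the scaling bounds. In autoscaling/v1 the only target is targetCPUUtilizationPercentage; in autoscaling/v2 each entry under metrics carries its own target, such as averageUtilization: 70 for CPU. Utilization targets are measured as a percentage of the containers' resource requests, so the target Pods must declare requests. When several metrics are listed, the HPA computes a desired replica count for each and uses the largest.

The behavior field limits scale-up and scale-down rates to prevent flapping. In the manifest above, scale-up may add up to 100% of the current replicas or 4 Pods per 60 seconds, whichever is greater, while scale-down removes at most 10% of replicas per 60 seconds and acts on the highest recommendation from the last 300 seconds.

The Vertical Pod Autoscaler (VPA), a separate add-on, adjusts container resource requests automatically and by default keeps limits proportional to them. It should not act on the same CPU or memory metric as an HPA. Pod Disruption Budgets (PDBs) ensure minimum availability during voluntary disruptions such as node drains; here at least 2 web-app Pods must remain available. Resource quotas cap aggregate resource consumption in a namespace. LimitRanges set default requests and limits, plus the minimum and maximum allowed, per container.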
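The scaling threshold described above follows the replica formula given in the Kubernetes HPA documentation. As a worked example, take the 70% CPU target from the manifest together with two assumed values that do not come from this document: 5 current replicas and an observed average CPU utilization of 90%.

```latex
\[
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
  = \left\lceil 5 \times \frac{90}{70} \right\rceil
  = \lceil 6.43 \rceil
  = 7
\]
```

The controller skips scaling when the ratio of current to desired metric value is within a tolerance of 1.0 (0.1 by default), so small fluctuations around the target cause no change. The result is then clamped to the minReplicas and maxReplicas bounds and rate-limited by the behavior policies.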