How Auto-Scaling Works on Stackryze
A deep dive into how Stackryze automatically scales your applications based on traffic, and how to configure scaling policies.
Alex Kim
Stackryze Team
One of the most powerful features of Stackryze is automatic scaling. Your application scales up when traffic increases and scales down when it drops, all without manual intervention.
How It Works
Stackryze monitors three key metrics to make scaling decisions:
- CPU utilization: target 70%, averaged across instances
- Memory usage: target 80%
- Request queue depth: target under 100 ms of queue wait time
When any metric stays above its threshold for 60 seconds, we spin up additional containers. When metrics drop back down, we scale down gradually, with a 5-minute cooldown to prevent flapping.
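To make the timing rules concrete, here is a minimal Python sketch of the decision loop described above. It is not Stackryze's actual implementation: the `Sample` and `Autoscaler` names are hypothetical, and it assumes one instance is added or removed per step and that the cooldown is a quiet period required before each scale-down step. Only the thresholds, the 60-second window, and the 5-minute cooldown come from this article.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values quoted in the article.
CPU_TARGET = 70.0             # percent, averaged across instances
MEMORY_TARGET = 80.0          # percent
QUEUE_WAIT_TARGET_MS = 100.0  # milliseconds of queue wait
BREACH_WINDOW_S = 60.0        # a breach must last this long before scaling up
COOLDOWN_S = 300.0            # quiet time required before scaling down


@dataclass
class Sample:
    """One metrics reading (hypothetical shape, for illustration only)."""
    t: float              # seconds since monitoring started
    cpu: float            # percent
    memory: float         # percent
    queue_wait_ms: float  # milliseconds


def breached(s: Sample) -> bool:
    """True when any one of the three metrics is over its threshold."""
    return (
        s.cpu > CPU_TARGET
        or s.memory > MEMORY_TARGET
        or s.queue_wait_ms > QUEUE_WAIT_TARGET_MS
    )


class Autoscaler:
    def __init__(self, instances: int = 1, min_instances: int = 1,
                 max_instances: int = 20) -> None:
        self.instances = instances
        self.min_instances = min_instances
        self.max_instances = max_instances
        self.breach_started: Optional[float] = None
        self.calm_started: Optional[float] = None

    def observe(self, s: Sample) -> int:
        """Feed one sample and return the resulting instance count."""
        if breached(s):
            self.calm_started = None
            if self.breach_started is None:
                self.breach_started = s.t
            if s.t - self.breach_started >= BREACH_WINDOW_S:
                # Assumption: add one instance per sustained breach.
                self.instances = min(self.instances + 1, self.max_instances)
                self.breach_started = s.t  # the next step needs a fresh window
        else:
            self.breach_started = None
            if self.calm_started is None:
                self.calm_started = s.t
            if s.t - self.calm_started >= COOLDOWN_S:
                # Assumption: remove one instance per cooldown period.
                self.instances = max(self.instances - 1, self.min_instances)
                self.calm_started = s.t
        return self.instances


def simulate(samples: List[Sample]) -> List[int]:
    """Run a fresh autoscaler over the samples, returning the count after each."""
    scaler = Autoscaler()
    return [scaler.observe(s) for s in samples]
```

Because a single reading below the thresholds resets the breach timer, a short traffic spike never triggers a scale-up in this sketch. Likewise, a renewed breach resets the cooldown timer, which is what keeps the instance count from flapping.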
Configuration
You can customize scaling behavior in your stackryze.yaml:
scaling:
  min_instances: 1
  max_instances: 20
  target_cpu: 70
  target_memory: 80
  cooldown_seconds: 300
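If you generate or lint this file in your own tooling, a small sanity check can catch mistakes before you deploy. The Python sketch below assumes the YAML has already been parsed into a dictionary. The `ScalingPolicy` class, the `policy_from_config` helper, and the validation rules are illustrative assumptions, not Stackryze's documented behavior. Only the key names come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical in-memory form of the `scaling:` block."""
    min_instances: int
    max_instances: int
    target_cpu: int
    target_memory: int
    cooldown_seconds: int

    def __post_init__(self) -> None:
        # These rules are assumptions about what a sensible policy looks like.
        if self.min_instances < 0:
            raise ValueError("min_instances must be 0 or greater")
        if self.max_instances < max(self.min_instances, 1):
            raise ValueError("max_instances must be >= min_instances and >= 1")
        for name in ("target_cpu", "target_memory"):
            value = getattr(self, name)
            if not 0 < value <= 100:
                raise ValueError(f"{name} must be a percentage in (0, 100]")
        if self.cooldown_seconds < 0:
            raise ValueError("cooldown_seconds must be 0 or greater")


def policy_from_config(config: dict) -> ScalingPolicy:
    """Build a policy from an already-parsed stackryze.yaml dictionary."""
    return ScalingPolicy(**config["scaling"])
```

Feeding it the values from the example yields a policy object, while swapped bounds such as `min_instances: 5` with `max_instances: 2` raise a `ValueError`.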
Scale to Zero
On the Starter plan, applications scale to zero after 15 minutes of inactivity. On Pro and Enterprise, you can configure minimum instances to keep your app warm.
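The plan rules above can be sketched as a single function that returns the lowest instance count an app may drop to. This is an illustration under stated assumptions: the lowercase plan identifiers and the `instance_floor` name are hypothetical, and the sketch assumes an active Starter app keeps at least one instance. Only the 15-minute idle timeout and the configurable minimum on Pro and Enterprise come from this article.

```python
IDLE_TIMEOUT_S = 15 * 60  # 15 minutes of inactivity, as stated in the article


def instance_floor(plan: str, idle_seconds: float, min_instances: int = 1) -> int:
    """Return the lowest instance count the app may scale down to.

    `plan` uses hypothetical identifiers: "starter", "pro", "enterprise".
    """
    if plan == "starter":
        # Assumption: an active Starter app keeps one instance until it goes idle.
        return 0 if idle_seconds >= IDLE_TIMEOUT_S else 1
    if plan in ("pro", "enterprise"):
        # The configured minimum keeps the app warm regardless of idle time.
        return min_instances
    raise ValueError(f"unknown plan: {plan!r}")
```

For example, `instance_floor("starter", 20 * 60)` returns `0`, while `instance_floor("pro", 20 * 60, min_instances=2)` returns `2`.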
Pricing
You only pay for compute time. When your app scales down, your bill decreases proportionally. Check our pricing page for details.
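As a rough illustration of what "proportionally" means, the Python sketch below totals instance-hours over a day and multiplies by a rate. The per-instance-hour billing unit and the rate are placeholders invented for the example, not Stackryze prices, and the function names are hypothetical.

```python
from typing import List, Tuple


def instance_hours(timeline: List[Tuple[float, int]]) -> float:
    """Sum compute usage from (duration_in_seconds, instance_count) segments."""
    return sum(duration * count for duration, count in timeline) / 3600


def compute_cost(timeline: List[Tuple[float, int]],
                 rate_per_instance_hour: float) -> float:
    """Cost under the assumption of linear, per-instance-hour billing."""
    return instance_hours(timeline) * rate_per_instance_hour


# One busy hour at 4 instances, then 23 quiet hours at 1 instance.
day = [(3600, 4), (23 * 3600, 1)]
used = instance_hours(day)            # 27 instance-hours
always_peak = instance_hours([(24 * 3600, 4)])  # 96 instance-hours
```

In this example, scaling down after the busy hour uses 27 instance-hours instead of the 96 that four always-on instances would consume, so under linear billing the cost falls by the same ratio whatever the actual rate is.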