May 5, 2026

How Auto-Scaling Works on Stackryze

A deep dive into how Stackryze automatically scales your applications based on traffic, and how to configure scaling policies.

Alex Kim

Stackryze Team

One of the most powerful features of Stackryze is automatic scaling. Your application scales up when traffic increases and scales down when it drops, all without manual intervention.

How It Works

Stackryze monitors three key metrics to make scaling decisions:

  1. CPU utilization: target of 70% average across instances
  2. Memory usage: target of 80%
  3. Request queue depth: target of less than 100 ms queue wait time

When any metric stays above its threshold for 60 seconds, we spin up additional containers. When metrics drop back below their thresholds, we scale down gradually, with a 5-minute cooldown to prevent flapping (scaling up and down in rapid succession).
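
The decision rule above can be sketched in a few lines of Python. This is an illustration only, not Stackryze's actual implementation: the `Metrics` and `Autoscaler` names are hypothetical, and the sketch assumes that scaling moves one instance at a time and that the cooldown is a wait between scale-down steps. Only the thresholds, the 60-second window, and the 5-minute cooldown come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metrics:
    cpu_percent: float      # average CPU utilization across instances
    memory_percent: float   # memory usage
    queue_wait_ms: float    # time requests spend waiting in the queue


# Thresholds and timings taken from the text above.
CPU_TARGET = 70.0
MEMORY_TARGET = 80.0
QUEUE_WAIT_TARGET_MS = 100.0
BREACH_WINDOW_S = 60
COOLDOWN_S = 300


def over_threshold(m: Metrics) -> bool:
    """True if any one of the three metrics exceeds its target."""
    return (
        m.cpu_percent > CPU_TARGET
        or m.memory_percent > MEMORY_TARGET
        or m.queue_wait_ms > QUEUE_WAIT_TARGET_MS
    )


class Autoscaler:
    """Hypothetical decision loop; call tick() once per metrics sample."""

    def __init__(self, instances: int = 1, min_instances: int = 1,
                 max_instances: int = 20) -> None:
        self.instances = instances
        self.min_instances = min_instances
        self.max_instances = max_instances
        self.breach_started: Optional[float] = None  # start of current breach
        self.calm_started: Optional[float] = None    # start of current calm period

    def tick(self, now: float, m: Metrics) -> int:
        """Record one sample taken at time `now` (seconds); return instance count."""
        if over_threshold(m):
            self.calm_started = None
            if self.breach_started is None:
                self.breach_started = now
            # Scale up only once the breach has lasted the full window.
            if now - self.breach_started >= BREACH_WINDOW_S:
                self.instances = min(self.instances + 1, self.max_instances)
                self.breach_started = now  # assumption: one step per window
        else:
            self.breach_started = None
            if self.calm_started is None:
                self.calm_started = now
            # Assumption: remove one instance per cooldown period.
            if now - self.calm_started >= COOLDOWN_S:
                self.instances = max(self.instances - 1, self.min_instances)
                self.calm_started = now
        return self.instances
```

Feeding the sketch a CPU reading of 90% at t=0, t=30, and t=60 leaves the count unchanged for the first two samples and adds an instance on the third, because the breach has then lasted a full 60 seconds.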

Configuration

You can customize scaling behavior in your stackryze.yaml:

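scaling:
  min_instances: 1        # fewest instances kept running
  max_instances: 20       # upper limit when scaling up
  target_cpu: 70          # percent, averaged across instances
  target_memory: 80       # percent
  cooldown_seconds: 300   # 5 minutes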
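
If you generate or check this file in your own tooling, a small validation step can catch mistakes before deploy. The sketch below is an assumption-heavy illustration: `ScalingConfig` and `load_scaling` are hypothetical names, the defaults simply repeat the example values rather than any documented Stackryze defaults, and the range checks are guesses at sensible limits. It takes an already-parsed mapping so that it needs no YAML library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingConfig:
    # Defaults repeat the example above; they are not documented defaults.
    min_instances: int = 1
    max_instances: int = 20
    target_cpu: int = 70          # percent
    target_memory: int = 80       # percent
    cooldown_seconds: int = 300

    def __post_init__(self) -> None:
        # These checks are assumptions about sensible values, not platform rules.
        if self.min_instances < 0:
            raise ValueError("min_instances must be >= 0")
        if self.max_instances < max(self.min_instances, 1):
            raise ValueError("max_instances must be >= min_instances and >= 1")
        for name in ("target_cpu", "target_memory"):
            value = getattr(self, name)
            if not 0 < value <= 100:
                raise ValueError(f"{name} must be a percentage in (0, 100]")
        if self.cooldown_seconds < 0:
            raise ValueError("cooldown_seconds must be >= 0")


def load_scaling(raw: dict) -> ScalingConfig:
    """Build a ScalingConfig from the parsed contents of stackryze.yaml."""
    return ScalingConfig(**raw.get("scaling", {}))


# The dict below mirrors the YAML example after parsing.
config = load_scaling({
    "scaling": {
        "min_instances": 1,
        "max_instances": 20,
        "target_cpu": 70,
        "target_memory": 80,
        "cooldown_seconds": 300,
    }
})
```

A misspelled key such as `max_instance` raises a `TypeError` here, and an inverted range (a minimum above the maximum) raises a `ValueError`.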

Scale to Zero

On the Starter plan, applications scale to zero after 15 minutes of inactivity. On Pro and Enterprise, you can configure minimum instances to keep your app warm.
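
The plan rules above can be expressed as a small function. This is a sketch under stated assumptions: the plan identifiers and the function name are hypothetical, and it assumes that the Starter idle timeout always applies while Pro and Enterprise never drop below the configured minimum. Only the 15-minute figure comes from the text.

```python
IDLE_TIMEOUT_S = 15 * 60  # 15 minutes of inactivity, from the text above


def desired_after_idle(plan: str, current: int, idle_seconds: float,
                       configured_min: int = 1) -> int:
    """Instance count after applying the idle rule for the given plan.

    Plan names are hypothetical identifiers, not confirmed platform values.
    """
    if plan == "starter":
        # Starter apps drop to zero once the idle timeout has elapsed.
        return 0 if idle_seconds >= IDLE_TIMEOUT_S else current
    if plan in ("pro", "enterprise"):
        # Paid plans keep at least the configured minimum warm.
        return max(current, configured_min)
    raise ValueError(f"unknown plan: {plan!r}")
```

With these assumptions, a Starter app idle for 14 minutes keeps its instance, and the same app idle for 15 minutes goes to zero.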

Pricing

You pay only for the compute time you use. When your app scales down, your bill decreases proportionally. Check our pricing page for details.
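
To see what "proportionally" means in practice, here is a rough arithmetic sketch. It assumes billing is a flat rate per instance-hour, which this post does not confirm, and the rate used is a made-up number for illustration, not a Stackryze price.

```python
def compute_cost(usage: list[tuple[int, float]], rate_per_instance_hour: float) -> float:
    """Cost for a list of (instance_count, hours) periods at a flat rate."""
    instance_hours = sum(count * hours for count, hours in usage)
    return instance_hours * rate_per_instance_hour


# Made-up price per instance-hour; NOT an actual Stackryze rate.
HYPOTHETICAL_RATE = 0.05

# One day: 2 hours at 4 instances, 10 hours at 1 instance, 12 hours scaled to zero.
day = [(4, 2.0), (1, 10.0), (0, 12.0)]
cost = compute_cost(day, HYPOTHETICAL_RATE)
```

The example day adds up to 18 instance-hours, compared with 96 if four instances had run around the clock, so under this assumed model the bill would be under a fifth of the always-on cost.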