Hi, we have tried the beta of the Autoscaling feature. In its current form it is not really suitable for our production workloads. Here are some thoughts on how to make it better:
1. Define separate scaling steps
At the moment the scaling step is always 1 tier (for example M10 -> M20). That is not really suitable for burst loads, where going one step up might not be enough. The same goes for rapid scaling down.
Example:
Scale range = M10 - M50
Scale-up step = 4 (M10 -> M50)
Scale-down step = 2 (M50 -> M30 -> M10)
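To make the example concrete, here is a rough Python sketch of how configurable steps could behave. The tier ladder and the function names (`next_tier`, `scale_path`) are made up for illustration and are not part of any existing Atlas API. The ladder assumes the tiers in our range are M10, M20, M30, M40 and M50.

```python
# Hypothetical tier ladder, taken from the M10 - M50 range in the example above.
TIERS = ["M10", "M20", "M30", "M40", "M50"]


def next_tier(current, step, low="M10", high="M50"):
    """Move `step` tiers (positive = up, negative = down), clamped to [low, high]."""
    lo, hi = TIERS.index(low), TIERS.index(high)
    target = TIERS.index(current) + step
    return TIERS[max(lo, min(hi, target))]


def scale_path(start, step, low="M10", high="M50"):
    """List every tier visited when the same step is applied until a bound is hit."""
    path = [start]
    while True:
        nxt = next_tier(path[-1], step, low, high)
        if nxt == path[-1]:  # reached a bound (or step is 0), nothing left to do
            return path
        path.append(nxt)


print(scale_path("M10", 4))   # scale-up step of 4
print(scale_path("M50", -2))  # scale-down step of 2
print(scale_path("M10", 1))   # behaviour today: one tier at a time
```

With a step of 4 a burst gets from M10 to M50 in a single scaling event instead of four.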
2. Define custom timescale
It seems that the current setup starts scaling down after 72h and then repeats every 24h.
Our system can be scaled down much more rapidly. When our burst load goes away, it is gone for a few days, so we know we can start scaling down after 12h and repeat every 6h.
With the current setup it takes 6 days (72h + 3 x 24h = 144h) to scale down from M50 to M10.
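The arithmetic behind these numbers, as a small Python sketch. `hours_to_scale_down` is a made-up helper, and it assumes the first scale-down event happens after the initial delay and every later event after the repeat interval, which is how we understand the current behaviour.

```python
def hours_to_scale_down(tiers_to_drop, first_delay_h, repeat_h, step=1):
    """Hours until the cluster has dropped `tiers_to_drop` tiers.

    Assumes the first scale-down fires after `first_delay_h` and each
    following one after another `repeat_h`, dropping `step` tiers per event.
    """
    events = -(-tiers_to_drop // step)  # ceiling division
    if events == 0:
        return 0
    return first_delay_h + (events - 1) * repeat_h


# M50 -> M10 is 4 tiers down.
print(hours_to_scale_down(4, first_delay_h=72, repeat_h=24))          # current setup
print(hours_to_scale_down(4, first_delay_h=12, repeat_h=6))           # our proposed timescale
print(hours_to_scale_down(4, first_delay_h=12, repeat_h=6, step=2))   # plus a scale-down step of 2
```

That is 144h (6 days) today versus 30h with the proposed timescale, or 18h when combined with a scale-down step of 2 from point 1.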
3. Define custom scaling metrics and thresholds
It seems that the current system does not use the number of connected clients as a scaling metric.
- When connecting from cloud functions it is easy to have a lot of connections that are not draining CPU or memory. When we are scaled down to M10, the connection limit is only 200.
- Additionally, we would like to scale up when CPU usage is >50%, which is not possible at the moment.
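Roughly the kind of rule we have in mind, sketched in Python. Everything here is hypothetical (`Metrics`, `should_scale_up`, the parameter names). The 50% CPU threshold and the 200-connection limit come from the points above, while the 80% connection ratio is only a placeholder value we picked for the example.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    cpu_percent: float  # current CPU usage, 0-100
    connections: int    # currently open client connections


def should_scale_up(m, cpu_threshold=50.0, connection_limit=200, connection_ratio=0.8):
    """Scale up when CPU is above the threshold OR connections are close to the tier limit.

    connection_ratio=0.8 is an assumed placeholder, not a recommended value.
    """
    cpu_hot = m.cpu_percent > cpu_threshold
    connections_hot = m.connections >= connection_limit * connection_ratio
    return cpu_hot or connections_hot


# Many idle connections from cloud functions, almost no CPU load:
print(should_scale_up(Metrics(cpu_percent=10.0, connections=180)))
# CPU above our custom 50% threshold:
print(should_scale_up(Metrics(cpu_percent=55.0, connections=10)))
```

The first case is the one that hurts us today: CPU and memory look fine, so nothing scales, but we are about to run out of connections.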
Nice to have:
4. Time-based scaling events
Scale up/down at a specified time and day. This would be useful for scaling up DEV/Research environments during working hours.
PS: writing a long post in this form is terrible