Autoscale
This document describes the Unikraft Cloud Services Autoscale API (v1) for configuring and monitoring service autoscale.
Autoscale Basics
Services allow you to load balance traffic for an Internet-facing service such as a web server by creating multiple instances within the same service. While you can add instances to or remove instances from a service to scale it, doing this manually makes it hard to react to changes in service load. On the other hand, always keeping a large number of instances running to cope with bursts is not an option either. This is where autoscale comes into play. With autoscale enabled, Unikraft Cloud does the heavy lifting of constantly monitoring the load of your service and automatically creates or deletes instances as needed.
To enable autoscale, a typical workflow looks like this:
- Create a new service with the desired properties (e.g., published ports, DNS name).
- Create a new instance of your application and assign it to the service. This instance becomes the autoscale master, which Unikraft Cloud clones to scale your service.
- Create an autoscale configuration for the service and set the instance as master. The configuration allows you to define the metrics and policies based on which Unikraft Cloud performs autoscale. It also specifies the desired minimum and maximum number of instances as well as warmup and cooldown periods.
Warmup and Cooldown
When Unikraft Cloud decides to scale out your service, it grants new instances a grace period in which they have time to complete booting, warm up caches, and start having an effect on the load level. Only after this warmup phase do new instances contribute to the evaluation of the autoscale metric. This lets the effects of the new instances on the service stabilize and prevents excessive scale out. Note that new instances already receive traffic and serve load while they are still warming up.
Conversely, Unikraft Cloud uses a cooldown phase to control scale in. During this phase, instances selected for scale in are given a chance to drain existing connections while already being excluded from the number of active instances in the service. New connections are assigned to different instances. The same applies to new HTTP requests on existing connections, but only if the http connection handler has been set. If there are still open connections after the cooldown phase, the remaining connections are forcefully closed.
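The timing rules of both phases can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Unikraft Cloud's implementation: the `Instance` record, its fields, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Hypothetical bookkeeping record, not an object of the Unikraft Cloud API."""
    name: str
    started_at_ms: int
    draining_since_ms: Optional[int] = None  # set when selected for scale in


def in_warmup(inst: Instance, now_ms: int, warmup_time_ms: int) -> bool:
    """True while the instance is still inside its warmup phase."""
    return now_ms - inst.started_at_ms < warmup_time_ms


def counts_as_active(inst: Instance) -> bool:
    """Instances selected for scale in no longer count as active."""
    return inst.draining_since_ms is None


def cooldown_elapsed(inst: Instance, now_ms: int, cooldown_time_ms: int) -> bool:
    """True once remaining connections would be forcefully closed."""
    if inst.draining_since_ms is None:
        return False
    return now_ms - inst.draining_since_ms >= cooldown_time_ms


fresh = Instance("web-2", started_at_ms=9_500)
draining = Instance("web-1", started_at_ms=0, draining_since_ms=8_000)
print(in_warmup(fresh, now_ms=10_000, warmup_time_ms=1_000))              # True
print(counts_as_active(draining))                                         # False
print(cooldown_elapsed(draining, now_ms=10_000, cooldown_time_ms=1_000))  # True
```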
Autoscale Policies
With autoscale policies you define under which circumstances Unikraft Cloud should scale your service and what metric (e.g., CPU utilization) should be used for the decision.
Unikraft Cloud currently supports the following autoscale policies:
Policy | Description |
---|---|
Step | Defines concrete adjustments for selected value ranges in the metric |
An autoscale configuration usually comprises multiple policies, for example, to control scale in and scale out in separate policies. When Unikraft Cloud performs autoscale decisions it always evaluates all policies and does not stop at the first applicable policy. If none of the policies apply, Unikraft Cloud maintains the current number of instances.
Step Policy
A step policy consists of a set of steps that define
- a lower bound,
- an upper bound,
- and an adjustment.
The lower and upper bounds are always in the dimension of the selected metric.
A bound can be set to null (or not provided at all) to make the step unbounded in the respective direction.
The interpretation of the adjustment depends on how the step policy is configured.
Positive values increase the number of instances, negative values decrease the number of instances.
Adjustment type | Description |
---|---|
change | Relative change in the number of instances (e.g., +2 instances) |
exact | Absolute target number of instances in the service (e.g., 10 instances) |
percent | Change by percentage of the current number of instances in the service (e.g., +50%) |
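As an example, consider a step policy for scaling out that uses the cpu metric and the percent adjustment type with two steps: one from 500 to 700 millicores with an adjustment of 50, and one from 700 millicores upwards with an adjustment of 100.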
If the CPU utilization per instance is below 500 millicores, no scaling action happens. If the CPU utilization is between 500 and 700 millicores, the policy instructs Unikraft Cloud to increase the number of instances by 50%. If the CPU utilization exceeds 700 millicores, the number of instances is doubled. Thus, if the per-instance CPU load is at 600 millicores (i.e., 60%) and the current number of instances in the service is 4, Unikraft Cloud will create 2 additional instances.
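The arithmetic of this example can be reproduced with a short Python sketch. It is not the service's implementation: the function names are hypothetical, and two details that the text does not specify are assumed here, namely that the lower bound is inclusive and the upper bound exclusive, and that percentage results are rounded down.

```python
from typing import Optional

# The example policy described above, written as plain dictionaries.
EXAMPLE_STEPS = [
    {"lower_bound": 500, "upper_bound": 700, "adjustment": 50},
    {"lower_bound": 700, "adjustment": 100},  # no upper bound: unbounded above
]


def find_adjustment(steps: list, metric_value: int) -> Optional[int]:
    """Return the adjustment of the step containing metric_value, if any."""
    for step in steps:
        lower = step.get("lower_bound")
        upper = step.get("upper_bound")
        # Assumption: [lower, upper) interval semantics.
        if (lower is None or metric_value >= lower) and (
            upper is None or metric_value < upper
        ):
            return step["adjustment"]
    return None  # no step applies, so no scaling action


def apply_adjustment(current: int, adjustment: int, adjustment_type: str) -> int:
    """Compute the new instance count for the three adjustment types."""
    if adjustment_type == "change":
        return current + adjustment
    if adjustment_type == "exact":
        return adjustment
    if adjustment_type == "percent":
        # Assumption: floor division; the real rounding rule is not documented here.
        return current + current * adjustment // 100
    raise ValueError(f"unknown adjustment type: {adjustment_type}")


adjustment = find_adjustment(EXAMPLE_STEPS, 600)
print(adjustment)                                  # 50
print(apply_adjustment(4, adjustment, "percent"))  # 6
```

With 4 instances at 600 millicores the sketch yields 6 instances, that is, 2 additional instances, matching the example. Clamping the result to min_size and max_size is left out.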
Steps of the same policy must adhere to the following rules:
- The lower bound must be smaller than the upper bound
- The lower and upper bound cannot both be null in the same step
- Steps must not overlap
- Steps must be sorted in ascending order
- There must be no gaps between individual steps
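A client can check these rules before submitting a policy. The following Python sketch is one possible check, not the validation performed by Unikraft Cloud: `validate_steps` is a hypothetical helper, and it assumes that adjacent steps share their boundary value (the upper bound of one step equals the lower bound of the next).

```python
def validate_steps(steps: list) -> list:
    """Return a list of rule violations; an empty list means the steps are valid."""
    errors = []
    for i, step in enumerate(steps):
        lower = step.get("lower_bound")
        upper = step.get("upper_bound")
        if lower is None and upper is None:
            errors.append(f"step {i}: lower and upper bound are both null")
        elif lower is not None and upper is not None and lower >= upper:
            errors.append(f"step {i}: lower bound must be smaller than upper bound")

    for i in range(1, len(steps)):
        prev_upper = steps[i - 1].get("upper_bound")
        lower = steps[i].get("lower_bound")
        if prev_upper is None or lower is None:
            # An unbounded inner edge necessarily overlaps its neighbour.
            errors.append(f"steps {i - 1} and {i}: unbounded edge between steps")
        elif lower < prev_upper:
            errors.append(f"steps {i - 1} and {i}: overlap or wrong order")
        elif lower > prev_upper:
            errors.append(f"steps {i - 1} and {i}: gap between steps")
    return errors


valid = [
    {"lower_bound": 500, "upper_bound": 700, "adjustment": 50},
    {"lower_bound": 700, "adjustment": 100},
]
gapped = [
    {"upper_bound": 200, "adjustment": -1},
    {"lower_bound": 300, "upper_bound": 400, "adjustment": 0},
]
print(validate_steps(valid))   # []
print(validate_steps(gapped))  # ['steps 0 and 1: gap between steps']
```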
Autoscale Metrics
You can base autoscale decisions on different metrics. Currently, Unikraft Cloud supports the following metrics:
Metric | Description |
---|---|
cpu | Per-instance CPU utilization measured in millicores (e.g., 100 millicores corresponds to 10% CPU utilization) |
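As a quick illustration of the unit, the conversion between millicores and percent is a division by 10, since 1000 millicores correspond to 100% CPU utilization. The helper name below is hypothetical.

```python
def millicores_to_percent(millicores: int) -> float:
    """Convert a millicore value to percent CPU utilization (1000 millicores = 100%)."""
    return millicores / 10


print(millicores_to_percent(100))  # 10.0
print(millicores_to_percent(600))  # 60.0
```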
API Endpoints
The Unikraft Cloud Services Autoscale API provides the following endpoints:
Method | Endpoint | Purpose and Description |
---|---|---|
POST | /v1/services/autoscale | Creates an autoscale configuration for one or more services |
GET | /v1/services/autoscale | Returns the current autoscale configuration of services |
DELETE | /v1/services/autoscale | Deletes the autoscale configuration for the specified services |
POST | /v1/services/<UUID>/autoscale/policies | Adds one or more autoscale policies to the given service |
GET | /v1/services/<UUID>/autoscale/policies | Gets the configuration of existing autoscale policies |
DELETE | /v1/services/<UUID>/autoscale/policies | Deletes one or more autoscale policies from the given service |
In the following, the API endpoints are specified relative to this base URL:
With X being the IATA metro code.
We use fra0 as an example in the documentation.
See the introduction for more information on how to connect to the API.
Creating an Autoscale Configuration
Creates an autoscale configuration for the specified service.
Request
Endpoints:
POST /v1/services/autoscale
POST /v1/services/<UUID>/autoscale
Body Parameter | Type | Default | Required | Description |
---|---|---|---|---|
uuid or name 1,2 | UUID or Name | | ✔️ | UUID or name of the service for which to create a configuration |
min_size | int | 1 | | Minimal number of instances |
max_size | int | 4 | | Maximum number of instances |
warmup_time_ms | int | 1000 | | Length of warmup phase in milliseconds |
cooldown_time_ms | int | 1000 | | Length of cooldown phase in milliseconds |
master | object | | ✔️ | |
master.uuid or master.name 2 | UUID or Name | | ✔️ | UUID or name of instance to use as autoscale master |
policies | array of objects | | | Description of autoscale policies. See policy creation endpoint |
1 Not allowed in local scope.
2 You need to specify either uuid or name within the same body object.
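A request body for this endpoint could be assembled as in the following Python sketch. The helper `build_autoscale_config` and the names my-service and my-instance are hypothetical, the min_size/max_size sanity check is an assumption rather than a documented server rule, and whether the body must be wrapped in an array when configuring several services is not covered here.

```python
import json


def build_autoscale_config(
    service_name: str,
    master_name: str,
    min_size: int = 1,
    max_size: int = 4,
    warmup_time_ms: int = 1000,
    cooldown_time_ms: int = 1000,
    policies=None,
) -> dict:
    """Build a body for POST /v1/services/autoscale using the documented defaults."""
    if min_size > max_size:
        # Assumption: a client-side sanity check, not a documented API rule.
        raise ValueError("min_size must not exceed max_size")
    body = {
        "name": service_name,  # either uuid or name, never both
        "min_size": min_size,
        "max_size": max_size,
        "warmup_time_ms": warmup_time_ms,
        "cooldown_time_ms": cooldown_time_ms,
        "master": {"name": master_name},  # either uuid or name
    }
    if policies:
        body["policies"] = policies
    return body


config = build_autoscale_config("my-service", "my-instance", max_size=8)
print(json.dumps(config, indent=2))
```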
Response
The response is embedded in a JSON object as described in API Responses.
Field | Type | Description |
---|---|---|
status | string | success on success, or error if the request failed |
uuid | UUID | UUID of the service |
name | Name | Name of the service |
Getting an Existing Autoscale Configuration
Returns the current autoscale configuration of a service.
Request
Endpoints:
GET /v1/services/autoscale
GET /v1/services/<UUID>/autoscale
Query Parameter | Type | Default | Required | Description |
---|---|---|---|---|
name 1 | string or list of strings | | | Names of services to get the autoscale configuration for as comma-separated list |
uuid 1 | string or list of strings | | | UUIDs of services to get the autoscale configuration for as comma-separated list |
Body Parameter | Type | Default | Required | Description |
---|---|---|---|---|
uuid or name 1,2 | UUID or Name | | ✔️ | UUID or name of the service to get the autoscale configuration for |
1 Not allowed in local scope.
2 You need to specify either uuid or name within the same body object.
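The comma-separated query parameters can be built with the Python standard library, as sketched below. The helper name and the service names web and api are hypothetical; the path is relative to the base URL of your metro.

```python
from urllib.parse import urlencode


def autoscale_query_path(names=None, uuids=None) -> str:
    """Build the path and query string for GET /v1/services/autoscale."""
    params = {}
    if names:
        params["name"] = ",".join(names)
    if uuids:
        params["uuid"] = ",".join(uuids)
    path = "/v1/services/autoscale"
    # safe="," keeps the list separator readable instead of encoding it as %2C.
    return f"{path}?{urlencode(params, safe=',')}" if params else path


print(autoscale_query_path(names=["web", "api"]))
# /v1/services/autoscale?name=web,api
```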
Response
The response is embedded in a JSON object as described in API Responses.
Field | Type | Description |
---|---|---|
status | string | success on success, unconfigured if autoscale is not configured, or error if the request failed |
uuid | UUID | UUID of the service |
name | Name | Name of the service |
enabled | bool | Whether autoscale is enabled |
min_size | int | Minimal number of instances |
max_size | int | Maximum number of instances |
warmup_time_ms | int | Length of warmup phase in milliseconds |
cooldown_time_ms | int | Length of cooldown phase in milliseconds |
master | object | |
master.uuid | UUID | UUID of autoscale master |
master.name | Name | Name of autoscale master |
policies | array of objects | Description of autoscale policies. See policy creation endpoint |
Deleting an Autoscale Configuration
Deletes the autoscale configuration for the specified service.
Unikraft Cloud will immediately drain all connections from all instances that have been created by autoscale and delete the instances afterwards.
The draining phase is allowed to take at most cooldown_time_ms milliseconds after which remaining connections are forcefully closed.
The master instance is never deleted.
However, deleting the autoscale configuration causes the master instance to start if it is stopped.
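The effect of deleting a configuration on the instances of a service can be summarized in a small Python sketch. It only restates the rules above: the function, the record keys (name, state, created_by_autoscale), and the state value stopped are hypothetical and are not API field names.

```python
def plan_autoscale_removal(instances: list, master_name: str) -> dict:
    """Derive the actions described above from a list of instance records."""
    master = next(i for i in instances if i["name"] == master_name)
    drained = [
        i["name"]
        for i in instances
        if i["created_by_autoscale"] and i["name"] != master_name
    ]
    return {
        "drain_then_delete": drained,  # drained for at most cooldown_time_ms
        "keep": master_name,           # the master is never deleted
        "start_master": master["state"] == "stopped",
    }


instances = [
    {"name": "web-master", "state": "stopped", "created_by_autoscale": False},
    {"name": "web-clone-1", "state": "running", "created_by_autoscale": True},
    {"name": "web-clone-2", "state": "running", "created_by_autoscale": True},
]
print(plan_autoscale_removal(instances, "web-master"))
```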
Request
Endpoints:
DELETE /v1/services/autoscale
DELETE /v1/services/<UUID>/autoscale
Query Parameter | Type | Default | Required | Description |
---|---|---|---|---|
name 1 | string or list of strings | | | Names of services to delete the autoscale configuration for as comma-separated list |
uuid 1 | string or list of strings | | | UUIDs of services to delete the autoscale configuration for as comma-separated list |
Body Parameter | Type | Default | Required | Description |
---|---|---|---|---|
uuid or name 1,2 | UUID or Name | | ✔️ | UUID or name of the service for which to delete the autoscale configuration |
1 Not allowed in local scope.
2 You need to specify either uuid or name within the same body object.
Response
The response is embedded in a JSON object as described in API Responses.
Field | Type | Description |
---|---|---|
status | string | success on success, or error if the request failed |
uuid | UUID | UUID of the service |
name | Name | Name of the service |
Adding an Autoscale Policy
Adds a new autoscale policy to the existing autoscale configuration of the specified service.
Request
Endpoints:
POST /v1/services/<UUID>/autoscale/policies
The available fields depend on the policy type. The following properties are common to all policies:
Body Parameter | Type | Default | Required | Description |
---|---|---|---|---|
name 1 | Name | | ✔️ | Name of the policy |
type | Policy | | ✔️ | Type of autoscale policy |
1 Policy names are subject to the same restrictions as object names in general (see here). In addition, policy names cannot be longer than 31 characters.
Step Policy
Additional properties for step policies are:
Body Parameter | Type | Default | Required | Description |
---|---|---|---|---|
metric | Metric | cpu | | Metric to monitor |
adjustment_type | Adjustment Type | change | | Type of adjustment specified in the steps |
steps | array of objects | | ✔️ | Steps of the step policy |
steps[].lower_bound | int | | ✔️2 | Lower bound of the step range. In dimension of selected metric |
steps[].upper_bound | int | | ✔️2 | Upper bound of the step range. In dimension of selected metric |
steps[].adjustment | int | | ✔️ | Adjustment to take if metric is in range |
2 Only one of lower_bound and upper_bound can be null or not specified.
See the description of the step policy for more information on defining steps.
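A step policy body could be put together as in the following Python sketch. The helper and the policy name scale-in are hypothetical, and the literal value "step" for the type field is an assumption inferred from the policy table; check it against the service before relying on it.

```python
import json

MAX_POLICY_NAME_LENGTH = 31
ADJUSTMENT_TYPES = ("change", "exact", "percent")


def build_step_policy(name, steps, metric="cpu", adjustment_type="change") -> dict:
    """Build a body for POST /v1/services/<UUID>/autoscale/policies."""
    if len(name) > MAX_POLICY_NAME_LENGTH:
        raise ValueError("policy names cannot be longer than 31 characters")
    if adjustment_type not in ADJUSTMENT_TYPES:
        raise ValueError(f"unknown adjustment type: {adjustment_type}")
    if not steps:
        raise ValueError("at least one step is required")
    return {
        "name": name,
        "type": "step",  # assumption: wire value of the Step policy type
        "metric": metric,
        "adjustment_type": adjustment_type,
        "steps": steps,
    }


# Remove one instance while per-instance CPU load stays below 200 millicores.
scale_in = build_step_policy("scale-in", [{"upper_bound": 200, "adjustment": -1}])
print(json.dumps(scale_in, indent=2))
```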
Response
The response is embedded in a JSON object as described in API Responses.
Field | Type | Description |
---|---|---|
status | string | success on success, or error if the request failed |
uuid | UUID | UUID of the service |
name | Name | Name of the service |
Getting the Configuration of an Autoscale Policy
Returns the configuration of the specified autoscale policy.
Request
Endpoints:
GET /v1/services/<UUID>/autoscale/policies
GET /v1/services/<UUID>/autoscale/policies/<NAME>
Query Parameter | Type | Default | Required | Description |
---|---|---|---|---|
name 1 | string or list of strings | | | Names of autoscale policies to return as comma-separated list |
Body Parameter | Type | Default | Required | Description |
---|---|---|---|---|
name 1 | Name | | ✔️ | Name of the autoscale policy |
1 Not allowed in local scope.
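The two endpoint forms differ only in whether the policy name is part of the path. A possible way to build either path in Python is sketched below; the helper name, the example UUID, and the policy name scale-out are hypothetical.

```python
import uuid
from urllib.parse import quote


def policy_path(service_uuid: str, policy_name=None) -> str:
    """Build the policies path for a service, optionally scoped to one policy."""
    uuid.UUID(service_uuid)  # raises ValueError if the UUID is malformed
    base = f"/v1/services/{service_uuid}/autoscale/policies"
    if policy_name is None:
        return base
    # Percent-encode the name so it is safe to use as a single path segment.
    return f"{base}/{quote(policy_name, safe='')}"


service = "123e4567-e89b-12d3-a456-426614174000"
print(policy_path(service))
print(policy_path(service, "scale-out"))
```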
Response
The response is embedded in a JSON object as described in API Responses.
The properties returned depend on the policy type. The following properties are common to all policies:
Field | Type | Description |
---|---|---|
status | string | success on success, or error if the request failed |
name | Name | Name of the policy |
type | Policy | Type of autoscale policy |
enabled | bool | Whether the autoscale policy is enabled |
Step Policy
Additional properties for step policies are:
Field | Type | Description |
---|---|---|
metric | Metric | Metric to monitor |
adjustment_type | Adjustment Type | Type of adjustment specified in the steps |
steps | array of objects | Steps of the step policy |
steps[].lower_bound | int | Lower bound of the step range. In dimension of selected metric |
steps[].upper_bound | int | Upper bound of the step range. In dimension of selected metric |
steps[].adjustment | int | Adjustment to take if metric is in range |
Deleting an Autoscale Policy
Deletes the specified autoscale policy.
Request
Endpoints:
DELETE /v1/services/<UUID>/autoscale/policies
DELETE /v1/services/<UUID>/autoscale/policies/<NAME>
Query Parameter | Type | Default | Required | Description |
---|---|---|---|---|
name 1 | string or list of strings | | | Names of autoscale policies to delete as comma-separated list |
Body Parameter | Type | Default | Required | Description |
---|---|---|---|---|
name 1 | Name | | ✔️ | Name of the autoscale policy |
1 Not allowed in local scope.
Response
The response is embedded in a JSON object as described in API Responses.
Field | Type | Description |
---|---|---|
status | string | success on success, or error if the request failed |
name | Name | Name of the policy |