Autoscale

This document describes the Unikraft Cloud Services Autoscale API (v1) for configuring and monitoring service autoscale.

Autoscale Basics

Services allow you to load balance traffic for an Internet-facing service like a webserver by creating multiple instances within the same service. While you can add or remove instances manually to scale your service, doing so makes it hard to react to changes in service load. On the other hand, always keeping a large number of instances running to cope with bursts is not an option either. This is where autoscale comes into play. With autoscale enabled, Unikraft Cloud does the heavy lifting of constantly monitoring the load of your service and automatically creates or deletes instances as needed.

A typical workflow for enabling autoscale looks like this:

Create a new service with the desired properties (e.g., published ports, DNS name).
Create a new instance of your application and assign it to the service. This instance becomes the autoscale master, which Unikraft Cloud clones to scale your service.
Create an autoscale configuration for the service and set the instance as master. The configuration allows you to define the metrics and policies based on which Unikraft Cloud performs autoscale. It also specifies the desired minimum and maximum number of instances as well as warmup and cooldown periods.

Warmup and Cooldown

When Unikraft Cloud decides to scale out your service, it grants new instances a grace period in which to complete boot, warm up caches, and start having an effect on the load level. Only after this warmup phase do new instances contribute to the evaluation of the autoscale metric. This lets the effects of the new instances on the service stabilize and prevents excessive scale out. Note that new instances already receive traffic and serve load while they are still warming up.

Conversely, Unikraft Cloud uses a cooldown phase to control scale in. During this phase, instances selected for scale in are given a chance to drain existing connections while already being excluded from the number of active instances in the service. New connections or HTTP requests on existing connections¹ are assigned to different instances. If there are still open connections after the cooldown phase, the remaining connections are forcefully closed.

¹Only if the http connection handler has been set.
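
The effect of the warmup phase on metric evaluation can be pictured with a small model. This is only an illustrative sketch: the `Instance` type, its `started_ms` field, and the use of a plain average over instances are assumptions made for the example, not a description of how Unikraft Cloud computes the metric internally.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    started_ms: int      # hypothetical: creation time of the instance
    cpu_millicores: int  # hypothetical: latest CPU sample of the instance


def metric_over_warm_instances(instances, now_ms, warmup_time_ms):
    """Average CPU over instances that have completed their warmup phase.

    Returns None if no instance has finished warming up yet.
    """
    warm = [i for i in instances if now_ms - i.started_ms >= warmup_time_ms]
    if not warm:
        return None
    return sum(i.cpu_millicores for i in warm) / len(warm)


instances = [
    Instance("master", started_ms=0, cpu_millicores=600),
    Instance("clone-1", started_ms=9_500, cpu_millicores=100),  # still warming up
]

# Only "master" counts at t=10s with a 1s warmup, so the result is 600.0.
print(metric_over_warm_instances(instances, now_ms=10_000, warmup_time_ms=1_000))
```

In this model the freshly created clone already serves traffic, but its low CPU reading does not pull the metric down until its warmup time has elapsed.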

Autoscale Policies

With autoscale policies you define under which circumstances Unikraft Cloud should scale your service and what metric (e.g., CPU utilization) should be used for the decision.

Unikraft Cloud currently supports the following autoscale policies:

Policy	Description
Step	Defines concrete adjustments for selected value ranges in the metric

An autoscale configuration usually comprises multiple policies, for example, to control scale in and scale out in separate policies. When Unikraft Cloud makes autoscale decisions, it always evaluates all policies and does not stop at the first applicable policy. If none of the policies apply, Unikraft Cloud maintains the current number of instances.

Step Policy

A step policy consists of a set of steps, each of which defines

a lower bound,
an upper bound,
and an adjustment.

The lower and upper bounds are always in the dimension of the selected metric. A bound can be set to null (or not provided at all) to make the step unbounded in the respective direction. The interpretation of the adjustment depends on how the step policy is configured. For the relative adjustment types, positive values increase the number of instances and negative values decrease it.

Adjustment type	Description
`change`	Relative change in the number of instances (e.g., +2 instances)
`exact`	Absolute target number of instances in the service (e.g., 10 instances)
`percent`	Change by percentage of the current number of instances in the service (e.g., +50%)
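
The three adjustment types can be compared side by side with a small helper. This is a sketch for illustration only: `apply_adjustment` is a hypothetical function, and the rounding of fractional `percent` results (away from zero here) is an assumption, since the rounding behavior is not specified above.

```python
import math


def apply_adjustment(current: int, adjustment: int, adjustment_type: str) -> int:
    """Return the target number of instances after applying one adjustment."""
    if adjustment_type == "change":
        # Relative change, e.g. +2 instances
        return current + adjustment
    if adjustment_type == "exact":
        # Absolute target, independent of the current size
        return adjustment
    if adjustment_type == "percent":
        # Assumption: fractional results are rounded away from zero
        delta = current * adjustment / 100
        rounded = math.ceil(abs(delta))
        return current + (rounded if delta >= 0 else -rounded)
    raise ValueError(f"unknown adjustment type: {adjustment_type}")


print(apply_adjustment(4, 2, "change"))    # 6
print(apply_adjustment(4, 10, "exact"))    # 10
print(apply_adjustment(4, 50, "percent"))  # 6
```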

An example step policy for scaling out may look like this:

{
  "name": "scale-out-policy",
  "type": "step",
  "metric": "cpu",
  "adjustment_type": "percent",
  "steps": [
    { "lower_bound": 500, "upper_bound": 700,  "adjustment": 50  },
    { "lower_bound": 700, "upper_bound": null, "adjustment": 100 }
  ]
}

If the CPU utilization per instance is below 500 millicores, no scaling action happens. If the CPU utilization is between 500 and 700 millicores, the policy instructs Unikraft Cloud to increase the number of instances by 50%. If the CPU utilization exceeds 700 millicores, the number of instances is doubled. Thus, if the per-instance CPU load is at 600 millicores (i.e., 60%) and the current number of instances in the service is 4, Unikraft Cloud will create 2 additional instances.
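
The walkthrough above amounts to a range lookup followed by a percentage calculation. The following sketch reproduces it; `find_adjustment` is a hypothetical helper, and treating each range as half-open (lower bound inclusive, upper bound exclusive) is an assumption, since the behavior exactly at a bound is not stated here.

```python
def find_adjustment(steps, value):
    """Return the adjustment of the step whose range contains value, or None."""
    for step in steps:
        lower = step.get("lower_bound")
        upper = step.get("upper_bound")
        # Assumption: ranges are half-open, [lower, upper)
        if (lower is None or value >= lower) and (upper is None or value < upper):
            return step["adjustment"]
    return None  # no step applies, so no scaling action


steps = [
    {"lower_bound": 500, "upper_bound": 700, "adjustment": 50},
    {"lower_bound": 700, "upper_bound": None, "adjustment": 100},
]

current = 4
percent = find_adjustment(steps, 600)  # 600 millicores falls into the first step
additional = current * percent // 100  # 50% of 4 instances
print(percent, additional)             # prints "50 2"
```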

There is a set of rules that steps of the same policy must adhere to:

The lower bound must be smaller than the upper bound
The lower and upper bound cannot be null in the same step
Steps must not overlap
Steps must be sorted in ascending order
There must be no gaps between individual steps
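
These rules can be checked on the client side before a policy is submitted. The sketch below is one possible reading of them: `validate_steps` is a hypothetical helper, and it interprets "sorted, non-overlapping, and without gaps" as "each step starts exactly where the previous one ends", which matches the example policy above.

```python
def validate_steps(steps):
    """Return a list of rule violations; an empty list means the steps are valid."""
    errors = []
    for i, step in enumerate(steps):
        lower = step.get("lower_bound")
        upper = step.get("upper_bound")
        if lower is None and upper is None:
            errors.append(f"step {i}: both bounds are null")
        elif lower is not None and upper is not None and lower >= upper:
            errors.append(f"step {i}: lower bound must be smaller than upper bound")

    for i in range(1, len(steps)):
        prev_upper = steps[i - 1].get("upper_bound")
        lower = steps[i].get("lower_bound")
        if prev_upper is None or lower is None:
            # An unbounded edge in the middle would overlap a neighboring step
            errors.append(f"step {i}: unbounded edge between steps {i - 1} and {i}")
        elif lower < prev_upper:
            errors.append(f"step {i}: overlaps step {i - 1} or is out of order")
        elif lower > prev_upper:
            errors.append(f"step {i}: gap after step {i - 1}")
    return errors


valid = [
    {"lower_bound": 500, "upper_bound": 700, "adjustment": 50},
    {"lower_bound": 700, "upper_bound": None, "adjustment": 100},
]
with_gap = [
    {"lower_bound": 500, "upper_bound": 600, "adjustment": 50},
    {"lower_bound": 700, "upper_bound": None, "adjustment": 100},
]
print(validate_steps(valid))
print(validate_steps(with_gap))
```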

Autoscale Metrics

You can base autoscale decisions on different metrics. Currently, Unikraft Cloud supports the following metrics:

Metric	Description
`cpu`	Per-instance CPU utilization measured in millicores (e.g., 100 millicores corresponds to 10% CPU utilization)
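
Since step bounds are given in millicores, it can help to convert to and from percentages when choosing thresholds. A minimal sketch, using the ratio stated above (100 millicores corresponds to 10%); both function names are hypothetical:

```python
def millicores_to_percent(millicores: float) -> float:
    """Convert millicores to percent CPU utilization (100 millicores = 10%)."""
    return millicores / 10


def percent_to_millicores(percent: float) -> float:
    """Convert percent CPU utilization to millicores."""
    return percent * 10


print(millicores_to_percent(600))  # 60.0
print(percent_to_millicores(70))   # 700
```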

API Endpoints

The Unikraft Cloud Services Autoscale API provides the following endpoints:

Method	Endpoint	Purpose and Description
`POST`	`/v1/services/autoscale`	Creates an autoscale configuration for one or more services
`GET`	`/v1/services/autoscale`	Returns the current autoscale configuration of services
`DELETE`	`/v1/services/autoscale`	Deletes the autoscale configuration for the specified services
`POST`	`/v1/services/<UUID>/autoscale/policies`	Adds one or more autoscale policies to the given service
`GET`	`/v1/services/<UUID>/autoscale/policies`	Gets the configuration of existing autoscale policies
`DELETE`	`/v1/services/<UUID>/autoscale/policies`	Deletes one or more autoscale policies from the given service

In the following, the API endpoints are specified relative to this base URL:

https://api.X.kraft.cloud/

Here, X is the IATA metro code. We use fra0 as an example in the documentation. See the introduction for more information on how to connect to the API.

Creating an Autoscale Configuration

Creates an autoscale configuration for the specified service.

Request

Endpoints:
POST /v1/services/autoscale
POST /v1/services/<UUID>/autoscale

Body Parameter	Type	Default	Required	Description
`uuid` | `name`¹,²	UUID | Name		✔️	UUID or name of the service for which to create a configuration
`min_size`	int	1		Minimal number of instances
`max_size`	int	4		Maximum number of instances
`warmup_time_ms`	int	1000		Length of warmup phase in milliseconds
`cooldown_time_ms`	int	1000		Length of cooldown phase in milliseconds
`master`	object		✔️
`uuid` | `name`²	UUID | Name		✔️	UUID or name of instance to use as autoscale master
`policies`	array of objects			Description of autoscale policies. See policy creation endpoint

¹ Not allowed in local scope.
² You need to specify either uuid or name within the same body object.

curl -X POST \
     -H "Authorization: Bearer ${UKC_TOKEN}" \
     -H "Content-Type: application/json" \
     "https://api.fra0.kraft.cloud/v1/services/autoscale" \
     -d '{
        "name": "my-service",
        "min_size": 4,
        "max_size": 4,
        "master": {
          "name": "my-instance"
        }
     }'
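
The same request can be assembled from a script. The sketch below uses only the Python standard library and builds the request without sending it; `build_create_request` is a hypothetical helper, its defaults mirror the defaults in the parameter table above, and the token is assumed to be available in the `UKC_TOKEN` environment variable as in the curl example.

```python
import json
import os
import urllib.request


def build_create_request(metro, token, service, master, min_size=1, max_size=4):
    """Build (but do not send) the POST request that creates an autoscale configuration."""
    body = {
        "name": service,
        "min_size": min_size,
        "max_size": max_size,
        "master": {"name": master},
    }
    return urllib.request.Request(
        f"https://api.{metro}.kraft.cloud/v1/services/autoscale",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_request(
    "fra0", os.environ.get("UKC_TOKEN", ""), "my-service", "my-instance", 4, 4
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```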

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`uuid`	UUID	UUID of the service
`name`	Name	Name of the service

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "success",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service"
      }
    ]
  }
}

Getting an Existing Autoscale Configuration

Returns the current autoscale configuration of a service.

Request

Endpoints:
GET /v1/services/autoscale
GET /v1/services/<UUID>/autoscale

Query Parameter	Type	Default	Required	Description
`name`¹	string | list of strings			Names of services to get the autoscale configuration for as comma-separated list
`uuid`¹	string | list of strings			UUIDs of services to get the autoscale configuration for as comma-separated list

Body Parameter	Type	Default	Required	Description
`uuid` | `name`¹,²	UUID | Name		✔️	UUID or name of the service to get the autoscale configuration for

¹ Not allowed in local scope.
² You need to specify either uuid or name within the same body object.

curl -X GET \
     -H "Authorization: Bearer ${UKC_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale"
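
To query several services at once through the `name` or `uuid` query parameters, the values are joined into a comma-separated list. A minimal sketch of building such a URL follows; `autoscale_query_url` is a hypothetical helper and `other-service` is a made-up service name.

```python
from urllib.parse import urlencode


def autoscale_query_url(metro, names=(), uuids=()):
    """Build the GET URL for the autoscale configuration of several services."""
    params = {}
    if names:
        params["name"] = ",".join(names)
    if uuids:
        params["uuid"] = ",".join(uuids)
    url = f"https://api.{metro}.kraft.cloud/v1/services/autoscale"
    # safe="," keeps the list separator as a literal comma instead of %2C
    return f"{url}?{urlencode(params, safe=',')}" if params else url


print(autoscale_query_url("fra0", names=["my-service", "other-service"]))
```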

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, `unconfigured` if autoscale is not configured, or `error` if the request failed
`uuid`	UUID	UUID of the service
`name`	Name	Name of the service
`enabled`	bool	Whether autoscale is enabled
`min_size`	int	Minimal number of instances
`max_size`	int	Maximum number of instances
`warmup_time_ms`	int	Length of warmup phase in milliseconds
`cooldown_time_ms`	int	Length of cooldown phase in milliseconds
`master`	object
`uuid`	UUID	UUID of autoscale master
`name`	Name	Name of autoscale master
`policies`	array of objects	Description of autoscale policies. See policy creation endpoint

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "success",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service",
        "enabled": true,
        "min_size": 0,
        "max_size": 1,
        "warmup_time_ms": 500,
        "cooldown_time_ms": 500,
        "master": {
          "uuid": "77d0316a-fbbe-488d-8618-5bf7a612477a",
          "name": "my-instance"
        },
        "policies": []
      }
    ]
  }
}

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "unconfigured",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service"
      }
    ]
  }
}
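
Note that in the second example the top-level status is `success` even though autoscale is not configured for the service, so a client has to inspect the per-service status as well. A minimal sketch of doing so, with `autoscale_state` as a hypothetical helper:

```python
import json


def autoscale_state(response_text):
    """Map each service name to its per-service status."""
    response = json.loads(response_text)
    if response.get("status") != "success":
        raise RuntimeError("request failed")
    return {g["name"]: g["status"] for g in response["data"]["service_groups"]}


sample = """
{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "unconfigured",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service"
      }
    ]
  }
}
"""

print(autoscale_state(sample))
```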

Deleting an Autoscale Configuration

Deletes the autoscale configuration for the specified service. Unikraft Cloud will immediately drain all connections from all instances that have been created by autoscale and delete the instances afterwards. The draining phase is allowed to take at most cooldown_time_ms milliseconds after which remaining connections are forcefully closed. The master instance is never deleted. However, deleting the autoscale configuration causes the master instance to start if it is stopped.

Request

Endpoints:
DELETE /v1/services/autoscale
DELETE /v1/services/<UUID>/autoscale

Query Parameter	Type	Default	Required	Description
`name`¹	string | list of strings			Names of services to delete the autoscale configuration for as comma-separated list
`uuid`¹	string | list of strings			UUIDs of services to delete the autoscale configuration for as comma-separated list

Body Parameter	Type	Default	Required	Description
`uuid` | `name`¹,²	UUID | Name		✔️	UUID or name of the service for which to delete the autoscale configuration

¹ Not allowed in local scope.
² You need to specify either uuid or name within the same body object.

curl -X DELETE \
     -H "Authorization: Bearer ${UKC_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale"

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`uuid`	UUID	UUID of the service
`name`	Name	Name of the service

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "success",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service"
      }
    ]
  }
}

Adding an Autoscale Policy

Adds a new autoscale policy to the existing autoscale configuration of the specified service.

Request

Endpoints:
POST /v1/services/<UUID>/autoscale/policies

The available fields depend on the policy type. The following properties are common to all policies:

Body Parameter	Type	Default	Required	Description
`name`¹	Name		✔️	Name of the policy
`type`	Policy		✔️	Type of autoscale policy

¹Policy names are subject to the same restrictions as object names in general (see here). In addition, policy names cannot be longer than 31 characters.

Step Policy

Additional properties for step policies are:

Body Parameter	Type	Default	Required	Description
`metric`	Metric	`cpu`		Metric to monitor
`adjustment_type`	Adjustment Type	`change`		Type of adjustment specified in the steps
`steps`	array of objects		✔️	Steps of the step policy
`lower_bound`	int		✔️²	Lower bound of the step range. In dimension of selected metric
`upper_bound`	int		✔️²	Upper bound of the step range. In dimension of selected metric
`adjustment`	int		✔️	Adjustment to take if metric is in range

² At most one of lower_bound and upper_bound can be null or omitted. See the description of the step policy for more information on defining steps.

curl -X POST \
     -H "Authorization: Bearer ${UKC_TOKEN}" \
     -H "Content-Type: application/json" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale/policies" \
     -d '[
        {
          "name": "scale-out-policy",
          "type": "step",
          "metric": "cpu",
          "adjustment_type": "percent",
          "steps": [
            { "lower_bound": 500, "upper_bound": 700,  "adjustment": 50  },
            { "lower_bound": 700, "upper_bound": null, "adjustment": 100 }
          ]
        },
        {
          "name": "scale-in-policy",
          "type": "step",
          "metric": "cpu",
          "adjustment_type": "percent",
          "steps": [
            { "lower_bound": null, "upper_bound": 40, "adjustment": -20 },
            { "lower_bound": 40,   "upper_bound": 50, "adjustment": -10 }
          ]
        }
      ]'
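
The request body above can also be generated programmatically, which makes it easy to enforce the 31-character limit on policy names before sending. The sketch below is illustrative: `step_policy` is a hypothetical helper, and its defaults mirror the defaults listed in the step policy parameter table.

```python
import json

MAX_POLICY_NAME_LEN = 31  # limit stated in the footnote on policy names


def step_policy(name, steps, metric="cpu", adjustment_type="change"):
    """Build one step policy object from (lower, upper, adjustment) tuples."""
    if len(name) > MAX_POLICY_NAME_LEN:
        raise ValueError(f"policy name longer than {MAX_POLICY_NAME_LEN} characters")
    return {
        "name": name,
        "type": "step",
        "metric": metric,
        "adjustment_type": adjustment_type,
        "steps": [
            {"lower_bound": lower, "upper_bound": upper, "adjustment": adjustment}
            for lower, upper, adjustment in steps
        ],
    }


payload = json.dumps([
    step_policy("scale-out-policy", [(500, 700, 50), (700, None, 100)],
                adjustment_type="percent"),
    step_policy("scale-in-policy", [(None, 40, -20), (40, 50, -10)],
                adjustment_type="percent"),
])
print(payload)
```

Python's `None` is serialized as JSON `null`, so unbounded steps come out in the form the API expects.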

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`name`	Name	Name of the policy

{
  "status": "success",
  "data": {
    "policies": [
      {
        "status": "success",
        "name": "scale-out-policy"
      },
      {
        "status": "success",
        "name": "scale-in-policy"
      }
    ]
  }
}

Getting the Configuration of an Autoscale Policy

Returns the configuration of the specified autoscale policy.

Request

Endpoints:
GET /v1/services/<UUID>/autoscale/policies
GET /v1/services/<UUID>/autoscale/policies/<NAME>

Query Parameter	Type	Default	Required	Description
`name`¹	string | list of strings			Names of autoscale policies to return as comma-separated list

Body Parameter	Type	Default	Required	Description
`name`¹	Name		✔️	Name of the autoscale policy

¹ Not allowed in local scope.

curl -X GET \
     -H "Authorization: Bearer ${UKC_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale/policies/scale-out-policy"

Response

The response is embedded in a JSON object as described in API Responses.

The properties returned depend on the policy type. The following properties are common to all policies:

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`name`	Name	Name of the policy
`type`	Policy	Type of autoscale policy
`enabled`	bool	Whether the autoscale policy is enabled

Step Policy

Additional properties for step policies are:

Field	Type	Description
`metric`	Metric	Metric to monitor
`adjustment_type`	Adjustment Type	Type of adjustment specified in the steps
`steps`	array of objects	Steps of the step policy
`lower_bound`	int	Lower bound of the step range. In dimension of selected metric
`upper_bound`	int	Upper bound of the step range. In dimension of selected metric
`adjustment`	int	Adjustment to take if metric is in range

{
  "status": "success",
  "data": {
    "policies": [
      {
        "status": "success",
        "name": "scale-out-policy",
        "type": "step",
        "enabled": true,
        "metric": "cpu",
        "adjustment_type": "percent",
        "steps": [
          { "lower_bound": 500, "upper_bound": 700,  "adjustment": 50  },
          { "lower_bound": 700, "upper_bound": null, "adjustment": 100 }
        ]
      }
    ]
  }
}

Deleting an Autoscale Policy

Deletes the specified autoscale policy.

Request

Endpoints:
DELETE /v1/services/<UUID>/autoscale/policies
DELETE /v1/services/<UUID>/autoscale/policies/<NAME>

Query Parameter	Type	Default	Required	Description
`name`¹	string | list of strings			Names of autoscale policies to delete as comma-separated list

Body Parameter	Type	Default	Required	Description
`name`¹	Name		✔️	Name of the autoscale policy

¹ Not allowed in local scope.

curl -X DELETE \
     -H "Authorization: Bearer ${UKC_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale/policies/scale-out-policy"

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`name`	Name	Name of the policy

{
  "status": "success",
  "data": {
    "policies": [
      {
        "status": "success",
        "name": "scale-out-policy"
      }
    ]
  }
}