Setting up Scale to 0
With conventional cloud platforms you need to keep at least one instance running at all times to be able to respond to incoming requests: performing a just-in-time cold boot is simply too time-consuming and would create a response latency of multiple seconds or worse.
This is not the case with Unikraft Cloud (UKC). Based on extremely lightweight unikernel technology, instances on UKC are able to cold boot within milliseconds, while providing the same strong, hardware-level isolation afforded by virtual machines.
Millisecond cold boots allow us to perform low-latency scale-to-zero: that is, as long as no traffic is flowing through your instance, it consumes no resources. When the next connection arrives, Unikraft Cloud takes care of transparently cold booting (can it be called cold booting if it’s milliseconds?) your instance and replying — all of that within a negligible amount of time with respect to Internet RTTs and so unbeknownst to your end users.
By default, Unikraft Cloud reduces network and cloud stack cold start time to a minimum. If you need to deploy an app whose initialization takes a while to finish (e.g., Spring Boot or Puppeteer) and would still like to retain millisecond cold starts, Unikraft Cloud provides a stateful feature to deal with this; please check out this guide for more information on how to set this up.
Setting it Up
We really like millisecond scale to zero 😀 and so for the most part we have it on in all of our examples.
We do so via a label in each of the subdirectories’ Kraftfile:
Since UKC has scale to zero on by default, all you need to do is to start an instance normally:
This command will create the NGINX instance with scale to zero enabled:
Note that at first the status is listed as running in the output of the kraft cloud deploy command.
Let’s check the instance’s status:
You should see output similar to:
Notice the state is now set to standby?
At first kraft cloud deploy sets the state to running, but then UKC immediately puts the instance to sleep (more accurately, it stops the instance but keeps its state so that it can start it again when needed).
You can also check that scale to 0 is enabled through the kraft cloud scale command:
which outputs:
Note the min size (0) and max size (1) fields: the service can scale between a maximum of 1 instance and a minimum of 0 instances, which means that scale to 0 is enabled.
Testing Scale to 0
Now let’s take this out for a spin.
Try using curl or your browser to see scale to 0 (well, scale to 1 in this case!) in action:
You should get an NGINX response with no noticeable delay.
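If you would rather put a number on "no noticeable delay", the following is a minimal Python sketch that times a handful of requests. The URL is a hypothetical placeholder (use the address printed by your own deploy), and the helper names fetch and time_requests are made up for illustration; nothing here is part of the kraft tooling.

```python
import time
import urllib.request

# Hypothetical placeholder: substitute the address of your own instance.
URL = "https://example.invalid/"


def fetch(url=URL):
    """Issue one GET request and read the whole response body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()


def time_requests(do_request, count=5, pause=0.0):
    """Call do_request() `count` times; return each duration in milliseconds."""
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        do_request()
        timings.append((time.perf_counter() - start) * 1000.0)
        # Optional pause between requests (in seconds).
        time.sleep(pause)
    return timings
```

For example, time_requests(fetch, count=5, pause=2.0) returns five timings in milliseconds. A pause between requests gives the instance a chance to return to standby, although how long that takes is not stated here, so treat the pause value as something to experiment with. The timings include DNS, TLS and network round trips, not just the wake-up itself.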
For fun, try to use the following command to see if you can catch the instance’s STATE field changing from standby to running:
If you curl enough, you should see the STATE turn to a green running from time to time:
Idle Mode
Unikraft Cloud supports an additional scale to zero mode called idle mode. This mode allows apps to be scaled down to zero even when they have long-running, established (but idle) TCP connections. When an app has such idle connections, UKC will (1) scale the app to zero and (2) ensure that the TCP connections remain established until the app wakes back up.
You can enable this mode through the --scale-to-zero=idle flag when deploying your app, and you can find more information about this mode in the API documentation.
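As a mental model only, the difference between the regular mode and idle mode can be sketched as a small decision function. This is not UKC's implementation: the Connection type, the may_scale_to_zero function and the mode labels "on" and "off" are assumptions made for illustration. Only "idle" appears in the flag above.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """An established TCP connection; idle means no data is flowing."""
    idle: bool


def may_scale_to_zero(mode, connections):
    """Decide whether an instance could be scaled to zero (illustrative)."""
    if mode == "off":
        # Assumed label: scale to zero disabled.
        return False
    if mode == "on":
        # Assumed label for the regular mode: no established connections.
        return len(connections) == 0
    if mode == "idle":
        # Idle mode: established connections are fine as long as all are idle.
        return all(conn.idle for conn in connections)
    raise ValueError(f"unknown mode: {mode!r}")
```

Under this model, an app holding a single idle connection stays up in the regular mode but may be scaled to zero in idle mode, which matches the behavior described above.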
Learn More
- The kraft cloud CLI reference, and in particular the services and scale sub-commands
- Unikraft Cloud’s REST API reference, and in particular the section on scale to zero.