
The Cloud Engineer's Dilemma (Part 2)

Cold starts, slow autoscale, subpar scale to zero, severe over-provisioning and exploding bills. Must the cloud really be this complex? Part 2 of a two-part series.

6 min read

In our previous post (part 1) we discussed how difficult it is nowadays to be a cloud developer or engineer – having to jump through a large number of cloud infra hoops just to get an application or service deployed reliably and efficiently.

At Unikraft, we strongly believe that cloud infra issues such as cold starts, over-provisioning, slow autoscale and exploding cloud bills, to name a few, should not be “features” of the cloud going forward. In fact, we’ll argue that none of these are fundamental limitations, but rather emergent properties of a cloud architecture built on layers upon layers of accreted (often non-cloud-native) software. With that preamble, let’s get into what’s going on and how to fix it!

Back to First (Cloud) Principles

What are the fundamental pieces we actually need to run on the cloud? A server, of course (and yes, even serverless uses servers 🙂), and then a hypervisor to provide strong, hardware-level isolation for multi-tenancy. Beyond this, the only piece we really care about is the actual app running inside a virtual machine (a virtual machine because that’s the model the hypervisor requires). So ideally something lean and efficient, like so:

[Figure: a bare-metal server runs a hypervisor, which provides multi-tenancy through hardware-level isolation; on top sits a VM containing the application, the only part of a cloud deployment we actually care about.]

As we showed in part 1, however, cloud stacks are much more convoluted and inefficient than this. What we would like is to be able to create a special-purpose virtual machine (think: operating system, libraries, system services, etc.) and a distro to match the needs, and only the needs, of each app being deployed. Sounds impossible (or at least very hard)? Well, not impossible, and certainly not trivial, but thankfully now ready and available – read on!

Enter Unikernels

So we need specialized virtual machines, and that’s just what a “unikernel” is. Because we need the VM to be as efficient and specialized as possible, unikernels are typically built from library operating systems (read: modular OSes) that make it easy to pick and choose just the functionality that’s needed (e.g., a network stack, a scheduler, a libc); contrast that with monolithic OSes such as Linux or Windows, where, by and large, this level of customization requires significant amounts of hacking. To illustrate:

[Figure: the app, the libraries it needs, and the matching OS/kernel components are compiled together into a specialized VM: a unikernel.]

A unikernel essentially takes the parts/components that a particular app needs and constructs, for each app, a custom OS and distro: if the app needs a component, it makes it into the image; if it doesn’t, that component never gets deployed.

I mentioned earlier that this was non-trivial, and indeed it hasn’t been: over 5 years ago we created the Linux Foundation Unikraft OSS project to make it easy to build such unikernels, based on three key principles:

  1. Fully Modular for maximum efficiency, performance and specialization
  2. POSIX/Linux API Compatibility for running unmodified apps and languages
  3. Transparent Tooling Integration so developers can use existing tools

I won’t go into the details of Unikraft (see here for the gory details), but suffice it to say that it took the better part of 4 years to get it to a point where it could run a very large set of standard applications, and to do so while providing easy-to-use tooling.
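To make the “pick and choose” idea concrete, here’s a rough sketch of what a Unikraft build manifest (a Kraftfile) might look like when an app only needs a libc and a network stack. Treat the field names, library names and versions as illustrative assumptions rather than exact syntax; the Unikraft docs have the authoritative format.

Terminal window
# Illustrative sketch only: library names and versions are assumptions,
# not copied verbatim from the Unikraft docs.
cat > Kraftfile <<'EOF'
spec: v0.6
name: my-app
unikraft: stable
libraries:
  musl: stable   # the libc the app links against
  lwip: stable   # a network stack, included only because this app needs networking
targets:
- qemu/x86_64
EOF

Anything the app doesn’t need (say, a network stack for a purely compute-bound tool) simply never makes it into the image.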

To give you a ballpark idea of what this achieves, unikernels can improve start times, memory consumption and throughput, and cut the amount of code deployed, by multiples or even orders of magnitude (think cold start times of a few milliseconds).

Putting it all Together: Unikraft Cloud, A Millisecond, Reactive Platform

It’s great to have really fast, efficient images, but this isn’t enough if our goal is to build fast, reactive, millisecond cloud infra. In fact, if you were to deploy a very lean Unikraft unikernel (say, NGINX) as an AMI on EC2, it would start in... minutes! This is because the image itself is only one part of the equation: underneath the (lean) image sits the rest of the cloud stack (remember the iceberg picture in part 1?), not to mention the controller and networking components such as load balancers.

In order to provide end-to-end millisecond semantics, such that when you send a request to an app that is fully off it wakes up and responds within a few milliseconds, we had to go back to the drawing board and design and implement a load balancer and controller able to react in milliseconds and scale to thousands (and even tens of thousands) of instances.

We call the result Unikraft Cloud, a truly serverless (no infra headaches), millisecond, reactive cloud platform. How do the components of the Unikraft Cloud platform work together? When a request comes in, the platform’s proxy buffers it and notifies the controller. The controller looks up whether an instance exists to service the request and, if it does, asks Firecracker to wake the instance (read: unikernel) up.

[Figure: a user request arrives at the proxy, which buffers it and notifies the controller; the controller checks the instance’s state and tells the VMM to load/wake the instance, all within a TCP RTT (milliseconds).]

When the instance is ready, the controller notifies the proxy, which then unbuffers the request so that the unikernel can receive it and reply. On Unikraft Cloud that entire chain of events happens within milliseconds, and so stays hidden within the Internet’s typical RTTs. As far as the end user is concerned, the instance was always up; as far as the platform is concerned, it was on standby, consuming no resources. Similar mechanisms allow Unikraft Cloud to offer millisecond autoscale and cold starts.

[Figure: once the instance is ready, the VMM notifies the controller, the controller notifies the proxy, and the proxy unbuffers the request; the instance replies to the user, again within a TCP RTT (milliseconds).]
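You can get a rough feel for this from the client side by simply timing a request against a deployed app. The hostname below is a placeholder; substitute the URL the platform assigns your deployment (an example appears in the next section).

Terminal window
# Rough client-side check: total time for one HTTPS request, which includes any
# wake-from-standby work the platform does. Replace the URL with your own app's.
curl -s -o /dev/null -w 'total: %{time_total}s\n' https://your-app.fra0.kraft.host

If the wake-up really is hidden inside the connection’s round trips, the total should look much like a request to an always-on server.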

A Disruptive Platform with a Non-Disruptive Dev Experience

As a design principle, it was crucial to us that, as much as possible, Unikraft Cloud users only notice that there’s something different under the hood because performance is noticeably better and cloud infra issues disappear. To this end, we spent a lot of effort integrating with existing tooling such as Docker so that we don’t disrupt the developer experience. I won’t go into a lot of detail about how the platform works in practice (for that you can check out the docs), but as a quick preview, it’s as easy as installing our open-source CLI tool:

Terminal window
# Install on macOS, Linux and Windows
curl -sSfL https://get.kraftkit.sh | sh

and then running a one-liner to deploy your first app:

Terminal window
kraft cloud --metro fra0 deploy -p 443:8080 ./project
Terminal window
[+] Building rootfs via Dockerfile... done!
[+] Packaging... done!
[+] Deploying... done!
[] Deployed successfully!
────────── name: project-6cfc4
────────── uuid: 62d1d6e9-0d45-4ced-ad2a-619718ba0344
───────── state: running
─────────── url: https://long-violet-92ka3gk7.fra0.kraft.host
───────── image: project@sha256:fb3e5fb1609ab4fd40d38ae12605d56fc0dc48aaa0ad4890ed7ba0b637af69f6
───── boot time: 16.65 ms
──────── memory: 128 MiB
─ service group: long-violet-92ka3gk7
── private fqdn: project-6cfc4.internal
──── private ip: 172.16.6.4

So: everything is driven by Dockerfiles, with a roughly 16 ms cold start in this case.
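In case you’re wondering what ./project might contain, the Dockerfile part can stay very small, since it only describes the app’s root filesystem (the “Building rootfs via Dockerfile” step above). What follows is a minimal, purely illustrative sketch that assumes a single statically-linked binary named server; your project will look different.

Terminal window
# Purely illustrative: a tiny Dockerfile providing the rootfs for ./project.
# Assumes a statically-linked binary named "server" built beforehand.
cat > project/Dockerfile <<'EOF'
FROM scratch
COPY ./server /server
CMD ["/server"]
EOF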

Take it Out for a Spin!

To wrap things up, here’s a short list of Unikraft Cloud facts/features:

  • Cold starts in milliseconds, even for heavy apps, vs. seconds or minutes on other platforms;
  • True (optionally stateful) scale to zero: requests (and their absence) are detected in milliseconds, vs. seconds or minutes on other platforms;
  • Reactive autoscale, in milliseconds, vs. seconds or minutes on other platforms;
  • High server density: thousands of instances on a single server.

Get early access to Unikraft Cloud

Unikraft Cloud has free early access, so if you find it intriguing and would like to get a taste of what a next-generation compute cloud platform looks like:

Sign up now