CSE 291 Virtualization, by Yiying Zhang, is one of the first courses I have taken as a Master’s student at the University of California, San Diego. We started from the basics of virtualization, progressed through different types and solutions, dove into containers, Kubernetes and serverless, and ended on a high note with cloud computing and its future.
If you have read my previous posts, you know I have been using virtualization solutions for quite some time, for both work and play. But I had never really sat down to think about how one would virtualize a whole system and what kind of complexity that would bring. The naive way to think about it would be “dual/multi boot, but together”, but that certainly leaves a lot to be desired as an explanation. While I had certain ideas in mind about virtualization, they fell short where the rubber meets the road: how does it actually work with existing physical hardware?
This being a research-oriented course, it involved extensive paper reading throughout the quarter (here is a list of them). Each paper came with a couple of questions, some of them open ended, allowing the reader to fully explore the space and come up with interesting solutions. Along with this, there was a course project with some very interesting topics to choose from. I am also happy to have worked on the Docker CLI, engine, containerd and runc codebases to bring Live Container Migration support to Docker as a part of this course.
Right off the bat, we started with Virtual Machine Monitors (VMMs), more commonly known as hypervisors, where we saw what the flow looks like for an instruction being executed and the theoretical ways to virtualize it.
The next couple of papers went into more detail on how to virtualize each component, i.e., the CPU, memory (RAM) and I/O (input/output devices). We see methods such as trap and emulate, where we utilize the natural behaviour of the CPU to virtualize it (privileged instructions run by a deprivileged guest trap into the VMM, which emulates them), how not all architectures can be virtualized this way, and ways around that using binary translation. Similarly for memory, existing schemes can be slow to virtualize, depending on their implementation. We then go through Intel’s advancements in this space with Intel VT-x and EPT for CPU and memory virtualization. Here, we also see how VMware rose to become the de facto standard for virtualization by implementing dynamic binary translation and efficient memory management to virtualize an x86 system without any modification.
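To make trap and emulate a little more concrete, here is a minimal sketch in Python. It is a toy model, not how any real hypervisor is written: the instruction names, the `VirtualCPU` fields and the `run_guest` helper are all made up for illustration. The guest runs "deprivileged", any privileged instruction raises a trap, and the VMM applies its effect to the guest's virtual CPU state instead of the real hardware.

```python
from dataclasses import dataclass, field

# Hypothetical toy instruction set; the names are illustrative, not real x86.
PRIVILEGED = {"write_cr3", "disable_interrupts"}


class Trap(Exception):
    """Raised by the modelled CPU when deprivileged code runs a privileged instruction."""

    def __init__(self, op, arg):
        super().__init__(op)
        self.op, self.arg = op, arg


@dataclass
class VirtualCPU:
    # Per-guest copy of the state the guest believes it controls.
    regs: dict = field(default_factory=lambda: {"acc": 0})
    cr3: int = 0
    interrupts_enabled: bool = True


def execute_deprivileged(vcpu, op, arg):
    """Model the CPU running guest code directly, but without privileges."""
    if op in PRIVILEGED:
        raise Trap(op, arg)  # the hardware refuses and hands control to the VMM
    if op == "add":
        vcpu.regs["acc"] += arg
    else:
        raise ValueError(f"unknown instruction: {op}")


def emulate(vcpu, trap):
    """VMM side: apply the privileged instruction to virtual state only."""
    if trap.op == "write_cr3":
        vcpu.cr3 = trap.arg
    elif trap.op == "disable_interrupts":
        vcpu.interrupts_enabled = False


def run_guest(vcpu, program):
    """Run a list of (op, arg) pairs and return how many traps the VMM handled."""
    traps = 0
    for op, arg in program:
        try:
            execute_deprivileged(vcpu, op, arg)
        except Trap as trap:
            emulate(vcpu, trap)
            traps += 1
    return traps
```

The scheme only works if every sensitive instruction actually traps. An architecture where some of them silently behave differently in user mode never reaches `emulate`, which is the gap binary translation fills.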
For I/O devices, paravirtualization is a good case study, with the simple yet efficient implementation of virtio alongside hardware-assisted SR-IOV. The elegant and resource-independent implementation of virtio, while performant, needs support from the guest OS, whereas SR-IOV provides this in hardware by having a single physical device present itself as multiple virtual devices.
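The core idea behind virtio, a queue shared between a cooperating guest driver and the host device, can be sketched as follows. This is a heavily simplified model under my own assumptions: `ToyVirtqueue` and its methods are hypothetical names, and the real virtqueue layout (descriptor table, available ring, used ring in shared memory) is not reproduced here.

```python
from collections import deque


class ToyVirtqueue:
    """Simplified stand-in for a virtqueue shared by guest driver and host device."""

    def __init__(self, size):
        self.size = size      # maximum number of buffers in flight
        self.avail = deque()  # buffers the guest has offered to the device
        self.used = deque()   # buffers the device has finished with
        self.kicks = 0        # notifications sent from guest to host

    # Guest driver side
    def add_buffer(self, data):
        """Offer a request to the device; returns False when the queue is full."""
        if len(self.avail) + len(self.used) >= self.size:
            return False
        self.avail.append(data)
        return True

    def kick(self):
        """Notify the host that new buffers are available (one exit for a whole batch)."""
        self.kicks += 1

    def get_completed(self):
        """Collect and clear everything the device has completed."""
        done = list(self.used)
        self.used.clear()
        return done

    # Host device side
    def device_process(self, handler):
        """Drain the available buffers, run the device logic, and mark them used."""
        processed = 0
        while self.avail:
            request = self.avail.popleft()
            self.used.append(handler(request))
            processed += 1
        return processed
```

The point of the design is batching: the guest can queue several requests and notify the host once, instead of trapping on every emulated device register access.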
Switching tracks to more modern use cases of virtualization, we look at an early view of cloud computing where conceptual and practical questions on market research and acceptance are laid out. Further papers on views of serverless computing and Microsoft’s look into the usage, implementation and optimization of serverless functions familiarize one with the motivation for serverless computing, which is a huge part of many cloud providers' offerings right now.
Not stopping at a commercial viewpoint, we take a look at works in academia with Pocket, an ephemeral storage solution built from the ground up for serverless scenarios, and SCAD, by Yiying et al., a unique resource-centric view of cloud computing, where the cloud is split by the resources of the machines involved to improve utilization.
Containers and Kubernetes
If my previous posts are anything to go by, you would guess right that I was looking forward to containers and Kubernetes. The container basics, including how a container uses Linux kernel features to achieve isolation, were beautifully written. I was elated to have an opportunity to present the topic on Kubernetes to my class. It was an enjoyable experience to be able to create and deliver content to my peers while learning new concepts.
Virtualization, being a vast topic, has many alternatives and implementations. We look at some of those in the next couple of papers.
Google’s take on securing a container to provide isolation as good as that of virtual machines resulted in gVisor, which tries to mitigate the major attack surface of containers, the shared kernel, by emulating the kernel per container. While this adds overhead, it provides enough security for the tradeoff to be worth it for certain applications.
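As a rough illustration of "emulating the kernel per container", the sketch below gives each container its own tiny user-space kernel that services system calls from private state, so application calls never reach the shared host kernel. This is my own toy model and not gVisor's code: `HostKernel`, `SandboxKernel` and the three supported calls are hypothetical.

```python
class HostKernel:
    """Stand-in for the shared host kernel; records every call that reaches it."""

    def __init__(self):
        self.calls = []

    def syscall(self, name, *args):
        self.calls.append(name)
        return 0


class SandboxKernel:
    """Hypothetical per-container kernel that handles syscalls in user space."""

    def __init__(self, host):
        self.host = host  # kept only to show that application calls never touch it
        self.files = {}   # private in-memory file system for this container

    def syscall(self, name, *args):
        if name == "write":
            path, data = args
            self.files[path] = self.files.get(path, b"") + data
            return len(data)
        if name == "read":
            (path,) = args
            return self.files.get(path, b"")
        if name == "getpid":
            return 1  # every container sees itself as PID 1
        # Anything the sandbox does not implement is rejected, not forwarded.
        raise PermissionError(f"syscall not permitted in sandbox: {name}")
```

Two containers created against the same `HostKernel` cannot see each other's files, and the host's call log stays empty. The overhead mentioned above comes from routing every call through this extra layer.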
Unikernels are an effort in the other direction, to make virtualization as fast as containers, by compiling the application code directly into a virtual machine image. The authors invested significant effort in porting many drivers to the language of the unikernel, OCaml, to make this happen. While the type-safe unikernel has blazing performance at very low overheads, limiting the design space to OCaml made adoption of this intriguing technology very difficult.
Remember paravirtualization? Xen made it mainstream, with adoption from AWS. The paper shows that, with modifications to the guest operating system, it was possible to run applications without modification and without support from hardware. Virtio-style paravirtualized device drivers are put to good use for I/O, and the use of a secure domain 0 for administration made Xen a lean paravirtualization solution.
KVM and QEMU
KVM is a lightweight Linux kernel module that reuses many existing kernel features to turn the host into a type 1 hypervisor, sans I/O support, which is left to userspace. It treats virtual CPUs as ordinary Linux processes and threads, and reuses the Linux scheduler and memory management system for the virtual machine as well. While hardware support is necessary for KVM, it supports multiple host architectures.
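The split between the kernel module and userspace can be pictured as a loop: userspace asks the kernel to run the guest, the guest runs until it needs something the kernel module does not handle (such as device I/O), and control returns to userspace with an exit reason. The sketch below simulates that loop in plain Python. `KVM_EXIT_IO` and `KVM_EXIT_HLT` echo exit-reason names from the KVM API, but `kvm_run`, `vmm_loop` and the toy guest program are stand-ins I made up; no real ioctl is issued.

```python
def kvm_run(guest):
    """Stand-in for the KVM_RUN ioctl: run the guest until it needs userspace.

    `guest` is an iterator of (op, arg) pairs, so each call resumes where
    the previous one stopped, like re-entering the guest after an exit.
    """
    for op, arg in guest:
        if op == "out":
            return ("KVM_EXIT_IO", arg)  # device I/O is left to userspace
        if op == "hlt":
            return ("KVM_EXIT_HLT", None)
        # Any other instruction "runs natively" and needs no exit.
    return ("KVM_EXIT_HLT", None)


def vmm_loop(program):
    """Userspace side: re-enter the guest until it halts, emulating a console device."""
    guest = iter(program)
    console = []
    exits = 0
    while True:
        reason, payload = kvm_run(guest)
        exits += 1
        if reason == "KVM_EXIT_IO":
            console.append(payload)  # emulate the device write
        elif reason == "KVM_EXIT_HLT":
            return "".join(console), exits
```

For example, `vmm_loop([("add", 1), ("out", "h"), ("add", 2), ("out", "i"), ("hlt", None)])` returns `("hi", 3)`: two exits for I/O and one for the halt, while the `add` instructions never leave the guest.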
QEMU was built first, and a fork of it became the userspace half of KVM. It takes more of a dynamic binary translation approach, very similar to that of VMware, to virtualize or emulate any (supported) architecture on any host architecture. This was made possible by using an intermediate representation of micro operations, which makes adding support for a new architecture easy. QEMU also provides hardware device emulation, which is absent from KVM. This makes the combo of KVM+QEMU a very good full virtualization solution.
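Here is a small sketch of translating through an intermediate representation of micro operations, with a translation cache so each block is translated only once. Everything in it is hypothetical: the guest instructions (`inc`, `double`), the micro-op names and the Python functions standing in for host code are chosen for illustration and do not correspond to QEMU's actual micro-ops.

```python
# Front end: each guest instruction expands into architecture-neutral micro-ops.
TOY_GUEST_FRONTEND = {
    "inc": [("load", "r0"), ("add_imm", 1), ("store", "r0")],
    "double": [("load", "r0"), ("add_reg", "r0"), ("store", "r0")],
}


# Back end: one "host code" snippet per micro-op, modelled as Python functions.
def _load(state, arg):
    state["tmp"] = state[arg]


def _add_imm(state, arg):
    state["tmp"] += arg


def _add_reg(state, arg):
    state["tmp"] += state[arg]


def _store(state, arg):
    state[arg] = state["tmp"]


HOST_BACKEND = {"load": _load, "add_imm": _add_imm, "add_reg": _add_reg, "store": _store}


def translate_block(block, frontend, cache):
    """Translate a block of guest instructions once and reuse the result afterwards."""
    key = tuple(block)
    if key not in cache:
        micro_ops = [uop for insn in block for uop in frontend[insn]]
        cache[key] = [(HOST_BACKEND[name], arg) for name, arg in micro_ops]
    return cache[key]


def run_block(block, state, frontend, cache):
    """Execute a guest block through its translated host snippets."""
    for snippet, arg in translate_block(block, frontend, cache):
        snippet(state, arg)
    return state
```

Supporting another guest architecture in this model means writing a new front-end table only. The back end and the cache stay as they are, which is the appeal of going through micro-ops.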
AWS brought the serverless world from containers to VMs with Firecracker, a virtualization solution based on KVM. Instead of using QEMU, with all the features deemed unnecessary in the serverless world, Firecracker provides a lightweight virtualized solution with features available in containers, such as soft allocation and fast switching, while maintaining the security and isolation guarantees of a virtual machine. This allowed AWS to deploy hundreds to thousands of functions on a single machine while allowing secure multitenancy, with a simple design that makes it easier to manage.
Cloud computing, while an enticing approach, is not devoid of drawbacks. With security being the paramount issue, we look at the difficulties virtualization brings to ensuring a secure environment and how oblivious cloud providers could inadvertently enable data theft. Some interesting points here include abusing the placement and infrastructure policies of the cloud provider to exploit other vulnerabilities in the system, such as side channel attacks.
In a seemingly direct response aimed at elevating the security posture of the cloud environment, AWS' work on their Nitro platform is something to be amazed at. Instead of allowing software, which may be vulnerable to malicious users, to control the stack, they offload everything to dedicated hardware, while leaving a thin KVM layer for hypervisor duties. Adding the Nitro TPM and secure computing chips provides a root of trust with a significantly smaller threat surface. The benefits are not just higher security, but also easier management and even live patching while user workloads are running, all while improving performance for the services provided. This is a significant move by the AWS team towards rethinking how virtualization can be completely offloaded to hardware.
Next generation Cloud
With Nitro bringing such changes to improve the traditional cloud, what would the future of the cloud look like? In academia, User Defined Cloud, by Yiying et al., goes through how cloud providers become networked hardware vendors whose only job is to materialize a blueprint provided by the cloud user. This view lets the user wield the power of deciding exactly what they need, while how to achieve that is left to the cloud provider. The other view is Sky Computing, which leans on standardization of interfaces, akin to how the internet became a globally accepted standard for connecting individual networks together. Cloud providers become unified, with the user able to pick and choose components for their application from any provider without having to worry about interoperability, which is handled by an intermediate layer. Making this more likely is the fact that it needs no support from the cloud providers to be implemented, which makes me look forward to this goal being realized soon.
For anyone with a penchant for cloud technologies, or an interest in the systems domain and a wish to understand virtualization in more detail, this course does an excellent job of covering the concepts, from the basics all the way to their real-world, practical and future state. I had fun and learnt a lot in this course :)