KAPTAIN: E4’s Kubernetes Cluster
This article presents KAPTAIN, a high-performance container-management solution based on Kubernetes.
The complexity of deploying, planning, and load-balancing infrastructure grows year after year. This has led companies to look for ways to organize infrastructure, development teams and processes so that they remain fast and effective.
It is in this context that containers were born.
But what are containers and what advantages do they bring to the infrastructure?
Containers isolate applications by packaging together all the libraries, dependencies and files needed for their execution. In this way, developers can work on and move applications between environments (development, testing, production) without security concerns, while keeping all of the application's functionality intact.
Also, because they do not include a full operating system, containers require minimal processing resources and are quick and easy to start. This efficiency allows them to be deployed in clusters, with individual containers holding individual components of complex applications. Thanks to this separation, developers can update application components independently, without modifying the whole application, ensuring efficient use of the underlying infrastructure.
For the optimal management of containers and applications, several platforms have emerged in recent years. Kubernetes is the most important and the one that has created a very large community of developers around it.
This tool, designed and developed in Google's labs on the basis of over 15 years of experience with containers and made open source in 2014, is organized in clusters, that is, sets of nodes running containerized applications. Google itself contributes significantly to the Kubernetes open-source community; the company generates more than 2 billion container deployments per week and runs all of its services in containers.
KAPTAIN: E4’S KUBERNETES CLUSTER
KAPTAIN is a bare-metal Kubernetes cluster designed by HPC experts: high-performance and ready to use, it is suitable for hosting the most demanding workloads in terms of compute and storage resources.
KAPTAIN is easy and intuitive to use: it comes with a graphical interface that simplifies the most complex operations. The end user has access to a rich catalog of open-source applications that can be installed with a simple click of the mouse.
KAPTAIN was designed to integrate easily with a company’s existing data-center services. It can use pre-existing authentication systems (AD, LDAP, OpenID) and allocate persistent volumes for containers on the most popular network-storage technologies (iSCSI, Ceph RBD, NFS, GlusterFS, …).
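As a sketch of what such an integration can look like, a Kubernetes PersistentVolume backed by a pre-existing NFS export can be declared as follows (the server address, export path and capacity are hypothetical placeholders, not KAPTAIN defaults):

```yaml
# Hypothetical example: a PersistentVolume backed by an existing NFS export.
# Server address and path are placeholders for a real datacenter NFS service.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-data-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany              # NFS allows concurrent mounts from many nodes
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal # placeholder NFS server
    path: /exports/kaptain       # placeholder export path
```

Once such a volume is registered, containerized workloads can claim it and mount the shared storage without knowing anything about the underlying NFS service.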
Furthermore, it can host large-scale data-analysis platforms such as Apache Spark, Kubeflow, Dask and Ray, providing a powerful and complete system for Cloud Native Data Science.
What are the main features of KAPTAIN?
KAPTAIN is a turnkey Kubernetes cluster designed for ease of use and high performance. Its initial configuration includes, for example, a distributed block-storage service for container data persistence and a powerful web interface that, among other things, gives access to a rich catalog of ready-to-use open-source applications.
IT IS A HIGH-PERFORMANCE PLATFORM: it includes the typical components of HPC infrastructures (all-flash disks, GPUs and an RDMA network) to run the most data-intensive workloads.
IT IS EQUIPPED WITH A SIMPLE AND POWERFUL UI: the Rancher server distribution provides a powerful, easy-to-use user interface.
IT HAS PRECONFIGURED CLOUD-NATIVE STORAGE: it is possible to use either an external storage resource (NFS, Gluster, Ceph, …) or a distributed block-storage solution for the self-provisioning of persistent volumes to attach to containerized workloads.
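Self-provisioning of persistent volumes typically works through a StorageClass: the user submits a PersistentVolumeClaim and the cluster provisions a matching volume automatically. A minimal sketch (the storage class name and size are assumptions for illustration, not necessarily KAPTAIN's defaults):

```yaml
# Hypothetical PersistentVolumeClaim: storageClassName is a placeholder for
# whatever distributed block-storage class the cluster actually exposes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: workload-data
spec:
  accessModes:
    - ReadWriteOnce                     # block storage is typically mounted by one node
  storageClassName: distributed-block   # placeholder class name
  resources:
    requests:
      storage: 20Gi
```

When a pod references this claim in its volume specification, the block-storage backend creates and binds the volume on demand, with no manual intervention from an administrator.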
How is KAPTAIN structured?
A Kubernetes cluster is a collection of servers configured to run containerized applications and services. The core of the architecture is the so-called ControlPlane (the set of API Server, Controller Manager, Scheduler and etcd services), which is responsible for the management and orchestration of containers and exposes the cluster's interface (API) to the outside.

The simplest configuration of a Kubernetes infrastructure consists of a Master Node, which hosts the ControlPlane, and a set of Worker Nodes dedicated to running containerized user workloads. The infrastructure servers can be interconnected both through an Internal Network, dedicated to communication between the ControlPlane and the Worker Nodes, and through an External Network, dedicated to accessing the applications and services running on the Worker Nodes.
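To illustrate the division of labour described above: a user submits a declarative manifest to the API Server on the ControlPlane, and the Scheduler places the resulting containers on the Worker Nodes. A minimal, generic Deployment (the image and replica count are chosen purely for illustration):

```yaml
# Generic illustration: the ControlPlane receives this manifest through the
# API Server, and the Scheduler spreads the three replicas across Worker Nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-web
  template:
    metadata:
      labels:
        app: demo-web
    spec:
      containers:
        - name: web
          image: nginx:1.25        # any containerized workload would do here
          ports:
            - containerPort: 80
```

The Controller Manager then continuously reconciles the observed state with the declared one: if a Worker Node fails, the missing replicas are rescheduled elsewhere, which is exactly the orchestration role the ControlPlane plays.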