E4 Container Platform
Kubernetes High Performance Cluster
Containers let you develop and deploy complex applications faster and, most importantly, make efficient use of the underlying infrastructure: a unique combination of benefits.
Why run operations manually when you can be certain that something will go wrong?
Running multi-container applications in a production environment gets complicated quickly: hundreds or thousands of containers and many different types of applications. Kaptain, the Kubernetes-based E4 Container Platform, is the conductor at the service of your infrastructure.
“A sea captain is a sailor with a high-level license who has the command and final responsibility over a ship. The captain is responsible for the safe and efficient operation of the ship and of the people and load on board” Wikipedia – Sea Captain
FAST: Kaptain is a bare-metal Kubernetes cluster designed by HPC experts to give containers a high-performance environment. All-flash drives, low-latency networking (RDMA) and GPU support give containerized workloads both fast data access and the shortest possible processing times.
INTUITIVE: Kaptain comes with a powerful graphical interface that simplifies the most complex operations. The end user has access to a rich catalog of open-source applications that can be installed with a single mouse click. With Kaptain you no longer need a PhD in “Kubernetes-ology” to be a productive DevOps engineer.
INTEGRATED: The E4 Container Platform is designed to integrate easily with your data center services. Kaptain can use pre-existing authentication systems (AD, LDAP, OpenID) and allocate persistent volumes for containers on the most popular network storage technologies (iSCSI, Ceph RBD, NFS, GlusterFS, …).
MODERN: Kaptain is a high-performance infrastructure designed to host large-scale data analysis platforms such as Apache Spark, Kubeflow, Dask and Ray. With the integration of MinIO (high-performance S3 storage) and a workflow manager (Airflow, Prefect, …), you get a powerful and complete system for Cloud Native Data Science.
A Kubernetes cluster is a collection of servers configured to run containerized applications and services. The core of the architecture is the ControlPlane (the API Server, Controller Manager, Scheduler and etcd services), which manages and orchestrates the containers and exposes the API through which the cluster is reached from outside. The simplest Kubernetes configuration consists of one Master Node, which hosts the ControlPlane, and a set of Worker Nodes, dedicated to running containerized user workloads. The servers can be interconnected through both an Internal Network, dedicated to communication between the ControlPlane and the Worker Nodes, and an External Network, dedicated to accessing the applications and services running on the Worker Nodes.
Compared to a basic Kubernetes setup, the standard version of Kaptain adds three things: a high-availability configuration of the ControlPlane, a high-performance distributed block storage service that gives containers data persistence, and an additional high-performance network (RDMA, with high bandwidth and low latency) on which the Kubernetes Software Defined Network is configured. High availability inside the Cluster is obtained by replicating the ControlPlane services on 3 distinct Master Nodes; HAProxy and Keepalived are also configured on these servers, so that access to the Kubernetes API is load-balanced and highly available from outside the Cluster as well.
Distributed block storage is provided by Longhorn, a cloud-native distributed block storage solution developed by Rancher and supported by the Cloud Native Computing Foundation. All of Kaptain’s Worker Nodes (or a subset of them) are equipped with dedicated all-flash disks; Longhorn aggregates these disks behind the Cluster’s default StorageClass, from which user workloads can request Persistent Volumes to mount in running containers. Longhorn replicates each volume across multiple Worker Nodes to ensure redundancy on different physical systems. It provides a configurable default replication level, which the user can override when creating an individual volume, according to specific needs.
Naturally, Kaptain uses the high-performance internal network both for access to Persistent Volumes and for the related replication traffic. On request, the cloud-native distributed block storage can instead be implemented with Rook-Ceph.
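As an illustration of how a workload might ask Longhorn for a volume with a non-default replication level, the following Python sketch builds the two Kubernetes manifests involved and prints them as JSON (which kubectl accepts alongside YAML). It is a sketch under assumptions, not Kaptain's shipped configuration: the StorageClass name `longhorn-3-replicas`, the claim name `spark-scratch` and the 50 GiB size are hypothetical, and using a dedicated StorageClass is only one possible way to set the replica count. The provisioner name `driver.longhorn.io` and the `numberOfReplicas` parameter are the ones used by Longhorn's CSI driver.

```python
import json


def longhorn_storage_class(name: str, replicas: int) -> dict:
    """Build a StorageClass that asks Longhorn to keep `replicas` copies of each volume."""
    if replicas < 1:
        raise ValueError("a volume needs at least one replica")
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "driver.longhorn.io",
        # StorageClass parameters are always strings.
        "parameters": {"numberOfReplicas": str(replicas)},
    }


def persistent_volume_claim(name: str, storage_class: str, size_gib: int) -> dict:
    """Build a PersistentVolumeClaim that requests `size_gib` GiB from `storage_class`."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gib}Gi"}},
        },
    }


# Hypothetical names and sizes, chosen only for the example.
storage_class = longhorn_storage_class("longhorn-3-replicas", replicas=3)
claim = persistent_volume_claim("spark-scratch", "longhorn-3-replicas", size_gib=50)

for manifest in (storage_class, claim):
    print(json.dumps(manifest, indent=2))
```

Each printed document could be saved to a file and applied with `kubectl apply -f`. A claim that omits `storageClassName` falls back to the Cluster's default StorageClass.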
Kaptain is available in 3 configurations: Hyper-Convergent, Convergent and Distributed.
The Hyper-Convergent configuration includes a total of 3 servers, each of which acts as both a Master Node and a Worker Node of the Cluster. Each of these 3 servers also hosts the distributed block storage services, so it acts as a Storage Node as well. It is the minimum configuration that can guarantee high availability of the infrastructure.
The Convergent configuration consists of 3 or 5 servers dedicated to the Master Node role, while each of the remaining servers acts as both Worker Node and Storage Node. It is the ideal configuration for organizations that expect their computing and storage needs to grow together over time.
The Distributed configuration consists of 3 or 5 servers dedicated to the Master Node role, while each of the remaining servers is dedicated to either the Worker Node role or the Storage Node role. This is the maximum-performance configuration: in addition to providing servers dedicated to the ControlPlane, it lets you deploy the cloud-native distributed block storage on the number of Storage Nodes best suited to the net disk space and performance required, and size the remaining Worker Nodes solely according to the requirements of user workloads and their horizontal scaling.
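The three configurations can be summarized as a role assignment per server. The Python sketch below is only a model of the descriptions above: the function names, the lowercase configuration identifiers and the example server counts are hypothetical. It also shows the general etcd majority rule that explains why Master Nodes come in odd numbers: a cluster of n members needs floor(n/2) + 1 of them to agree, so 3 masters tolerate 1 failure and 5 tolerate 2. That each Master Node hosts one etcd member is an assumption here.

```python
from typing import List, Set


def etcd_fault_tolerance(masters: int) -> int:
    """Master failures tolerated while a majority (quorum) of etcd members survives."""
    quorum = masters // 2 + 1
    return masters - quorum


def plan_roles(configuration: str, servers: int, masters: int = 3,
               storage_nodes: int = 0) -> List[Set[str]]:
    """Return the set of roles played by each server in the given configuration."""
    if configuration == "hyper-convergent":
        if servers != 3:
            raise ValueError("Hyper-Convergent uses exactly 3 servers")
        return [{"master", "worker", "storage"} for _ in range(servers)]

    if masters not in (3, 5):
        raise ValueError("dedicated Master Nodes come in groups of 3 or 5")
    remaining = servers - masters

    if configuration == "convergent":
        if remaining < 1:
            raise ValueError("at least one Worker/Storage Node is required")
        return ([{"master"} for _ in range(masters)]
                + [{"worker", "storage"} for _ in range(remaining)])

    if configuration == "distributed":
        workers = remaining - storage_nodes
        if storage_nodes < 1 or workers < 1:
            raise ValueError("need at least one Storage Node and one Worker Node")
        return ([{"master"} for _ in range(masters)]
                + [{"storage"} for _ in range(storage_nodes)]
                + [{"worker"} for _ in range(workers)])

    raise ValueError(f"unknown configuration: {configuration}")


# Hypothetical cluster sizes, chosen only for the example.
for config, kwargs in [
    ("hyper-convergent", {"servers": 3}),
    ("convergent", {"servers": 8, "masters": 3}),
    ("distributed", {"servers": 12, "masters": 5, "storage_nodes": 3}),
]:
    roles = plan_roles(config, **kwargs)
    print(config, [sorted(r) for r in roles])

print("failures tolerated with 3 masters:", etcd_fault_tolerance(3))
print("failures tolerated with 5 masters:", etcd_fault_tolerance(5))
```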
Kaptain’s unique features – what sets it apart from the competition!
Kaptain is a ready-to-use, high-performance Kubernetes cluster, designed to host the most demanding workloads in terms of compute and storage resources and configured to be easy to use.
High Performance Platform: E4 means “When Performance Matters”, and this motto haunts our engineers so much that, when they designed Kaptain, they had in mind above all the containerized workloads that are most demanding in terms of compute and storage resources. Hence a design that includes typical components of HPC infrastructures (all-flash disks, GPUs and RDMA networking). The result is a solution capable of hosting, for example, the most resource-hungry distributed platforms for Big Data analysis, giving end users multi-GPU support for their workloads and an environment where access to the data to be processed is never the bottleneck.
Easy & powerful UI: Kaptain is easy to use thanks above all to the integration of Rancher Server, a service configured for high availability that provides a powerful web-based graphical interface. It suits both the administrator, who has in the Cluster Manager component a tool that simplifies the more complex operations needed to manage the infrastructure, and the end user, who can use the Cluster Explorer to manage their own containerized workloads and access a rich catalog of open-source applications that can be installed with a single mouse click.
Preconfigured Cloud Native Storage: in addition to the possibility of using external storage resources (NFS, Gluster, Ceph, …), in every configuration Kaptain hosts a distributed block storage solution for the self-service provisioning of Persistent Volumes to attach to containerized workloads. Thanks to all-flash back-end disks and to the internal low-latency, high-bandwidth network used for access and replication, the default StorageClass lets you configure Volumes that guarantee reliability and performance in accessing persistent data.
Cloud Native Data Science Infrastructure: Kaptain was designed to provide the customer with a high-performance infrastructure to host the most modern distributed data analysis platforms. Choosing Kaptain as the enabling infrastructure for Big Data Analytics means, for example, replacing a monolithic cluster for the Apache Hadoop/Spark ecosystem with a flexible infrastructure on which a containerized Apache Spark cluster can easily coexist with second-generation solutions such as Kubeflow, Dask or Ray. Kaptain can follow the evolution of your data-driven applications over time, without any compromise in usability or performance. A single high-performance infrastructure lets you bring a high-performance S3 object storage online, host distributed platforms for ETL on huge amounts of data, train complex models based on Machine Learning and Deep Learning algorithms with the support of GPU computing, and make available the “tools of the trade” (workflow managers, SQL and NoSQL databases, tools for advanced visualization and for CI/CD, …) that increase the productivity of your Data Scientists and Data Engineers.
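To make the combination of GPU computing and persistent storage described in the features above more concrete, the following Python sketch builds a Pod manifest that requests GPUs and mounts an existing Persistent Volume claim, then prints it as JSON. Everything specific in it is an assumption for illustration: the pod name, the container image `registry.example.com/trainer:latest`, the claim name `training-data` and the mount path are hypothetical, and the `nvidia.com/gpu` resource name presumes NVIDIA GPUs exposed through the NVIDIA device plugin, which the text above does not state.

```python
import json


def gpu_pod(name: str, image: str, gpus: int, claim_name: str,
            mount_path: str = "/data") -> dict:
    """Build a Pod that requests `gpus` GPUs and mounts the claim `claim_name`."""
    if gpus < 1:
        raise ValueError("request at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,
                # Assumption: GPUs are advertised as the extended
                # resource "nvidia.com/gpu" by the NVIDIA device plugin.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


# Hypothetical names, chosen only for the example.
pod = gpu_pod(
    name="trainer",
    image="registry.example.com/trainer:latest",
    gpus=2,
    claim_name="training-data",
)
print(json.dumps(pod, indent=2))
```

The scheduler places such a Pod only on a Worker Node that has the requested number of GPUs free, so GPU capacity is shared between workloads without manual bookkeeping.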
Kaptain 1.1: technical details
Kubernetes: available versions 1.17, 1.18 or 1.19, configured through Rancher Kubernetes Engine (RKE) 1.2
Cloud Native Distributed Block Storage: Longhorn 1.1 or, alternatively, Rook-Ceph 1.5
Web based Cluster Manager/Explorer: Rancher Server 2.5
Bare Metal Load Balancer: MetalLB 0.9.5
Application Catalog: based on the Helm 3 package manager
Supported Client Apps: kubectl, Rancher CLI and Lens