18 Nov 2019

The Open Edge and HPC Initiative, CINECA and E4 Computer Engineering Report the Success of the CODES@OEHI hackathon on 64-bit Arm®v8 based Clusters

The Open Edge and HPC Initiative, CINECA and E4 Computer Engineering SpA are excited to report the success of the CODES@OEHI hackathon organized on Oct 28th/29th at CINECA on 64-bit Arm®v8 based clusters.

cat ./Participating_Institutions
The Open Edge and HPC Initiative
E4 Computer Engineering SpA
Huawei Technologies Duesseldorf GmbH
Jülich Supercomputing Centre

cat ./WHY_CODES@OEHI_hackathon 

In High Performance Computing (HPC), there is a continued need for higher computational performance. Arm-based processor technologies now offer competitive performance and are on a path towards providing even better solutions than what is otherwise available. To leverage these new technologies, it is necessary to familiarize users and code developers with the 64-bit Arm®v8 programming environment and development tools, to ease the porting of applications.

cat ./GOALS_CODES@OEHI_hackathon

The goal of the 2-day hands-on workshop was for current or prospective users to familiarize themselves with the 64-bit Arm®v8 programming environment by porting their applications or by further optimizing already ported applications.

cat ./TESTBEDS_CODES@OEHI_hackathon 

Free access was provided to two 64-bit Arm®v8 based clusters:
• ARMIDA (ARM Infrastructure for the Development of Applications) cluster, 8 dual socket nodes based on Marvell® ThunderX2® processors located at E4 Computer Engineering’s premises
• JUAWEI cluster, 28 dual socket nodes based on the Huawei Hi1616 processors located at Jülich Supercomputing Centre

cat ./TUTOR#1_CODES@OEHI_hackathon 
Ollie Perks, Arm 

I was fortunate to be a technical tutor at the CODES@OEHI hackathon. I really believe that the hackathon was a great opportunity for me to share my knowledge and educate users on the Arm HPC offering, but also to learn from users what they are looking for in terms of their programming and optimization needs. As a tutor, I was surprised by how quickly the participants learned the optimization techniques and applied them to their applications. Codes were ported in a matter of hours, and in some cases minutes, and profiling was quickly done. Optimizing the code was a bit more demanding in terms of time, but the skills and focus of the participants made it possible to achieve a good level of performance in just the first hands-on session. This event also helped identify a few performance issues with the Arm HPC tools that we were able to feed back to the development team, to improve the tools for future users.

cat ./TUTOR#2_CODES@OEHI_hackathon
Phil Ridley, Arm

I was really impressed by the talents of the participants, especially by their in-depth knowledge of the programming tools. All of them knew how to use the development tools and the right framework. The primary objective of CODES@OEHI wasn’t to achieve top performance with the applications but to familiarize the participants with the programming environments and tools. However, the majority of participants were able to obtain performance with their applications at a level which was as good as, or even better than, what they get on the systems available to them in their normal working environment.

cat ./ORGANIZER#1_CODES@OEHI_hackathon
Carlo Cavazzoni, CINECA 

CINECA has a long history of using 64-bit Arm®v8 based clusters. In 2015 CINECA and E4 co-developed the first ARM+GPU cluster within the PRACE-3IP PCP: Whole-System Design for Energy Efficient HPC program. In 2018/2019 CINECA performed extensive tests on the HPC cluster CARMEN (Cineca ARM ENablement), based on 8 dual socket Marvell® ThunderX2® nodes connected via an EDR InfiniBand high-speed switch. Access to the cluster was provided to application developers and users of scientific and engineering workloads (OpenFOAM, VASP, Quantum ESPRESSO, GROMACS, Lattice Boltzmann codes and many others). CINECA's particular interest in 64-bit Arm®v8 based clusters lies in preparing and enabling the transition to exascale of the flagship codes and workflows used by the materials science community, in line with the membership of CINECA in MaX, one of the nine ‘European Centres of Excellence for HPC applications’.

cat ./ORGANIZER#2_CODES@OEHI_hackathon
Cosimo Damiano Gianfreda, E4 Computer Engineering SpA 

E4 Computer Engineering is always at the leading edge of the technology curve, and is honored to have supported the OEHI and CINECA in the hackathon. E4 designed its first Arm-based cluster in 2010 and is currently refreshing its line of products based on 64-bit Arm®v8 processors to add the next generation of the ThunderX family. The invaluable data gathered during the hackathon enables E4 to better define the specs of its products by applying a co-design approach aimed at achieving the optimal configuration for demanding scientific and industrial requirements.

cat ./PARTICIPANT#1_CODES@OEHI_hackathon
Enrico Calore, INFN 

The CODES@OEHI hackathon has been very useful to me for two main reasons: it gave me access to two 64-bit Arm®v8 based machines, and it gave me the opportunity to talk to experts who have in-depth knowledge of this architecture and to get their support in optimizing and running my codes on these systems.

cat ./Marvell_ThunderX2

Marvell® ThunderX2® is the second generation of the company's Arm®v8 based server processors supporting dual socket configurations and optimized to deliver the highest computational performance along with balanced IO connectivity, memory bandwidth and capacity. The Marvell® ThunderX2® processor family is fully compliant with Arm®v8-A architecture specifications and is optimized to drive high computational performance by delivering outstanding bandwidth and memory capacity. These features create an environment that is well suited to run computationally intensive HPC workloads.

