BlackJeans, 10 years of operational weather forecasting.
We are pleased to present to our readers a very interesting article by Raffaele Montella, Assistant Professor of Computer Science at the Department of Science and Technology of the University of Naples “Parthenope”, who tells the story of BlackJeans, an HPC system used for operational weather forecasting.
The story of how the Center for Marine and Atmosphere Monitoring and Modelling (CMMMA, http://meteo.uniparthenope.it, now friendly nicknamed meteo@uniparthenope, like its mobile apps), run by the University of Naples “Parthenope”, became one of the most active and productive Italian local weather prediction facilities and the host of its high-performance computing resources has already been told many times. Let us consider this one the extended version, a sort of classic old movie to watch during a cold winter night beside the fireplace, waiting for the end of a storm front.
Like all stories of this kind, the very beginning is set in a garage, or a basement, or a mix of both, and this one is no exception: it is a pretty cold late afternoon not far from the 1997 Christmas Eve, when a very young version of me, curious, tired, and satisfied with his contribution to the cause, witnesses an event that has become legendary in my memory: the switch-on of my very first Beowulf in a lab of the Center for Parallel Computing and Supercomputing (CPS, part of CNR, the Italian National Research Council).
It was great for me, just a student at the time, to team up with experienced people and stick together 16 Pentium Pro desktops with 100-megabit Ethernet running Linux in order to compute matrix-matrix multiplications in a never-seen-before short amount of time. But, believe me or not, it was nothing compared to what I felt when, about a decade and many megaflops later, BlackJeans ran its first submitted job.
“Imagine a large hall like a theatre, except that the circles and galleries go right round through the space usually occupied by the stage. The walls of this chamber are painted to form a map of the globe. The ceiling represents the north polar regions, England is in the gallery, the tropics in the upper circle, Australia on the dress circle and the Antarctic in the pit.”
These stone-carved words, authored by Lewis Fry Richardson in 1922, are my favorite oopart (out-of-place artifact): the math genius of the early last century is the man who invented numerical weather prediction. Although the jinx messed up his life and academic career, because being a pacifist with a sexual orientation different from what was expected in Great Britain at the beginning of the 1900s was totally, totally complicated, he is my superhero for a single, simple reason: he mentions computers solving equations in parallel in an age far away from digital computation and really close to the steampunk-inspiring Victorian age.
“A myriad computers are at work upon the weather of the part of the map where each sits, but each computer attends only to one equation or part of an equation. The work of each region is coordinated by an official of higher rank. Numerous little “night signs” display the instantaneous values so that neighbouring computers can read them. Each number is thus displayed in three adjacent zones so as to maintain communication to the North and South on the map.”
This makes Richardson’s paper my most beloved source of quotes.
Unfortunately for Lewis, at the time of his writing the “myriad of computers” were just humans dedicated to hand computation, but that was not the case on a Thursday morning in October 2010: the 144 CPU cores and the over 5000 GPU cores (yes, a myriad of computers was finally a real thing to play with) performed their first operational clock cycles. HPC-GPU BlackJeans by E4 Computer Engineering, funded by a generous grant from the Campania Region, was finally up and running, ready to produce operational high-resolution local weather forecasts for the University of Naples “Parthenope” CCMMMA (the extra C, for Campania, was added in honor of the main funding source: Centro Campano per il Monitoraggio e la Modellistica Marina e Atmosferica).
The CCMMMA, or “the Weather Center” as the people involved in it usually refer to it, was the realization of a long-time dream project of prof. Giancarlo Spezie, a world-famous full professor in oceanography and veteran of Antarctic explorations, serving as chief executive officer, and of prof. Giulio Giunta, a full professor in scientific computing, designated as principal investigator. Beside these two granite-strong scientific pillars stood a group of researchers experienced in meteorology, oceanography, computation, and computer science. I was very happy that the Ph.D. thesis I had defended a few years before, about environmental modeling and grid computing techniques, could support the operational workflow for the diverse atmosphere and ocean predictions needed by the Weather Center.
Yes, I’m in! No more tricky lab-made/home-made Beowulf clusters or scavenged computational resources to run my weather/marine prediction workflows as done, sometimes officially, sometimes… not really, since 2001.
BlackJeans is the short name for HPC-GPU BlackJeans because each of its 12 computing nodes is equipped with a shiny new, for the time, NVIDIA Tesla M2050, with 448 unleashed CUDA cores ready to accelerate a pair of 6-core Xeon CPUs (for a total of 144 CPU cores). A 14 TB storage device and a dedicated web server, designed to host the services that make the computed results open to the public, complete the BlackJeans design.
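The arithmetic behind those figures is easy to double-check; this trivial sketch uses only the per-node counts quoted above:

```python
# Back-of-envelope check of the BlackJeans core counts: 12 nodes, each with
# two 6-core Xeons and one NVIDIA Tesla M2050 (448 CUDA cores).
NODES = 12
CPUS_PER_NODE = 2
CORES_PER_CPU = 6
CUDA_CORES_PER_GPU = 448

cpu_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU  # the 144 CPU cores
gpu_cores = NODES * CUDA_CORES_PER_GPU             # the "over 5000" GPU cores

print(cpu_cores, gpu_cores)
```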
Wait: did I mention why the Weather Center's computational baby has been called BlackJeans? Apparently not, but the story within the story is a nice one and dates back to November 2007, when I had the chance to visit the building hosting the Blue Gene/P supercomputer some weeks before its public release. Almost at the same time, I had to buy some computing resources for my lab (High Performance Scientific Computing SmartLab, http://hpsc.uniparthenope.it), but the procurement of my dreams was abruptly limited by insufficient funds to buy a professionally engineered HPC cluster, so, once again, it was commodity off-the-shelf hardware to be assembled. I felt as if I were wearing casual clothes among people in suits, so I named this cluster BlueJeans: Blue (as in Blue Gene) but Jeans, instead of Gene, because of the affordability (of course, I was aware that the expected computing performance was probably not the same). Months later, a new HPC Beowulf was assembled, packed with low-cost NVIDIA GPUs (but supporting CUDA). In honor of the GPU brand color, and following the first cluster's naming convention, the new cluster was named GreenJeans. It is superfluous to tell the story of a third cluster, but its name was RedJeans (and we also had YellowJeans and PinkJeans). Finally, a real HPC unit landed in our facility: it was powerful, shiny, and aesthetically dark as the night. BlackJeans was ready to let us know about tomorrow's weather at an unprecedented operational resolution: up to one square kilometer!
Since its launch in October 2010, the BlackJeans working day has started as soon as the NCEP Global Forecast System (GFS) initial and boundary conditions are available for download. Yes, about ten years have passed, but BlackJeans is still at work, minimally aged, and even more powerful thanks to an upgrade plan combined with periodic maintenance actions, fully supported by the University technical team and, above all, the E4 engineers.
Once the GFS data are downloaded, the computation workflow, based on the DagOnStar engine (http://github.com/dagonstar), begins by performing the data pre-processing needed by the Weather Research and Forecasting (WRF) model. WRF is a next-generation mesoscale numerical weather prediction system used worldwide, designed for both atmospheric research and operational forecasting applications. It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility. The model serves a wide range of meteorological applications across scales from tens of meters to thousands of kilometers (https://www.mmm.ucar.edu/weather-research-and-forecasting-model). Our current model implementation is designed to reach a ground resolution of 1 km over southern Italy (d03WRF), while 5 km is the resolution over Italy and the surrounding seas (d02WRF), and 25 km over the Euro-Mediterranean area (d01WRF). BlackJeans performs one run per day with data initialized at 00 Zulu, forecasting the next 168 hours.
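The telescopic nesting described above can be sketched as follows. This is our illustrative snippet, not the actual WRF namelist: only the three domain names and ground resolutions come from the text, everything else is an assumption.

```python
# Hypothetical sketch of the three telescopic WRF domains described in the
# article. In practice these live in a WRF namelist; here we only verify
# that each nest refines its parent grid by an integer ratio.
DOMAINS = [
    {"id": "d01WRF", "area": "Euro-Mediterranean",        "dx_km": 25},
    {"id": "d02WRF", "area": "Italy and surrounding seas", "dx_km": 5},
    {"id": "d03WRF", "area": "southern Italy",             "dx_km": 1},
]

def nesting_ratios(domains):
    """Parent-to-child grid ratio between consecutive telescopic domains."""
    return [p["dx_km"] // c["dx_km"] for p, c in zip(domains, domains[1:])]

print(nesting_ratios(DOMAINS))  # each nest refines the parent grid 5x
```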
Every 24 simulated hours, the results are published and made available to the users or to coupled models. We call this configuration the “streaming forecast”. Every day, the forecast is pushed forward by 24 hours, while the simulation of the previous 24 hours is repeated using the re-analysis as initial and boundary conditions. The WRF results are moved as-is to the high-performance storage intimately connected to the web server, alongside a more processed output projected onto a regular latitude/longitude grid and enriched with diagnostic variables.
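A minimal sketch of the “streaming forecast” windowing: each daily run is initialized at 00 Zulu, re-simulates the previous 24 hours from re-analysis, and forecasts the next 168 hours. The function name and structure are ours, not part of the operational workflow.

```python
# Toy illustration of the daily "streaming forecast" window described above.
from datetime import datetime, timedelta, timezone

def forecast_window(run_date):
    """Return (hindcast_start, forecast_start, forecast_end) for one daily run."""
    t0 = run_date.replace(hour=0, minute=0, second=0, microsecond=0)  # 00 Zulu
    hindcast_start = t0 - timedelta(hours=24)   # redone with re-analysis data
    forecast_end = t0 + timedelta(hours=168)    # 7-day forecast horizon
    return hindcast_start, t0, forecast_end

run = datetime(2010, 10, 7, tzinfo=timezone.utc)  # a Thursday in October 2010
hindcast, t0, t_end = forecast_window(run)
print(hindcast, t0, t_end)
```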
The WRF daily results are used to feed, or “force” as better said in computational meteorology nerd lingo, other models in order to produce marine and air quality predictions. At the time of writing, we are reconfiguring the air quality models, while the wind-driven sea wave predictions and coastal marine dynamics are produced in a routine fashion.
WaveWatch III (WW3) is a third-generation wave model developed at NOAA/NCEP in the spirit of the WAM model. It is a further development of the model WaveWatch, as developed at the Delft University of Technology, and WaveWatch II developed at NASA, Goddard Space Flight Center. WaveWatch III, however, differs from its predecessors in many important points such as the governing equations, the model structure, the numerical methods, and the physical parameterizations. Starting from the latest versions, WaveWatch III is evolving from a wave model into a wave modeling framework, which allows for easy development of additional physical and numerical approaches to wave modeling (https://polar.ncep.noaa.gov/waves/wavewatch/).
As already done with WRF, we use three telescopic computing grids: the ground resolution over the Mediterranean area (d01WW3) is 9 km, the resolution for the seas surrounding the Italian peninsula (d02WW3) is 3 km, while the central and southern Tyrrhenian Sea, east sector (d03WW3), is covered at a resolution of 1 km. The data produced by the WW3 daily forecast are saved on the fast-access storage and offered to the users through diverse and different services.
As with WW3, the Regional Ocean Modeling System (ROMS) is offline-coupled with WRF for wind friction and fed with initial and boundary conditions produced by the Copernicus European Project. ROMS is a free-surface, terrain-following, primitive-equations ocean model widely used by the scientific community for a diverse range of applications. It includes several vertical mixing schemes, multiple levels of nesting, and composed grids (https://www.myroms.org). In our operational workflow, the configuration of ROMS differs from WRF and WW3. Because of our computational limitations, we implemented the sole d03ROMS domain, covering, at the incredible resolution of about 160 meters (yes, the cells are about 160 m x 160 m), the geographic area between the southern Lazio Region and the northern Calabria Region, spanning more than 100 x 50 nautical miles. The three-dimensional predictions of sea currents, water temperature, water salinity, and sea surface height, about 2.5 TB of data per month, are stored for science and engineering usage and to force more coupled models.
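The figures above imply a horizontal grid in the order of a thousand cells per side. This is our own back-of-envelope estimate from the quoted ~160 m resolution and ~100 x 50 nautical mile extent, not the model's actual grid dimensions:

```python
# Rough estimate of the d03ROMS horizontal grid implied by the article's
# numbers: ~160 m cells over an area of more than 100 x 50 nautical miles.
NM_M = 1852          # one nautical mile in meters
DX_M = 160           # ~160 m horizontal resolution

cells_x = round(100 * NM_M / DX_M)  # along the ~100 nmi axis
cells_y = round(50 * NM_M / DX_M)   # along the ~50 nmi axis

print(cells_x, cells_y, cells_x * cells_y)
```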
The WRF, WW3, and ROMS models generate what we call “first-level” prediction products. The workflow running on BlackJeans also orchestrates the execution of the models carrying out the “second-level” products: the forecasts of weather, sea waves, sea currents, and other conditions are used to feed specific applications.
One of the longest-lasting projects is MytiluSE (Modelling mytilus farming System with Enhanced web technologies), funded by the Campania Region, Veterinary Sector, and targeted at predicting the concentration of pollutants surrounding the mytilus farming areas, giving experts a tool to estimate the potential risks to human health. The ROMS model outputs are used to feed a brand new pollutant transport and diffusion model we named WaComM (Water Community Model), alongside the Campania Region coastal pollution emission sources database. WaComM is a three-dimensional Lagrangian model, designed and implemented as an evolution of the Lagrangian Assessment for Marine Pollution 3D (LAMP3D). It is used to compute the transport and diffusion of pollutants for assessing the water quality for mussel farming and fish breeding. In WaComM, several basic algorithms have been optimized and, in order to improve its performance in a High-Performance Computing environment, features like restarting and parallelization techniques in shared-memory environments have been added. In the following, we describe the underlying mathematical model.
Pollutants are modeled as inert Lagrangian particles. No interactions with other particles or feedback are included in the model. Each pollution source is defined by a geographic location in terms of longitude, latitude, and depth, the total amount of Lagrangian particles released in one hour, and an emission profile that can be set statically or change during the simulation (https://github.com/CCMMMA/wacomm). The WaComM system can be used in several ways: as a decision support tool, to aid in the selection of the most suitable areas for deploying farming activities, or in an ex-post fashion in order to achieve better management of offshore activities.
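The Lagrangian transport-and-diffusion idea can be illustrated with a toy snippet: inert particles released at a source, advected by a (here constant) current, plus a Gaussian random-walk diffusive step. This is our own minimal sketch, not WaComM's actual numerics, sources, or coefficients.

```python
# Toy 3D Lagrangian transport-and-diffusion sketch: inert particles,
# no particle-particle interaction, advection + random-walk diffusion.
import numpy as np

def step(positions, current, diffusivity, dt, rng):
    """Advance particle positions (N, 3) by advection plus Gaussian diffusion."""
    advection = current * dt
    diffusion = rng.normal(0.0, np.sqrt(2.0 * diffusivity * dt), positions.shape)
    return positions + advection + diffusion

rng = np.random.default_rng(42)
# Release 1000 inert particles at one source point (x, y, z) = (0, 0, -1 m).
particles = np.tile(np.array([0.0, 0.0, -1.0]), (1000, 1))
current = np.array([0.1, 0.0, 0.0])  # 0.1 m/s eastward, no vertical motion

for _ in range(60):                  # one hour in 60-second steps
    particles = step(particles, current, diffusivity=0.5, dt=60.0, rng=rng)

print(particles.mean(axis=0))        # the cloud's center drifts ~360 m east
```

The diffusive step spreads the cloud while the mean position follows the current, which is the mechanism that lets such a model map concentration plumes downstream of a pollution source.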
Other “second-level” products are (or have been): marine ingression modeling, in order to evaluate the water run-up in extreme weather events and produce early warnings, and a smoke tracer model able to evaluate the effects of a wildfire on air pollution.
BlackJeans is restless: in a decade, its downtime has been negligible. At the time of writing, the computing nodes number 22, totaling 360 CPU cores managed using different queues in order to accommodate the diverse technologies involved. The 12 GPUs hosted on the original 12 nodes can be shared using GVirtuS (http://github.com/gvirtus) in order to have transparently virtualized and remoted GPUs on the other nodes. The 40 Gb/s InfiniBand is still doing its job perfectly. The data produced by the computational workflow are accessible as open resources using REST APIs (https://api.meteo.uniparthenope.it), an OPeNDAP server (https://data.meteo.uniparthenope.it/opendap/opendap), WMS maps (http://data.meteo.uniparthenope.it/ncWMS2/), the web portal (https://meteo.uniparthenope.it), or mobile applications for both iOS (https://apps.apple.com/it/app/meteo-uniparthenope/id1518001997) and Android (https://play.google.com/store/apps/details?id=it.uniparthenope.meteo).
We produce about 5 TB of data per month. The latest three months of data are accommodated in fast-access storage directly connected to the web server, while long-term storage is ensured by a storage sub-cluster leveraging GlusterFS, offering up to 140 TB at the time of writing. In the last ten years, BlackJeans has computed fine and accurate weather and marine predictions for some of the most challenging sporting events hosted in the Campania Region: the America's Cup World Series 2012 and 2013, the Italian Olympic Classes Championship 2015, the National ORC Championship 2019, and the 30th Universiade in 2019. And this is just a shortlist!
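Those storage figures also tell us roughly how much history the archive can hold. The retention estimate below is our own rough arithmetic from the numbers quoted above:

```python
# Back-of-envelope reading of the storage figures: ~5 TB of products per month,
# the latest 3 months kept on fast storage, up to 140 TB on the GlusterFS
# sub-cluster.
TB_PER_MONTH = 5
FAST_MONTHS = 3
GLUSTER_TB = 140

fast_storage_tb = TB_PER_MONTH * FAST_MONTHS  # hot data on the web server side
archive_months = GLUSTER_TB // TB_PER_MONTH   # months of history the archive fits

print(fast_storage_tb, archive_months)
```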
“In a neighbouring building there is a research department, where they invent improvements. But there is much experimenting on a small scale before any change is made in the complex routine of the computing theatre. In a basement an enthusiast is observing eddies in the liquid lining of a huge spinning bowl, but so far the arithmetic proves the better way. In another building are all the usual financial, correspondence and administrative offices. Outside are playing fields, houses, mountains and lakes, for it was thought that those who compute the weather should breathe of it freely.”
Again, the words of Lewis Fry Richardson sound prophetic: in a world where information is power and data is the gold coin of the realm, the CMMMA (the Weather Center, meteo@uniparthenope), now led by professor Giorgio Budillon, full professor in oceanography, veteran of Antarctic explorations, and director of the Department of Science and Technologies, has accepted a new challenge. Big data and machine learning techniques are playing a major role in weather/marine/pollution predictions.
BlackJeans' operational duty now covers the computational workflow for predictions as well as the processing, management, and storage of instrument-acquired data (two mid-range X-band weather radars, for example), while the scientific research on these hot topics is performed by the interdisciplinary team of the Neptun-IA Lab using the just one-year-old HPC-GPU-DNN PurpleJeans. But this is another story, about other “magnifiche sorti e progressive” (“magnificent and progressive destinies”)…
About the Author
Raffaele Montella has worked as an assistant professor, with tenure, in Computer Science at the Department of Science and Technologies, University of Naples “Parthenope”, Italy, since 2005. He got his degree (MSc equivalent) in (Marine) Environmental Science at the University of Naples “Parthenope” in 1998, defending a thesis on the “Development of a GIS system for marine applications”, scoring “cum laude” with an award mention for his study career. He earned his Ph.D. in Marine Science and Engineering at the University of Naples “Federico II”, defending a thesis on “Environmental modeling and Grid Computing techniques”.
His main research topics and scientific production focus on tools for high-performance computing, such as grid, cloud, and GPU computing, with applications in the field of computational environmental science (multidimensional big data/distributed computing for modeling, scientific workflows, and science gateways), leveraging his previous (and still ongoing) experience in embedded/mobile/wearable/pervasive computing and the Internet of Things. He joined the CI/RDCEP of the University of Chicago as a Visiting Scholar and as a Visiting Assistant Professor working on the FACE-IT project. He leads the IT infrastructure of the University of Naples “Parthenope” Centre for Marine and Atmosphere Monitoring and Modelling (CMMMA). He technically led the University of Naples “Parthenope” research unit of the European Project “Heterogeneous secure multi-level Remote Acceleration service for low-Power Integrated systems and Devices (RAPID)”, focusing on the development and integration of GVirtuS (General-purpose Virtualization Service), enabling CUDA kernel execution on mobile and embedded devices. He leads the locally funded project “Modelling mytilus farming System with Enhanced web technologies (MytiluSE)”, focused on high-performance-computing-based coupled simulations for mussel farm food quality prediction and assessment for human gastric disease mitigation. He leads the research prototype project “DYNAMO: Distributed leisure Yacht-carried sensor-Network for Atmosphere and Marine data crOwdsourcing applications”, targeting coastal marine data gathering as crowdsourcing for environmental protection, development, and management. He leads the UNP unit of the Erasmus+ Project “Framework for Gamified Programming Education” (FGPE). He leads the UNP/CINI unit of the Euro-HPC H2020 project “Adaptive multi-tier intelligent data manager for Exascale” (ADMIRE).
References
Skamarock, William C., Joseph B. Klemp, and Jimy Dudhia. “Prototypes for the WRF (Weather Research and Forecasting) model.” In Preprints, Ninth Conf. on Mesoscale Processes, J11–J15. Amer. Meteorol. Soc., Fort Lauderdale, FL, 2001.
Tolman, Hendrik L. “Distributed-memory concepts in the wave model WAVEWATCH III.” Parallel Computing 28, no. 1 (2002): 35-52.
Wilkin, John L., Hernan G. Arango, Dale B. Haidvogel, C. Sage Lichtenwalner, Scott M. Glenn, and Katherine S. Hedström. “A regional ocean modeling system for the Long‐term Ecosystem Observatory.” Journal of Geophysical Research: Oceans 110, no. C6 (2005).
Montella, Raffaele, Diana Di Luccio, Pasquale Troiano, Angelo Riccio, Alison Brizius, and Ian Foster. “WaComM: A parallel Water quality Community Model for pollutant transport and dispersion operational predictions.” In 2016 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), pp. 717-724. IEEE, 2016.
Sánchez-Gallegos, Dante D., Diana Di Luccio, J. L. Gonzalez-Compean, and Raffaele Montella. “A Microservice-Based Building Block Approach for Scientific Workflow Engines: Processing Large Data Volumes with DagOnStar.” In 2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), pp. 368-375. IEEE, 2019.
Laccetti, Giuliano, Raffaele Montella, Carlo Palmieri, and Valentina Pelliccia. “The high performance internet of things: using GVirtuS to share high-end GPUs with ARM based cluster computing nodes.” In International Conference on Parallel Processing and Applied Mathematics, pp. 734-744. Springer, Berlin, Heidelberg, 2013.
Di Luccio, Diana, Guido Benassai, Giorgio Budillon, Luigi Mucerino, Raffaele Montella, and Eugenio Pugliese Carratelli. “Wave run-up prediction and observation in a micro-tidal beach.” Natural Hazards & Earth System Sciences 18, no. 11 (2018).
Montella, Raffaele, Diana Di Luccio, Angelo Ciaramella, and Ian Foster. “StormSeeker: A Machine-Learning-Based Mediterranean Storm Tracer.” In International Conference on Internet and Distributed Computing Systems, pp. 444-456. Springer, Cham, 2019.
Di Luccio, D., G. Benassai, L. Mucerino, R. Montella, F. Conversano, G. Pugliano, …, and G. Budillon. “Characterization of beach run-up patterns in Bagnoli bay during ABBACO project.” Chemistry and Ecology 36, no. 6 (2020): 619-636.
Di Luccio, D., G. Benassai, M. De Stefano, and R. Montella. “Evidences of atmospheric pressure drop and sea level alteration in the Ligurian Sea.” In IMEKO TC-19 International Workshop on Metrology for the Sea, Genoa, Italy, October 3-5, 2019.