An advisory board for the new high-performance computing hub will make decision-making a collective experience, says Dr. Vladimir Florinski.
Michael Mercier / UAH
The University of Alabama in Huntsville (UAH), a part of the University of Alabama System, will become the Alabama hub for statewide high-performance computing (HPC) under a nearly $1 million two-year National Science Foundation (NSF) grant.
“This funding will allow us to establish a high-performance computing facility to serve a consortium of 10 Alabama universities, both public and private,” says Dr. Vladimir Florinski, the principal investigator for the effort that attracted $972,261 from NSF’s Established Program to Stimulate Competitive Research (EPSCoR).
The proposal leveraged the UAH-managed NSF EPSCoR Track-1 Future Technologies enabled by Plasma Processes (FTPP) and Connecting the Plasma Universe to Plasma Technology in Alabama grants to develop the HPC consortium and relate it to statewide FTPP needs for HPC support.
“As the host site, UAH will ultimately have control over the hardware design, use policy and resource allocation,” says Dr. Florinski, a professor of space science and researcher at UAH’s Center for Space Plasma and Aeronomic Research (CSPAR), which is one of the most prolific HPC users on campus.
“However, we plan to invite representatives from each participating institution to serve on an advisory board, thus making decision-making a collective experience,” he says. “As a result, UAH will forge closer ties with the other state colleges that would enable more collaborations on science and engineering projects involving computer modeling or data analysis.”
The Advisory Board will meet remotely every semester and make recommendations on how to share computing and storage resources among the numerous research entities. Resource distribution could be based on groups, projects or individual users.
“Faculty, researchers and students from every participating university will be able to apply for accounts and run their applications on the system,” says Dr. Florinski. “A central web portal will be created to simplify account applications.”
Eighty percent of the grant money will purchase a new HPC cluster that will dramatically improve the computing capabilities of the consortium, he says. Once deployed in the server room in Cramer Research Hall, the system will consist of three to four racks densely packed with individual servers called nodes. Each node will have 64-128 central processing unit cores, for a total of about 3,000 cores for the entire system.
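The per-node and total core figures above imply a rough node count, which the short Python sketch below works out. It is a back-of-the-envelope illustration only: the constants come from the figures quoted in this article, the function name `node_count_range` is a hypothetical helper, and the actual hardware configuration may differ.

```python
import math

# Figures quoted in the article; the total is approximate ("about 3,000 cores").
TOTAL_CORES = 3000
CORES_PER_NODE = (64, 128)  # stated per-node range (low, high)


def node_count_range(total_cores, cores_per_node):
    """Return (fewest, most) nodes needed to reach total_cores.

    Hypothetical helper: assumes every node has the same core count,
    which the article does not state.
    """
    low_cores, high_cores = cores_per_node
    fewest = math.ceil(total_cores / high_cores)  # all nodes at 128 cores
    most = math.ceil(total_cores / low_cores)     # all nodes at 64 cores
    return fewest, most


fewest, most = node_count_range(TOTAL_CORES, CORES_PER_NODE)
print(f"Implied node count: roughly {fewest} to {most}")
```

Under these assumptions the system would hold somewhere between two dozen and just under 50 nodes, which is consistent with three to four densely packed racks.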
“The real power of the machine will come from its graphics processor subsystem, consisting of 20-24 Nvidia Ampere units with a total of some 160,000 CUDA cores,” Dr. Florinski says. “The theoretical maximum double-precision performance will be in the range of 240-360 teraflops.”
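To put the graphics processor figures in per-unit terms, the Python sketch below divides the quoted totals by the quoted unit counts. It is an illustrative calculation, not a vendor specification: the constants are taken from the quote above, `per_unit_range` is a hypothetical helper, and pairing the lowest total with the highest unit count (and vice versa) is an assumption that gives the widest possible bounds.

```python
# Figures quoted in the article.
GPU_UNITS = (20, 24)        # number of Nvidia Ampere units (low, high)
TOTAL_CUDA_CORES = 160_000  # "some 160,000 CUDA cores" (approximate)
PEAK_TFLOPS = (240, 360)    # theoretical double-precision range (low, high)


def per_unit_range(total_low, total_high, units_low, units_high):
    """Return the widest (low, high) per-unit bounds.

    Hypothetical helper: the lowest value pairs the smallest total with the
    largest unit count; the highest pairs the largest total with the smallest.
    """
    return total_low / units_high, total_high / units_low


cores_low, cores_high = per_unit_range(TOTAL_CUDA_CORES, TOTAL_CUDA_CORES, *GPU_UNITS)
tf_low, tf_high = per_unit_range(*PEAK_TFLOPS, *GPU_UNITS)

print(f"CUDA cores per unit: about {cores_low:,.0f} to {cores_high:,.0f}")
print(f"Peak double-precision teraflops per unit: about {tf_low:.0f} to {tf_high:.0f}")
```

These are only the bounds implied by the quoted totals; the real per-unit numbers depend on which Ampere model is purchased.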
Users across the state will connect to the new flagship HPC cluster remotely using a secure shell protocol. The project will mostly rely on existing network connections between the sites. CSPAR information technology personnel will be responsible for operation and maintenance of the new HPC system.
Prior to bringing the new facility online, a series of network bandwidth tests will be performed to determine connection speeds between the hub and its users. UAH’s Office of the Vice President for Research and Economic Development and CSPAR personnel will work with UAH’s Office of Information Technology at each campus to optimize routing to achieve the best throughput possible.
Before other users are given access, resource-sharing policies and a support structure must be implemented and tested.
“The system will be integrated into a national federation network allowing resource sharing with users outside the state, as required by NSF’s policy,” says Dr. Florinski. “As the system enters production, my role will shift to helping the participating institutions – especially those that serve minority groups – to develop and improve their HPC capabilities to take full advantage of this important resource.”
He says the success of the project depends on the existence of regional fiber optic networks.
“This includes the Alabama Research and Education Network and also the University of Alabama System Regional Optical Network and Southern Crossroads of the Georgia Institute of Technology,” he says.
UAH’s College of Science and College of Engineering teamed in the grant proposal. Departments providing a list of projects that would benefit their research include Biological Sciences, Chemical and Materials Engineering, Computer Science, Mechanical and Aerospace Engineering, and Space Science.
For CSPAR, the new system will enable much larger physics-based numerical simulations, ranging from space weather forecasting to cosmic ray science, that previously could be run only at national-level supercomputing facilities.
“We hope, naturally, to attract HPC users from other departments as well,” Dr. Florinski says. “In addition to that, the Department of Space Science plans to introduce a graduate certificate in computational physics that will involve the new system in an educational capacity. This offers an opportunity to collaborate with the departments of Computer Science and Electrical and Computer Engineering on the new curriculum.”
UAH is partnering with several existing programs to introduce graduate and undergraduate students to HPC in the context of their summer research projects, he says.
“We also plan to reach out to current and potential HPC users across the state by offering a series of webinars focusing on certain advanced topics, such as the use of graphics processing units in a distributed memory environment.”