HPCSYSPROS20

SIGHPC Systems Professionals Workshop

HPCSYSPROS20

Friday November 13, 2020

10 AM to 2 PM EST

Atlanta, GA (virtual)

Held in conjunction with

and in cooperation with

Quick Information

Supercomputing systems present complex challenges to personnel who design, deploy and maintain these systems. Standing up these systems and keeping them running require novel solutions that are unique to high performance computing. The success of any supercomputing center depends on stable and reliable systems, and HPC Systems Professionals are crucial to that success.

The Fifth Annual HPC Systems Professionals Workshop will bring together systems administrators, systems architects, and systems analysts to share best practices, discuss cutting-edge technologies, and advance the state of the practice for HPC systems. This CFP requests that participants submit papers, slide presentations, or 5-minute Lightning Talk proposals, along with reproducible artifacts (code segments, test suites, configuration management templates) that can be made available to the community for use.

Keynote Speaker

Atsuya Uno will present “Introduction of Supercomputer Fugaku”.

Fugaku is the most powerful supercomputer currently deployed, ranking #1 on the TOP500 list as of June 2020. It is powered by Fujitsu’s 48-core A64FX SoC, making it the first number one system on the list to be powered by ARM processors. In single or further reduced precision, which is often used in machine learning and AI applications, Fugaku’s peak performance is over 1,000 petaflops (1 exaflops). The new system is installed at the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan.

Dr. Atsuya Uno is unit leader of the System Operations and Development Unit in the Operations and Computer Technologies Division at the RIKEN Center for Computational Science (R-CCS). He joined RIKEN in 2007 and was involved in the development and operation of the K computer. At R-CCS, he is responsible for the operation of the supercomputer Fugaku.

Schedule

Check out our proceedings.

All times in Eastern Standard Time.

Start	End	Description
10:00 AM	10:05 AM	Welcome
10:05 AM	10:40 AM	Keynote: Introduction of Supercomputer Fugaku, Atsuya Uno, RIKEN Center for Computational Science
10:40 AM	10:50 AM	Site Report: NVIDIA, Adam DeConinck, NVIDIA
10:50 AM	11:00 AM	Site Report: MARCC, Jaime Combariza, Johns Hopkins University
11:00 AM	11:10 AM	Site Report: NREL, Matt Bidwell, National Renewable Energy Laboratory
11:10 AM	11:20 AM	Site Report: INL, Ben Nickell, Idaho National Laboratory
11:20 AM	11:25 AM	Lightning Talk: Setup and management of a small national computational facility: What we’ve learned the first 10 years, George Tsouloupas, Thekla Loizou, Panayiotis Vorkas, The Cyprus Institute
11:25 AM	11:30 AM	Lightning Talk: Case Study of TCP/IP tunings for High Performance Interconnects, Jenett Tillotson, National Center for Atmospheric Research
11:30 AM	11:35 AM	Lightning Talk: NGC Container Environment Modules, Scott McMillan, NVIDIA
11:35 AM	11:50 AM	Break
11:50 AM	12:10 PM	Paper: Application Performance in the Frontera Acceptance Process, Richard Todd Evans, Texas Advanced Computing Center, University of Texas at Austin
12:10 PM	12:30 PM	Paper: Parallelized Data Replication of Multi-Petabyte Storage Systems, Honwai Leong, Daniel Richards, Andrew Janke, Stephen Kolmann, Data Direct Networks and The University of Sydney
12:30 PM	12:50 PM	Paper: Log-Based Identification, Classification, and Behavior Prediction of HPC Applications, Ryan D. Lewis, Zhengchun Liu, Rajkumar Kettimuthu, Michael E. Papka, Northern Illinois University and Argonne National Laboratory
12:50 PM	1:10 PM	Paper: Modernizing the HPC System Software Stack, Benjamin S. Allen, Matthew A. Ezell, Paul Peltz, Douglas Jacobsen, Cory Lueninghoener, J. Lowell Wofford, Eric Roman, Argonne National Laboratory, Oak Ridge National Laboratory, Lawrence Berkeley National Laboratory, and Los Alamos National Laboratory
1:10 PM	1:50 PM	Panel: Cluster Management, Reese Baird (OpenHPC), Andrew Bruno (University at Buffalo), Erik Jacobson (HPE)
1:50 PM	1:55 PM	Traxler Family Award for Community Service
1:55 PM	2:00 PM	Closing Remarks

Topics of Interest

Here are some topics of interest for this group. Note that these are here to indicate direction, not to disallow other related topics.

Cluster, configuration, or software management
Performance tuning/Benchmarking
Resource manager and job scheduler configuration
Monitoring/Mean-time-to-failure/ROI/Resource utilization
Virtualization/Clouds
Designing and troubleshooting HPC interconnects
Designing and maintaining HPC storage solutions
Cybersecurity and data protection
Cluster storage

Example paper ideas might be:

Best practices for job scheduler configuration
Advantages of cluster automation
Managing software on HPC clusters

Calendar

Event	Date
Workshop Submissions Open	April 29, 2020
Workshop Submissions Close	September 9, 2020
Reviews Sent	September 18, 2020
Acceptance Notification	October 2, 2020

Organizing Committee

Position	Name	Affiliation
Workshop Chair	Matt Bidwell	National Renewable Energy Laboratory
Program Chair	Gary Jackson	Johns Hopkins University Applied Physics Laboratory
Organizing Committee
	John Blaas	National Center for Atmospheric Research
	David Clifton	Ansys
	Stephen Lien Harrell	Texas Advanced Computing Center
	John Legato	National Institutes of Health
	Kurt Maier	Pacific Northwest National Laboratory
	William Scullin	Laboratory for Laser Energetics
	Jenett Tillotson	National Center for Atmospheric Research

Program Committee

Name	Affiliation
Jonathon Anderson	University of Colorado Boulder
Adam DeConinck	NVIDIA
Violeta Holmes	University of Huddersfield
Gary Jackson	Johns Hopkins University, Applied Physics Laboratory
John Legato	National Institutes of Health
Ti Leggett	Argonne National Laboratory
Hon Wai Leong	DDN
Scott McMillan	NVIDIA
Ken Schmidt	Pacific Northwest National Laboratory
William Scullin	University of Rochester, Laboratory for Laser Energetics
Jenett Tillotson	National Center for Atmospheric Research

Publication Information

All accepted papers and artifacts will be published on GitHub and archived with a DOI in Zenodo. You can view last year’s accepted papers here: HPCSYSPROS SC19 Workshop Proceedings.

Contact Information

If you need to contact us, send email to SIGHPC SYSPROS.
