Quick Information
Supercomputing systems present complex challenges to personnel who design, deploy and maintain these systems. Standing up these systems and keeping them running require novel solutions that are unique to high performance computing. The success of any supercomputing center depends on stable and reliable systems, and HPC Systems Professionals are crucial to that success.
The Eighth Annual HPC Systems Professionals Workshop will bring together systems administrators, systems architects, and systems analysts to share best practices, discuss cutting-edge technologies, and advance the state of the practice for HPC systems. This CFP requests that participants submit papers, slide presentations, or 5-minute Lightning Talk proposals. Additionally, reproducible artifacts (code segments, test suites, configuration management templates) that can be made available to the community are welcome, either as standalone submissions or alongside a paper or talk submission.
Social Event
SIGHPC SYSPROS and the CaRCC Systems Facing group will host a social event from 6 to 8 PM on Monday, November 18th, at Stats Brewpub, 300 Marietta St NW, Suite 101, Atlanta, GA. Join us for some beers, banter, and sharing of war stories with other HPC systems professionals.
Schedule
All times in Eastern Time
Start | End | Description |
---|---|---|
8:30 AM | 8:45 AM | Opening Remarks, Jared Baker (NCAR), John Blaas (Lambda) |
8:45 AM | 9:00 AM | Advancing ODA Standardization Through An Open Source Dashboard, Tim Osborne (ORNL), Rachel Palumbo (ORNL), Leah Huk (ORNL), Ryan Adamson (ORNL), Rob Jones (ORNL), Corwin Lester (ORNL) |
9:00 AM | 9:15 AM | Next-Gen HPC Status Viz: Powering Up Node Status with Next.js and the Slurm API, Johnathan Lee (Arizona State University), Jason Yalim (Arizona State University) |
9:15 AM | 10:00 AM | Beyond the Hype: Uncovering the Real I/O Needs of LLMs, Kartik Subramanian (VAST Data Inc), Glenn K. Lockwood (Microsoft Corporation) |
10:00 AM | 10:30 AM | Morning Coffee Break |
10:30 AM | 10:40 AM | Benchmarking and Continuous Performance Monitoring of HPC Resources using the XDMoD Application Kernel Module, Nikolay A. Simakov (State University of New York at Buffalo) |
10:40 AM | 10:50 AM | Kubernetes Resource Scaling via Batch Node Conversion on the Anvil Supercomputer, Erik Gough (Purdue University), Dashiell Lumas (Purdue University) |
10:50 AM | 11:00 AM | Increasing effective storage capacity with hierarchical storage management (HSM) for NCAR’s Campaign Storage, Aric Werner (NCAR), Joseph Mendoza (NCAR), Ben Kirk (NCAR) |
11:00 AM | 11:15 AM | SStack: Software Stacks for easier and cleaner software builds on HPC, Strahinja Trecakov (New Mexico State University), Nicholas Von Wolff (New Mexico State University), Mohammad Al-Tahat (New Mexico State University) |
11:15 AM | 11:30 AM | Cluster Resource Management for Sustainable and Efficient Computing, Andrei Bersatti (Georgia Institute of Technology), Aaron Jezghani (Georgia Institute of Technology) |
11:30 AM | 11:45 AM | Dynamic Login Node Resource Control and Monitoring with Arbiter 3, Jackson McKay (University of Utah), Kai Forrest (University of Utah), Paul Fischer (University of Utah) |
11:45 AM | 12:00 PM | Chapter Updates and Closing Remarks, Jared Baker (NCAR), John Blaas (Lambda) |
Topics of Interest
Here are some topics of interest for this group. Note that these are here to indicate direction, not to disallow other related topics.
- Cluster, configuration, or software management
- Cybersecurity and data protection
- Performance tuning/Benchmarking
- Resource manager and job scheduler configuration
- Monitoring/Mean-time-to-failure/ROI/Resource utilization
- HPC storage solutions
- High-speed/low-latency networking
- Composable infrastructure and containers
- Elastic workloads or optimizations for workload types
- Web-based cluster front ends
- Challenges with AI workloads (GPU management, Interconnect, Data Movement)
Example paper ideas might be:
- Best practices for job scheduler configuration
- Advantages of cluster automation
- Managing software on HPC clusters
Calendar
Event | Date |
---|---|
Workshop Submissions Open | May 28, 2024 |
Workshop Submissions Close | August 9, 2024 |
Reviews Sent and Resubmissions Open | August 23, 2024 |
Resubmissions Close | August 30, 2024 |
Final Program Notifications | September 13, 2024 |
Organizing Committee
Position | Name | Affiliation |
---|---|---|
Workshop Chair | John Blaas | Lambda |
Program Chair | Jared Baker | NCAR |
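Organizing Committee | Blaise Hartman | NASA |
Organizing Committee | Betsy Hillery | Purdue University |
Organizing Committee | David Clifton | Ansys |
Organizing Committee | Hon Wai Leong | DDN |
Organizing Committee | John Legato | NIH |
Organizing Committee | Michael Hartman | Stanford University |
Organizing Committee | Gary Skouson | Penn State University |
Organizing Committee | Kurt Maier | PNNL |
Organizing Committee | Stephen Fralich | Boeing |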
Program Committee
Name | Affiliation |
---|---|
Emma Shaub | NCAR |
Dori Sajdak | SUNY at Buffalo |
Ben Nickell | INL |
Wyatt Madej | NCSA |
Sam Liston | University of Utah |
Josh LeVoir | Lambda |
Hon Wai Leong | DDN |
Jeremy Fischer | Indiana University |
Eric Coulter | Georgia Institute of Technology |
Publication Information
All accepted papers and artifacts will be published on GitHub and archived with a DOI in Zenodo. You can view the previous year's presentations in the HPCSYSPROS SC23 Workshop Proceedings.
Contact Information
If you need to contact us, send email to SIGHPC SYSPROS.
Links
- SC HPC Syspros Mailing List - you should join!
- Join our SIGHPC SYSPROS Slack team
- Email us with any questions