Quick Information
To meet the demands of high-performance computing (HPC) researchers, large-scale computational and storage machines require many staff members to design, install, and maintain them. These HPC systems professionals include system engineers, system administrators, network administrators, storage administrators, and operations staff, who face problems that are unique to high-performance computing systems. While many conferences exist for the HPC field and the system administration field, none focus on the needs of HPC systems professionals. Support resources for the issues encountered in this specialized field can be difficult to find. Systems staff often turn to the community as a support resource, so opportunities to strengthen and grow those relationships are highly beneficial.
This Symposium is designed to share solutions to common problems, provide a platform to discuss upcoming technologies, and present state-of-the-practice techniques so that HPC centers get a better return on their investment, systems become more performant and reliable, and researchers are more productive. Additionally, this Symposium is affiliated with the systems professionals' chapter of the ACM SIGHPC (SIGHPC SYSPROS Virtual ACM Chapter). This session will serve as an opportunity for chapter members to meet face-to-face, discuss the chapter's yearly workshop held at SC, and continue building our community's shared knowledge base. For more information, check out our upcoming activities.
Invited Talks
This half-day workshop combines invited case-study presentations with structured audience interaction. The session is designed to move from an overview of the post-xCAT management landscape to concrete migration experiences, culminating in a participatory discussion that compares alternative approaches. The workshop will feature four invited talks, each presenting a distinct transition strategy currently in use at academic or industry HPC centers. These presentations will be followed by an interactive, moderated town hall in which speakers and participants collaboratively populate a feature comparison matrix, examining tradeoffs related to scalability, complexity, sustainability, learning curve, and community support.
- OpenCHAMI: An API-driven, microservices-based approach to cluster state management
- Altair/Siemens Navops: Intelligent workload and infrastructure orchestration with cloud-native provisioning for HPC
- Ansible/Cobbler/Satellite: Leveraging enterprise-grade automation for HPC-specific workflows
- Warewulf: Container-centric, stateless provisioning for HPC environments
Speaker Bios
- Taylor Rose Graham - Purdue: Taylor is an Associate Research Solutions Hardware Engineer at Purdue University, specializing in high-performance computing (HPC) server hardware and infrastructure. With a strong interest in growing on the compute side, Taylor is actively building skills in Linux, Bash, and system administration to deepen her impact across the HPC stack. She is also engaged in community-building and outreach efforts, including initiatives that support diversity, education, and professional development in computing. Taylor loves learning new technologies and continually expanding her skill set to better support scalable, reliable research computing systems.
- Betsy Hillery - Purdue: Elizabett (“Betsy”) Hillery is the Director of Data Initiatives at Purdue University’s Lilly Purdue Research Alliance Center (LPRC), where she leads the LIFE@Scale platform and data integration strategies supporting life sciences research at scale within a major university-industry collaboration. She serves as Chair of the ACM SIGHPC Systems Professionals (SYSPROS) Virtual Chapter, providing leadership for a global community of HPC systems professionals. As Principal Investigator for Purdue’s NSF SCIPE award, she leads workforce development initiatives focused on building inclusive and sustainable pathways for Cyberinfrastructure Professionals. Betsy brings over 20 years of experience in information technology, including eight years in high-performance computing, and holds degrees in Computer Science and Industrial-Organizational Psychology, as well as a Master's in Business Administration.
- Travis Cotton - LANL: Travis Cotton is an HPC engineer and technical lead at Los Alamos National Laboratory with over a decade of experience in large-scale system deployment and cluster lifecycle management. He serves as team lead for OpenCHAMI, an open-source HPC system management ecosystem, where he drives both development and production deployments at national-lab scale. His work spans CI/CD pipeline design, containerized infrastructure, and AI platform support. Travis holds an M.S. in Computer Science with a concentration in Bioinformatics from New Mexico State University.
- Gavin Burris - UPenn: Gavin Burris supports research computing at The Wharton School, where he enables faculty research in the social sciences through high-performance computing, data storage, and advanced research infrastructure. A longtime Campus Champion, he is passionate about advancing computational research. Outside of work, he is a connoisseur of bad cinema, and enjoys hiking, trail running, and creating music with synthesizers and drum machines.
- Matt Guidry - Georgia Tech: Matt brings over 20 years of experience in managing large-scale Linux systems at various academic institutions. Previously, he served as the lead Linux system administrator for the University System of Georgia, where he oversaw the configuration of Cisco UCS hardware, VMware clusters, Oracle databases, and Single Sign-On services. At PACE, Matt is the primary administrator for Red Hat Satellite and SaltStack, ensuring consistent imaging and configuration management across all clusters. He also plays a key role in maintaining the SLURM scheduler services, troubleshooting and enhancing job performance.
- Alexei Kotelnikov - Rutgers: Alexei Kotelnikov supports research computing and teaches computational courses at the School of Engineering, Rutgers University. He holds a Ph.D. in Physics, specializing in Computational Fluid Dynamics (CFD), high-performance computing (HPC), and advanced engineering simulations. Alongside his academic role, he consults for the quantitative finance industry, advising institutions on automated cluster management and scalable infrastructure deployment. His primary technical interests are cluster file systems and scalable diskless installations.
Schedule
| Time (EDT) | Title | Presenter |
|---|---|---|
| 9:00-9:15am | Intro | |
| 9:15-9:45am | OpenCHAMI | Travis Cotton, LANL |
| 9:45-10:15am | Altair/Siemens Navops | Gavin Burris, UPenn |
| 10:15-10:30am | Q&A | |
| 10:30-11:00am | Break | |
| 11:00-11:30am | Ansible/Cobbler/Satellite | Matt Guidry, Georgia Tech |
| 11:30-12:00pm | Warewulf | Alexei Kotelnikov, Rutgers |
| 12:00-12:30pm | Speaker Panel | |
Organizing Committee
| Name | Affiliation |
|---|---|
| Taylor Rose Graham | Purdue |
| Betsy Hillery | Purdue |
| John Blaas | Lambda |
| David Clifton | ANSYS |
| John Legato | Independent |
| Jay Blair | ASRC Federal |
Contact Information
If you need to contact us, send email to SIGHPC SYSPROS.
Links
- SC HPC Sysadmin Mailing List - you should join!
- Email us to get an invite to the SIGHPC SYSPROS Slack team
- Upcoming activities including our workshop at SC26