The chapter authors build on the standard Intel MPSS documentation, which provides the information required for workstation installs but not the techniques needed for successful deployment in a cluster environment. Drawing on many years of experience managing HPC clusters, including specific experience with the Intel Xeon Phi coprocessor family dating back to the Knights Ferry software development kits, these cluster deployment experts present current best practices for managing Intel Xeon Phi coprocessors in a cluster. This chapter is a valuable resource for anyone who wishes to manage cluster deployments that utilize Intel Xeon Phi coprocessors.
Paul Peltz Jr. is an HPC Systems Administrator for the National Institute for Computational Sciences (NICS) at the University of Tennessee. He graduated from the University of Tennessee in 2003 and worked at the Innovative Computing Laboratory as Dr. Jack Dongarra’s System Administrator for ten years before joining NICS in 2013. He is the lead Administrator on the Beacon Cluster and is the Operations mentor for the NICS Student Cluster Challenge team at SC.
Troy Baer holds bachelor’s and master’s degrees in aerospace engineering from the Ohio State University and is now a senior HPC system administrator at the National Institute for Computational Sciences, where he leads the HPC systems team within the NICS Systems and Operations group. He has been involved in several system deployments at NICS, including Beacon, Nautilus, and Kraken. Before coming to NICS in 2008, Mr. Baer spent 10 years as a supercomputing support engineer at the Ohio Supercomputer Center. In April 2014, Mr. Baer received the Adaptive Computing Lifetime Achievement award for contributions in scheduling and resource management using Moab that have helped Kraken—NICS’ flagship computing resource and the first academic computer to break the petaflop barrier—achieve outstanding 90–95% utilization rates since 2010.
Ryan Braby is the Chief Cyberinfrastructure Officer for the Joint Institute for Computational Sciences at the University of Tennessee. Ryan has worked in HPC administration and integration for over 15 years. In that time he has been directly involved in the administration and/or deployment of two systems that ranked #1 on the Top 500 list, one system that ranked #1 on the Green 500 list, and 18 systems that ranked in the top 50 on the Top 500 list. Ryan has experience administering and integrating multiple platforms, including IBM SP systems, Linux clusters, BlueGene systems, and Cray XT systems, and he has worked with all facets of HPC systems, including parallel file systems, high-performance interconnects, resource managers, and batch schedulers.
Dr. Vincent Charles Betro is a Computational Scientist at the National Institute for Computational Sciences at Oak Ridge National Laboratory, where he focuses his research on porting and optimizing applications for several accelerator architectures, especially the Intel Xeon Phi, and developing Computational Fluid Dynamics codes. He also works with XSEDE users to optimize their codes for Kraken, a petascale academic machine. He received his Ph.D. in Computational Engineering from the University of Tennessee SimCenter at Chattanooga in 2010, where he became research faculty and the STEM Outreach coordinator. His CFD research was in Cartesian Hierarchical Adaptive Anisotropic Mesh Generation, and he continues to work with the Meshing, Visualization, and Computational Environments Technical Committee for the American Institute of Aeronautics and Astronautics as a board member. Additionally, due to his background as a middle and high school mathematics teacher and college mathematics instructor, Vince enjoys working with students to broaden their understanding of and interest in STEM careers.
Glenn Brook currently directs the Application Acceleration Center of Excellence (AACE) and serves as the Chief Technology Officer at the Joint Institute for Computational Sciences between the University of Tennessee (UT) and Oak Ridge National Laboratory. He is the principal investigator for the Beacon Project, which is funded by NSF and UT to explore the impact of emerging computing technologies such as the Intel Xeon Phi coprocessor on computational science and engineering. He received his Ph.D. in Computational Engineering from UT in 2008.
Karl W. Schulz received his Ph.D. in Aerospace Engineering from the University of Texas in 1999. After completing a one-year post-doc, he transitioned to industry, working for the CD-adapco group to develop and support commercial engineering software in the field of computational fluid dynamics (CFD). After several years in industry, Karl returned to the University of Texas in 2003, joining the research staff at the Texas Advanced Computing Center (TACC). During his time at TACC, Karl was actively engaged in HPC research, scientific curriculum development and teaching, technology evaluation and integration, and strategic initiatives, serving on the Center’s leadership team as an Associate Director and leading TACC’s HPC and Scientific Applications groups during his tenure. He was a co-principal investigator on multiple Top-25 system deployments, serving as an application scientist and principal architect for the cluster management software and HPC environment. Karl joined the Technical Computing Group at Intel in January 2014 and leads a Cluster-Maker team working on future-generation HPC software products.
See the overview article “Teaching The World About Intel Xeon Phi,” which contains a list of TechEnablement links explaining why each chapter is considered a “Parallelism Pearl,” plus information about James Reinders and Jim Jeffers, the editors of High Performance Parallelism Pearls.