
Portable Performance with OpenCL On Intel Xeon Phi

November 3, 2014 by Rob Farber

This High Performance Parallelism Pearl shows the potential for using the OpenCL™ standard parallel programming language to deliver portable performance on Intel Xeon Phi coprocessors, Xeon processors, and many-core devices such as GPUs from multiple vendors. This portable performance can be delivered from a single program without needing multiple versions of the code, an advantage of OpenCL over most other approaches available today. As proof of OpenCL’s ability to deliver performance portability, the chapter describes results from the BUDE molecular docking code, which sustains over 30% of peak floating-point performance on a wide variety of processors, including laptop CPUs, Xeon, Xeon Phi and GPUs. The authors also briefly discuss the relationship between OpenCL and NVIDIA’s CUDA, as well as pragma-based programming models such as OpenMP 4.0 and OpenACC.
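To make the "percent of peak" figure concrete, here is a minimal sketch of how such a ratio is commonly computed: theoretical peak is taken as cores × clock × floating-point operations per core per cycle, and the sustained rate is divided by it. The function names and the device numbers below are hypothetical illustrations, not values from the chapter or measurements of any real processor.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x FLOPs per core per cycle."""
    return cores * clock_ghz * flops_per_cycle


def fraction_of_peak(sustained_gflops: float, peak: float) -> float:
    """Sustained throughput expressed as a fraction of theoretical peak."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return sustained_gflops / peak


# Hypothetical device (an assumption for illustration only):
# 60 cores at 1.0 GHz, 32 single-precision FLOPs per core per cycle.
peak = peak_gflops(cores=60, clock_ghz=1.0, flops_per_cycle=32)

# Hypothetical sustained rate measured over a complete application run.
sustained = 576.0

print(f"{fraction_of_peak(sustained, peak):.1%} of peak")
```

Because the same ratio can be formed for any device, it gives a way to compare how well one source code uses very different processors, independent of their absolute speed.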
The chapter authors present a case study for BUDE (Bristol University Docking Engine), a molecular dynamics-based code that was ported – in its entirety – to OpenCL with the deliberate aim of delivering performance portability across a wide range of CPUs, GPUs and accelerators. This deliberate policy means that only a single source code needs to be developed and maintained, but it relies on achieving good performance for the OpenCL code on the default CPU target devices, as well as other devices including the Intel Xeon Phi coprocessor and GPUs.

Dr. Richard Sessions and a team from Bristol University have been developing BUDE for many years. BUDE employs a novel atom-atom based empirical free energy force field to accurately predict the relative binding free energies of interactions between two molecules. This ability means BUDE can be used to address three different problems: 1) virtual screening by docking of millions of small molecules against a protein target (Figure 1-7); 2) binding-site detection by scanning the surface of a protein with a ligand (Figure 1-8); 3) protein-protein docking in real space by the systematic scanning of one protein surface against the other.

BUDE’s sustained performance running identical OpenCL source code across a wide range of many-core and multi-core devices. Performance is measured across a complete application run. (Courtesy Morgan Kaufmann)

Chapter Authors

Simon McIntosh-Smith

Simon McIntosh-Smith leads the HPC research group at the University of Bristol in the UK. His background is in microprocessor architecture, with a 15-year career in industry at companies including Inmos, STMicroelectronics, Pixelfusion and ClearSpeed. McIntosh-Smith co-founded ClearSpeed in 2002 where, as Director of Architecture and Applications, he co-developed the first modern many-core HPC accelerators. In 2003 he led the development of the first accelerated BLAS/LAPACK and FFT libraries, leading to the creation of the first modern accelerated Top500 system, TSUBAME-1.0 at Tokyo Tech in 2006. He joined the University of Bristol in 2009, where his research focuses on many-core algorithms and performance portability, and on fault-tolerant software techniques to reach Exascale. He is a joint recipient of an R&D 100 award for his contribution to Sandia’s Mantevo benchmark suite, and in 2014 he was awarded the first Intel Parallel Computing Center in the UK. McIntosh-Smith actively contributes to the Khronos OpenCL heterogeneous many-core programming standard.

Tim Mattson

Tim Mattson is a principal engineer in Intel’s Microprocessor and Programming Research laboratory. He is an old-fashioned application programmer with experience in quantum chemistry, seismic signal processing, and molecular modeling, and has used more parallel programming models than he can keep track of. Tim was part of the teams that created OpenMP and OpenCL. Most recently, he has been working on the memory and execution models for the next major revision of OpenCL (OpenCL 2.0). Tim has published extensively, including the books Patterns for Parallel Programming (with B. Sanders and B. Massingill, Addison Wesley, 2004), An Introduction to Concurrency in Programming Languages (with M. Sottile and C. Rasmussen, CRC Press, 2009), and the OpenCL Programming Guide (with A. Munshi, B. Gaster, J. Fung, and D. Ginsburg, Addison Wesley, 2011).

See the overview article “Teaching The World About Intel Xeon Phi”, which lists TechEnablement links explaining why each chapter is considered a “Parallelism Pearl” and gives information about James Reinders and Jim Jeffers, the editors of High Performance Parallelism Pearls.
