The Khronos™ Group today announced the ratification and public release of the OpenCL™ 2.1 provisional specification, viewable at www.khronos.org/opencl/, so that developers and implementers can provide feedback before finalization. Comments can be made at the OpenCL forums via https://www.khronos.org/opencl/opencl_feedback_forum. The OpenCL 2.1 C++ kernel language is a static … [Read more...]
OpenCL SPIR Tutorial Teaches Portability Without Shipping Kernel Source
Intel has released an OpenCL tutorial showing how developers can use SPIR (Standard Portable Intermediate Representation) to preserve vendor and device portability without having to ship OpenCL kernel source code. For more information about how SPIR enables commercial OpenCL applications, see our article, "Commercial OpenCL! SPIR 2.0 Protects IP Yet Allows Powerful, Portable, … [Read more...]
OpenCL-Programmed FPGAs Claim a 3X Performance-to-Power Advantage at Microsoft
The Microsoft white paper, "Accelerating Deep Convolutional Neural Networks Using Specialized Hardware," describes an OpenCL-programmed implementation of Convolutional Neural Networks (CNNs) that claims, as a conservative estimate, a 3x performance-to-power advantage over NVIDIA GPUs when running on new FPGA hardware. Doug Burger posted on the Inside Microsoft Research … [Read more...]
NP-Complete Parallel Thread Placement Addressed in Milliseconds via MIT “Best Paper” Heuristic
Jointly allocating computations and data is a known NP-hard problem. MIT researchers Nathan Beckmann, Po-An Tsai, and Daniel Sanchez recently won the best-paper award at the IEEE Symposium on High-Performance Computer Architecture for a place-and-route heuristic that runs in milliseconds and finds a solution that is more than 99 percent as … [Read more...]
Multiple OpenACC Hackathons Scheduled Around the World
Oak Ridge National Laboratory has announced three GPU Hackathons for 2015. The first will be hosted April 20-24 by the National Center for Supercomputing Applications on the UIUC campus. The second will be hosted July 6-10 by the Swiss National Supercomputing Centre in Lugano, Switzerland. The final one will be hosted by the Oak Ridge Leadership Computing Facility on … [Read more...]
Guide to Get Ubuntu 14.10 Running Natively on NVIDIA Shield Tablet
XDA forum developer Bogdacutu reports having Ubuntu 14.10 running on an NVIDIA Shield Tablet and has provided a guide so others can get Ubuntu running on their own Shield Tablets. This is wonderful news and brings hope that Ubuntu will soon be running on the rumored Tegra X1 refresh of the Shield Tablet. What works: GPU acceleration (OpenGL … [Read more...]
Horst Simon Explains the HPC Slowdown (and Human Brain Scale Simulation)
HPC luminary Horst Simon, the Berkeley Lab Deputy Director, gave a marvelous talk at an HPC meetup event in San Francisco on Feb 10 covering power efficiency and the movement towards exascale computing. Horst presented data and the conclusion that the June 2008 - June 2013 five-year span marks a turning point where the growth attributed to Moore’s law and parallelism are … [Read more...]
Tutorial on the OpenCL 2.0 Generic Address Space
Adam Lake and Robert Ioffe posted a nice tutorial on the Intel website about the new OpenCL 2.0 generic address space. The generic address space makes writing OpenCL programs easier by removing the requirement to decorate every pointer with the address space it points to. Instead, OpenCL programmers just use pointers as they would in standard C. Utilizing this new … [Read more...]
Power Profiling Shows Simple Changes To Save Megawatts of Power On Leadership Supercomputers
A challenge with profiling applications lies in how to interpret the profile results. In particular, most programmers do not give the power profile plots more than a cursory glance. Following is an example waterfall plot showing the power utilization for an NWChem run on Intel Xeon Phi coprocessors: My recent column in Scientific Computing, "Using Profile Information for … [Read more...]
Preparing For Knights Landing – Stay in HBM Memory
NERSC published an informative preparatory article on programming the forthcoming Cori supercomputer. It notes that each Intel Xeon Phi “Knights Landing” (KNL) device will run in a “self-hosted” mode, meaning that there will be no host/traditional processor. Everything - including the operating system - will run on KNL. This eliminates concerns about data movement as … [Read more...]









