Say hello to NVLink, a new technology by NVIDIA that is not constrained by PCIe bandwidth and latency limitations, but you will have to wait for the Pascal generation of 2016 GPUs to get it. NVLink is NVIDIA's proprietary "DRAM speed and latency" class interface for CPU-to-GPU and GPU-to-GPU point-to-point communications. The basic building block for NVLink is a high-speed, … [Read more...]
OpenCL Haswell Iris 5200 Performance Results – 800 GF/s Peak Performance
The Intel Haswell chip contains an integrated GPU that delivers significantly better OpenCL performance than an NVIDIA GeForce GT 650M, exceeding 800 GF/s peak performance. Allan MacKinnon at PixelIO has been investigating the OpenCL performance of this device and has found a plethora of on-GPU registers, but also that the GPU appears to be both power and thermally … [Read more...]
OpenCL + Java Acceleration on Mobile Promises 8x speedup with 3x Less Power
In what will certainly become a flood of papers about GPU acceleration of Java applications on mobile devices, a master's thesis by Iype P. Joseph at the University of Ottawa claims 8x performance gains and 3x reductions in power consumption through the use of Java bindings with OpenCL 1.1 on a Freescale i.MX6Q SabreLite board. With NVIDIA entering the programmable mobile GPU … [Read more...]
GTC 2014 Presentations Now Available Online to All
The NVIDIA GTC presentations are now available for all to view at http://www.gputechconf.com/gtcnew/on-demand-gtc.php. Of course, I recommend my 30-minute presentation, "S4178: Killer-app Fundamentals: Massively-parallel data structures, Performance to 13 PF/s, Portability, Transparency, and more" [pdf][video]. My talk covers: Deep-learning to 13 PF/s on the ORNL … [Read more...]
Proof-of-Concept WebCL Chrome Browser Available from AMD
AMD has been working on implementing WebCL inside a Chrome browser to give web programmers access to OpenCL acceleration plus WebCL and WebGL interoperability. (Firefox, Chrome and Safari all have some form of WebCL support.) The following video shows the potential: http://youtu.be/dGD9NpipcrE Hands-on experience can be found through the Chromium-WebCL GitHub project, … [Read more...]
OpenCL 2.0 Conformance Test Suite
The OpenCL adage "write once, test everywhere" is being addressed by the Khronos organization through the release of the OpenCL 2.0 test suite. The Khronos™ Group today announced the availability of the official conformance test suite for the OpenCL 2.0 specification, making it possible for implementers to certify that their implementations are officially conformant … [Read more...]
WebCL 1.0 specification released
We all know that browser-accelerated 3D graphics are coming and that this technology solution - however instantiated - is going to be a tremendous money maker. WebCL is a technology to watch for browser-accelerated 3D graphics. The release of the WebCL 1.0 specification is the latest evolution in the Khronos effort to bring 3D browser acceleration to the Internet. WebCL 1.0 … [Read more...]
TechEnablement Adds Study Guides for CUDA, OpenACC, OpenCL, and Intel Xeon Phi
Today TechEnablement.com has provided study guides to help students "learn to change the world" with supercomputing for the masses. The study guides cover: CUDA, OpenACC, OpenCL, and Intel Xeon Phi … [Read more...]
Intel Releases OpenCL™ 1.2 Support for Xeon Phi™ Coprocessors
The Intel press room announced that OpenCL support is now available. The new SDK broadens options for developers on Intel® architecture and includes tools, optimization guides and training. The SDK helps OpenCL developers improve performance and efficiency on Intel® Xeon Phi™ coprocessors and Intel® Xeon® processors. For those interested in using OpenCL to program … [Read more...]
Part 1: OpenCL™ – Portable Parallelism
This first article in a series on portable multithreaded programming using OpenCL™ briefly discusses the thought behind the standard and demonstrates how to download and use the ATI Stream software development kit (SDK) to build and run an OpenCL program. View at The Code Project (http://www.codeproject.com/Articles/110685/Part-OpenCL-Portable-Parallelism). The thought … [Read more...]








