Update 12/1/14: Intel now offers, through the Xen project, full GPU virtualization for 4th generation Intel devices. Operating system virtualization is a convenient way to run multiple operating systems at the same time, on the same hardware, without rebooting. There are several technologies that allow sharing of the GPU by both the host (native) and guest … [Read more...]
May 2014 Current K1 Development Pathways
NVIDIA Tegra K1 Jetson development kits are now available for purchase from Newegg or Microcenter. The NVIDIA Tegra K1 chip has generated much interest due to the CUDA programmability and power efficiency of the ARM/Kepler ceepee-geepee combination. Upcoming Tegra K1 devices include the Xiaomi MiPad, NVIDIA's reference design tablet, and the K1-powered Shield 2 gaming device. … [Read more...]
Calling All Authors for an Intel Xeon Phi Gems Book – Rough Submissions Due May 29, 2014
All are invited to contribute to High Performance Parallelism Gems – Successful Approaches for Multicore and Many-core Programming (working title), a contribution-based book that will focus on practical techniques for Intel Xeon processor and Intel Xeon Phi coprocessor parallel computing. Submissions <http://lotsofcores.com/gems> are due by May 29, 2014 in order to … [Read more...]
Firefox WebCL plugin, WebCL Security, and Compliance Tests
Interest in WebCL is expanding as exemplified by the Nokia WebCL project that has released a Firefox plugin to run WebCL apps. Developers now have a choice of running WebCL in Chrome via AMD and Firefox with the Nokia plugin. (Firefox, Chrome and Safari all have some form of WebCL support.) The continued expansion of WebCL proof-of-concept … [Read more...]
The Missing Link in NVLink, or “Hello Pascal” bye-bye PCI bus limitations!
Say hello to NVLink, a new technology by NVIDIA that is not constrained by PCIe bandwidth and latency limitations, but you will have to wait for the Pascal generation of 2016 GPUs to get it. NVLink is NVIDIA's proprietary "DRAM speed and latency" class interface for CPU to GPU and GPU to GPU point-to-point communications. The basic building block for NVLink is a high-speed, … [Read more...]
OpenCL Haswell Iris 5200 Performance Results – 800 GF/s Peak Performance
The Intel Haswell chip contains an integrated GPU that delivers significantly better OpenCL performance than an NVIDIA GeForce GT 650M, exceeding 800 GF/s peak performance. Allan MacKinnon at PixelIO has been investigating the OpenCL performance of this device and has found a plethora of on-GPU registers, but also that the GPU appears to be both power and thermally … [Read more...]
PGI 14.4 is now released with lots of OpenACC C++ Goodness!
PGI 14.4 is now released with lots of OpenACC C++ goodness. Give it a try! Here is the link for those with existing licenses. If need be, get a 15-day trial license and use some of my OpenACC tutorials. PGI Trial keys: Trial license keys are used for evaluating PGI software. They are valid for fifteen days. If you haven't already done so, you … [Read more...]
The CUDA Thrust API Now Supports Streams and Concurrent Tasks
The CUDA Thrust API now supports streams and concurrent kernels through the use of a new API called Bulk, created by Jared Hoberock at NVIDIA. The design of Bulk is intended to extend the parallel execution policies described in the evolving Technical Specification for Parallel Extensions for C++ N3960. Note that Bulk is not part of the CUDA 6.0 distribution and must be … [Read more...]
NVIDIA HBAO+ and TXAA Enhanced Gaming Video
A fun video showing the progress being made in near-photorealistic gaming imagery. The big news in this video is the use of HBAO+ (for ambient occlusion) and TXAA (anti-aliasing) technologies. I imagine such video platforms can be used for small studio animation projects as well. The Watch Dogs game highlighted in the video will be released May 27, … [Read more...]
OpenCL + Java Acceleration on Mobile Promises 8x speedup with 3x Less Power
In what will certainly become a flood of papers about GPU acceleration of Java applications on mobile devices, a master's thesis by Iype P. Joseph at the University of Ottawa claims 8x performance gains and 3x reductions in power consumption through the use of Java bindings with OpenCL 1.1 on a Freescale i.MX6Q SabreLite board. With NVIDIA entering the programmable mobile GPU … [Read more...]








