openacc Archives - Page 6 of 7

NVIDIA App Showcase, See What Performance is Possible

June 13, 2014 by Rob Farber

Considering utilizing GPUs in your application? The NVIDIA Application Showcase is a great place to examine a broad spectrum of applications that have been GPU accelerated and the speedups that have been achieved. The recently updated list now contains descriptions, links, and performance reports for over 270 GPU accelerated applications. …

Microway Announces OpenACC GPU DevKit

May 29, 2014 by Rob Farber

Microway announced its new WhisperStation GPU Starter DevKit with OpenACC. The starter kit enables programmers to quickly harness the power of GPU computing with OpenACC. The bundle includes an NVIDIA Tesla K20-equipped WhisperStation and a license for PGI Accelerator C/C++/Fortran Compilers with OpenACC. …

Pragma Puzzler – Ambiguous Loop Trip Count in OpenMP and OpenACC

May 23, 2014 by Rob Farber

Pragma-based programming can be described as a "negotiation" with the compiler, in which the compiler has to assume corner cases that are not apparent to the programmer. So why does the loop count in the OpenMP and OpenACC article "A First Transparent OpenACC C++ Class" have to be assigned to a separate variable to generate a parallel …
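The excerpt above is cut off before it gives the answer, so the following is only a sketch of one common reason a compiler may refuse to parallelize such a loop: when the trip count lives in a structure that also holds the data pointer, the compiler may have to assume that a write through the pointer could change the count. The `Buffer` type and `scale` function are hypothetical names used for illustration and are not taken from the article. Without an OpenACC compiler the pragma is ignored and the loop simply runs on the host.

```cpp
#include <cassert>

// Hypothetical type for illustration: the count and the data pointer
// live side by side, so b.data[i] could in principle alias b.n.
struct Buffer {
    int  n;
    int* data;
};

// Multiply every element by `factor`.
void scale(Buffer& b, int factor) {
    assert(b.data != nullptr);

    // Copy the trip count and pointer into locals. The loop bound is now
    // a local const that no store inside the loop can modify, so the
    // trip count is known before the loop starts.
    const int n = b.n;
    int* data = b.data;

    #pragma acc parallel loop copy(data[0:n])
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

Writing the loop bound as `b.n` directly would compute the same result on the host; the local copy only removes the ambiguity that the compiler would otherwise have to assume.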

A First Transparent OpenACC C++ Class

May 22, 2014 by Rob Farber

This article provides a simple yet complete working example that demonstrates how OpenACC 2.0 pragmas can be used in the constructor and destructor of a C++ class to allocate and free memory on both the host and device, and to transparently move the class's data so that its methods can run on both the host and device. Key to the transparent use of C++ classes in …
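As a rough illustration of the idea described above, here is a minimal sketch of a class that pairs host allocation with OpenACC 2.0 unstructured data directives (`enter data` / `exit data`) in its constructor and destructor. This is not the class from the article: `AccVector` and its methods are hypothetical names, and support for `this` in data clauses varies between compilers. Without an OpenACC compiler the pragmas are ignored and everything runs on the host.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration class, not the article's implementation.
class AccVector {
public:
    explicit AccVector(std::size_t n) : n_(n), data_(new double[n]) {
        assert(n > 0);
        for (std::size_t i = 0; i < n_; ++i) data_[i] = 0.0;
        // Place the object on the device, then allocate its array there.
        #pragma acc enter data copyin(this)
        #pragma acc enter data create(data_[0:n_])
    }

    ~AccVector() {
        // Release device memory in reverse order, then host memory.
        #pragma acc exit data delete(data_[0:n_])
        #pragma acc exit data delete(this)
        delete[] data_;
    }

    // Copying would duplicate ownership of both allocations; forbid it.
    AccVector(const AccVector&) = delete;
    AccVector& operator=(const AccVector&) = delete;

    // Runs on the device when compiled with OpenACC, else on the host.
    void fill(double value) {
        const std::size_t n = n_;
        double* d = data_;
        #pragma acc parallel loop present(d[0:n])
        for (std::size_t i = 0; i < n; ++i) d[i] = value;
    }

    // Bring the device copy of the array back to the host.
    void update_host() {
        #pragma acc update host(data_[0:n_])
    }

    // Host-side sum; call update_host() first after device work.
    double sum_host() const {
        double s = 0.0;
        for (std::size_t i = 0; i < n_; ++i) s += data_[i];
        return s;
    }

private:
    std::size_t n_;
    double* data_;
};
```

A caller would construct an `AccVector`, call `fill`, then `update_host` before reading results with `sum_host`; the data movement stays hidden inside the class.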

NVIDIA’s Women Who CUDA Campaign – May 30, 2014 Deadline!

May 19, 2014 by Rob Farber

On May 8, 2014, NVIDIA launched the Women Who CUDA campaign to highlight the work of innovative women in the area of GPU computing. Winning entries in the CUDA Women survey (click here to enter), which is open until May 30, 2014, will be published on the high-visibility, high-volume NVIDIA website. Tweets during the campaign will provide visibility in the GPU computing community …

MultiOS Gaming CUDA & OpenCL Via a Virtual Machine

May 19, 2014 by Rob Farber

Update 12/1/14: Intel now offers, through the Xen project, full GPU virtualization for Intel 4th generation devices. Operating system virtualization is a convenient way to run multiple operating systems at the same time, on the same hardware, without rebooting. There are several technologies that allow sharing of the GPU by both the host (native) and guest …

The Missing Link in NVlink, or “Hello Pascal” bye-bye PCI bus limitations!

May 13, 2014 by Rob Farber

Say hello to NVLink, a new technology from NVIDIA that is not constrained by PCIe bandwidth and latency limitations, but you will have to wait for the 2016 Pascal generation of GPUs to get it. NVLink is NVIDIA's proprietary "DRAM speed and latency" class interface for CPU to GPU and GPU to GPU point-to-point communications. The basic building block for NVLink is a high-speed, …

PGI 14.4 is now released with lots of OpenACC C++ Goodness!

May 9, 2014 by Rob Farber

PGI 14.4 is now released with lots of OpenACC C++ goodness. Give it a try! Here is the link for those with existing licenses. If need be, get a 15-day trial license and use some of my OpenACC tutorials. PGI trial keys: Trial license keys are used for evaluating PGI software. They are valid for fifteen days. If you haven't already done so, you …

GTC 2014 Presentations Now Available Online to All

May 5, 2014 by Rob Farber

The NVIDIA GTC presentations are now available for all to view at http://www.gputechconf.com/gtcnew/on-demand-gtc.php. Of course, I recommend my 30-minute presentation, "S4178: Killer-app Fundamentals: Massively-parallel data structures, Performance to 13 PF/s, Portability, Transparency, and more" [pdf][video]. My talk covers: Deep-learning to 13 PF/s on the ORNL …

PGI 14.4 Release Contains Much OpenACC C++ Goodness

April 25, 2014 by Rob Farber

PGI released its OpenACC 2.0 roadmap for the 14.4 and upcoming 14.7 releases. The expectation is that we will see the 14.4 release in early May and the 14.7 release in early July. Note: these are not official PGI dates. Analysis: The 14.4 support of atomic operations will enable many low-wait algorithms such as counters and massively parallel stacks. Improved reduction performance in …
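To make the counter use case mentioned above concrete, here is a minimal sketch of a shared counter protected by the OpenACC 2.0 `atomic update` directive. The function name `count_positive` is hypothetical and chosen only for illustration; nothing here is taken from the PGI roadmap. Without an OpenACC compiler the pragmas are ignored and the loop runs serially on the host.

```cpp
#include <cassert>

// Hypothetical example: count the strictly positive entries of x[0..n).
int count_positive(const float* x, int n) {
    assert(n >= 0);
    int count = 0;

    // Every iteration may increment the same counter, so the increment
    // is marked atomic to avoid lost updates between parallel iterations.
    #pragma acc parallel loop copyin(x[0:n]) copy(count)
    for (int i = 0; i < n; ++i) {
        if (x[i] > 0.0f) {
            #pragma acc atomic update
            ++count;
        }
    }
    return count;
}
```

For a plain sum like this a `reduction(+:count)` clause is usually the better choice; the atomic form is shown because it generalizes to cases a reduction cannot express, such as claiming a slot index in a shared stack.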
