PGI Compiled OpenACC ILP Loop Beats CUDA-7 by 200 GF/s on Deep-learning PCA Example
March 23, 2015 by Rob Farber