TechEnablement

Education, Planning, Analysis, Code

PGI Compiled OpenACC ILP Loop Beats CUDA-7 by 200 GF/s on Deep-learning PCA Example

March 23, 2015 by Rob Farber

The PGI OpenACC compiler beat the performance of a CUDA 7.0 NVIDIA nvcc-compiled, deep-learning-based PCA (Principal Components Analysis) example by 200 GF/s on a K40c, using an ILP (Instruction Level Parallelism) loop structure taught in the TechEnablement classes and the forthcoming Farber OpenACC book. PCA is an important data analysis tool used by data scientists. Sign up for … [Read more...]

CUDA 7 Released

March 20, 2015 by Rob Farber

NVIDIA released CUDA 7 for all to use! Download here for Windows, Linux x86, Linux Power 8, and Mac OS X. Productivity and performance improvements: C++11 support makes it easier for C++ developers to accelerate their applications. Write less code with ‘auto’ and ‘lambda’, especially when using the Thrust template library. New cuSOLVER library of dense and sparse direct … [Read more...]

NVIDIA Titan X Powers Games and Virtual Reality

March 5, 2015 by Rob Farber

NVIDIA CEO Jen-Hsun Huang announced NVIDIA's latest GPU, the Titan X, in a surprise appearance at the 2015 Game Developers Conference. Jen-Hsun claims it is the most powerful GPU on the planet. The announcement followed a presentation by Epic Games' co-founder Tim Sweeney about the convergence of photorealistic imagery, film, video games, architecture, industrial design, and … [Read more...]

Multiple OpenACC Hackathons Scheduled Around the World

February 23, 2015 by Rob Farber

Oak Ridge National Laboratory has announced three GPU Hackathons for 2015. The first will be hosted April 20-24 by the National Center for Supercomputing Applications on the UIUC Campus. The second will be hosted by the Swiss National Supercomputing Centre in Lugano, Switzerland from July 6-10. The final one will be hosted by the Oak Ridge Leadership Computing Facility on … [Read more...]

TACC Accepting Summer Internship Applications

January 28, 2015 by Rob Farber

TACC is now accepting applications for the 2015 Research Experience for Undergraduates (REU) program, which runs from June 1 to August 1, 2015. This summer, 10 undergraduate students from across the United States majoring in science and engineering will be immersed in training at UT Austin to become the next generation of ‘game changers.’ Participants will explore grand challenges including … [Read more...]

Facebook Open-Sources Torch for Deep-Learning Neural Networks

January 19, 2015 by Rob Farber

Facebook has made Torch, an open-source development environment for numerics, machine learning, and computer vision with a particular emphasis on deep learning and convolutional nets, available to everyone. The latest release includes GPU-optimized modules for large convolutional nets (ConvNets), as well as networks with sparse activations that are commonly used in Natural … [Read more...]

Baidu Small NVIDIA-Powered Cluster for ‘Most Accurate’ Near Human ImageNet Recognition Results

January 16, 2015 by Rob Farber

Baidu Research used a small 36-node NVIDIA-powered cluster to attain the best computer vision ImageNet classification result to date, with a 5.98% error vs. GoogLeNet's 6.66%. These results are very close to the human error rate of 5.1%. Key to the Baidu performance is their mix of model- and data-parallelism as well as the use of higher-resolution images (512x512 vs … [Read more...]

CUDA 7 For Registered Developers – LAPACK Dense Solvers 3-6x faster than MKL

January 13, 2015 by Rob Farber

The CUDA Toolkit 7.0 Release Candidate (RC) is now available to members of NVIDIA’s free registered developer program. Especially interesting is the claim of 3-6x faster LAPACK dense solvers over MKL (the Intel Math Kernel Library). C++11 support makes it easier for C++ developers to accelerate their applications. Write less code with ‘auto’ and ‘lambda’, especially when … [Read more...]

IPMACC – An Open Source OpenACC to CUDA/OpenCL Translator

December 23, 2014 by Rob Farber

IPMACC is a research-grade open-source framework for translating OpenACC source code to CUDA or OpenCL. Binary executables can then be created with OpenCL or CUDA compilers. The authors (Ahmad Lashgar - University of Victoria, Alireza Majidi - Texas A&M University, Amirali Baniasadi - University of Victoria) verified correctness and performance using benchmarks from … [Read more...]

NVIDIA K80 1.8x Faster and “Highest Energy Efficiency to Date” for Financial Applications

December 17, 2014 by Rob Farber

STAC, the financial industry benchmarking organization, released performance testing results on the new NVIDIA Tesla K80 Dual-GPU Accelerator. In the STAC-A2 benchmark, which helps financial institutions and banks better manage risk, the NVIDIA Tesla K80 GPU set new performance records. The test code only used two threads on the host processor plus the K80 CUDA code was … [Read more...]

© 2025 · techenablement.com