The NVIDIA GTC presentations are now available for all to view at http://www.gputechconf.com/gtcnew/on-demand-gtc.php. Of course, I recommend my 30-minute presentation, "S4178: Killer-app Fundamentals: Massively-parallel data structures, Performance to 13 PF/s, Portability, Transparency, and more" [pdf][video]. My talk covers: Deep-learning to 13 PF/s on the ORNL … [Read more...]
Understanding the Rationale behind 400 GB Flash-based DIMM Memory
On January 24th, SanDisk announced shipments of ULLtraDIMM SSD storage in concert with an IBM announcement rebranding the SanDisk ULLtraDIMMs as eXFlash DIMMs. On March 21st, SanDisk's stock hit a 14-year high. ULLtraDIMM SSD storage puts Flash memory in a standard DIMM form factor that can be plugged into a memory socket. The Linux, Windows, or VMware UEFI/BIOS … [Read more...]
NERSC to Procure “Cori”, a Knights Landing-Based Cray XC Supercomputer
Scheduled for delivery in mid-2016, NERSC's next-generation supercomputer, a Cray XC, will be named after Gerty Cori, the first American woman to be honored with a Nobel Prize in science. The Cori supercomputer will use Intel’s next-generation Intel® Xeon Phi™ processor (code-named “Knights Landing”), a self-hosted, manycore processor with on-package high bandwidth memory … [Read more...]
ExaFMM: An Exascale-capable, TF/s per GPU or Xeon Phi, Long-Range Force Library for Particle Simulations
Rio Yokota has implemented exaFMM, a Fast Multipole Method library to speed applications that must quickly calculate the effects of long-range forces such as gravity or magnetism on discrete particles in a simulation. Based on work he performed as a post-doc with Lorena Barba, the open-source FMM library runs on GPUs, multicore, and Intel Xeon Phi plus most of the … [Read more...]
Opportunities to Run on Jetson, the Latest Tegras, and ORNL Titan
Following Jen-Hsun's strategy to enable those who wish to use NVIDIA chips, developers can win a Jetson K1 or get free access to the latest Tegra GPUs. Those with big computations can also submit INCITE proposals to run on the ORNL Titan supercomputer. Today (4/30/14) is the last day to possibly win a Jetson K1 (link) merely by submitting an idea via … [Read more...]
GaussianFace: Computers Claimed to Beat Humans in Recognizing Faces
In a human vs. computer test on 13k photos of 6k public figures, the GaussianFace project claims to identify human faces better than humans (97% human accuracy vs. 98% computer accuracy). The authors claim their model can adapt automatically to complex data distributions, and therefore can well capture complex face variations inherent in multiple sources. The reporters at The … [Read more...]
Run CUDA without Recompilation on x86, AMD GPUs, and Intel Xeon Phi with gpuOcelot
Various pathways exist to run CUDA on a variety of architectures. The freely available gpuOcelot project is unique in that it currently allows CUDA binaries to run on NVIDIA GPUs, AMD GPUs, x86, and Intel Xeon Phi at full speed without recompilation. It works by dynamically analyzing and recompiling the PTX instructions of the CUDA kernels so they can run on the … [Read more...]
Enablement to Save Lives
Take the time to learn to save a life and be an asset to your family and community in the case of disaster! Take a class in first aid or become a CERT (Community Emergency Response Team) member. Find the training in your country or community to turn yourself into an asset that can respond without adding to the problem in a disaster situation. The TechEnablement site is … [Read more...]
K1-powered NVIDIA Shield 2 Benchmarks Appear
The good folks at Tom's Hardware are lending credibility to the Antutu benchmarks of a K1-powered NVIDIA Shield 2 (link). It is not surprising that the NVIDIA Shield would be one of the first platforms to contain the newest NVIDIA Tegra chip. The claimed specs for the Shield 2 appear reasonable: a screen resolution of 1440 x 810, 4 GB of RAM, 16 GB of internal … [Read more...]
PGI 14.4 Release Contains Much OpenACC C++ Goodness
PGI released its OpenACC 2.0 roadmap for the 14.4 and upcoming 14.7 releases. The expectation is that we will see the 14.4 release in early May and the 14.7 release in early July. Note: these are not official PGI dates. Analysis: The 14.4 support of atomic operations will enable many low-wait algorithms such as counters and massively parallel stacks. Improved reduction performance in … [Read more...]
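To illustrate the kind of counter that atomic operations make possible, here is a minimal sketch in C using the OpenACC 2.0 `atomic update` directive. The function name `count_above` and its parameters are hypothetical and chosen only for illustration; this is not taken from PGI's release notes. A compiler without OpenACC support ignores the directives and runs the loop sequentially, which yields the same count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: count the entries of data[] greater than threshold.
 * Many parallel iterations may increment the same counter, so each
 * increment is protected with an OpenACC atomic update. */
int count_above(const float *data, int n, float threshold)
{
    assert(n >= 0);
    assert(data != NULL || n == 0);

    int counter = 0;

    /* copy(counter) shares one counter across the parallel iterations
     * instead of giving each iteration a private copy. */
    #pragma acc parallel loop copyin(data[0:n]) copy(counter)
    for (int i = 0; i < n; ++i) {
        if (data[i] > threshold) {
            /* Without the atomic, concurrent increments could be lost. */
            #pragma acc atomic update
            counter++;
        }
    }
    return counter;
}
```

For a plain sum like this, a `reduction(+:counter)` clause would usually be the simpler choice; the atomic form is shown because the same pattern extends to structures such as a parallel stack, where each thread needs to claim a unique slot.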









