Many people wish to run Ubuntu or Ubuntu Touch on the recently released NVIDIA Shield Tablet. The first step is to unlock the bootloader, which can be done by following the instructions by Abdur Rehman in his post, "How To Unlock/Relock Bootloader on NVIDIA Shield Tablet [Guide]". Note that unlocking the bootloader voids the warranty, even if you relock it! Ubuntu … [Read more...]
Intel Broadwell Compute Gen8 GPU Architecture
Attention OpenCL programmers! Intel has released a detailed description of the Gen8 Broadwell GPU compute architecture, "The Compute Architecture of Intel® Processor Graphics Gen8". Broadwell is a 14nm die shrink of Intel’s microarchitecture that incorporates a significant reworking of the Intel HD 5000-series (Iris) Gen 7.5 GPU found in Haswell, including: (1) throughput for 32-bit … [Read more...]
Houston Workshop: Directives and Tools for Accelerators: A Seismic Programming Shift
Space is available for 70-80 participants, so those who wish to attend the FREE University of Houston workshop, "Directives and Tools for Accelerators: A Seismic Programming Shift", must register by October 10th, 2014. The workshop is a full-day event on October 20th, 2014, with a preceding welcome reception on October 19th, 2014. This workshop is organized by the HPC Tools group in the … [Read more...]
GPUs Power Over 90% of ImageNet Deep-Learning Visual Recognition Challenge Entries
Over 90 percent of the participating teams and three of the four winners in the prestigious 2014 ImageNet Large Scale Visual Recognition Challenge used GPUs to enable their deep learning work. Deep learning is a fast-growing segment of machine learning that involves the creation of sophisticated, multi-level or “deep” neural networks. These networks enable powerful … [Read more...]
New PyFR Paper “Heterogeneous Computing on Mixed Unstructured Grids with PyFR”
Peter Vincent's original PyFR post on TechEnablement has been extremely popular. Readers should be happy to hear that the PyFR team has published a new paper, "Heterogeneous Computing on Mixed Unstructured Grids with PyFR", showing that this Python framework can perform high-order accurate unsteady simulations of flow on mixed unstructured grids using heterogeneous multi-node … [Read more...]
Dongarra Gives Deep-Learning a Python Interface With RaPyDLI
A project called "Rapid Python Deep Learning Infrastructure", or RaPyDLI, has received nearly $1 million in NSF grants. The project, led by supercomputing luminaries Jack Dongarra (University of Tennessee) and Geoffrey Fox (Indiana University) along with Andrew Ng (Stanford, Baidu and Coursera), will allow users to program deep learning models in Python and port them to … [Read more...]
OpenACC Compilers Deliver 85% The Performance Of Hand-Optimized Code
Directive-based compilers offer both portability and the ability to optimize code for specific platforms such as GPUs and CPUs. A recent LCPC14 paper, "Directive-Based Compilers for GPUs", by Swapnil Ghike, Ruben Gran, Maria J. Garzaran, and David Padua at the University of Illinois at Urbana-Champaign, found that OpenACC code generated by the PGI and Cray OpenACC compilers achieved … [Read more...]
CUDA 340.29 Driver Significantly Boosts GPU Performance (100s GF/s For Machine-Learning)
Reports are now coming in about performance boosts resulting from the CUDA 6.5 production release. The Blender project reports faster rendering times with CUDA 6.5. As can be seen in the graphs below, which report performance on the farbopt deep-learning teaching code, CUDA 6.5 with the NVIDIA 340.29 driver has increased performance on linear problems (PCA analysis from … [Read more...]
A Try-Before-You-Code Linear Regression Method Claims 32% Error Predicting GPU Perf
The paper, "Estimating GPU Speedups for Programs Without Writing a Single Line of GPU Code", by Newsha Ardalani, Karthikeyan Sankaralingam, and Xiaojin Zhu at the University of Wisconsin-Madison, claims that a linear regression model can deliver a robust "automated tool that programmers can use to estimate potential GPU speedup before writing any GPU code". According to their study, a … [Read more...]
PyFR: A GPU-Accelerated Next-Generation Computational Fluid Dynamics Python Framework
PyFR is an open-source, 5,000-line, Python-based framework for solving fluid-flow problems that can exploit many-core computing hardware such as GPUs! Computational simulation of fluid flow, often referred to as Computational Fluid Dynamics (CFD), plays a critical role in the aerodynamic design of numerous complex systems, including aircraft, F1 racing cars, and wind turbines. … [Read more...]








