Facebook has written a Fast Fourier Transform implementation (fbfft) that is 1.5x faster than NVIDIA's cuFFT at sizes 8-64. The paper "Fast Convolutional Nets with fbfft: A GPU Performance Evaluation" discusses the performance gains from changing to a non-zero padded FFT layout (potentially eliminating data copies), the use of autotuning, and clipping to conditionally load … [Read more...]
Automatically Caption Images With Neural Networks and Vector Space Math
Imagine a magic algorithm that can create captions that accurately describe an image. The Google authors of "Show and Tell: A Neural Image Caption Generator" claim to have created a machine-learning algorithm that approaches human accuracy. If true, the value is clear: conventional text-based search methods could then return relevant images as well as text. machine-translation … [Read more...]
The Unabridged Chapter 1 Introduction To High Performance Parallelism Pearls
The following is the full, unabridged text of the Chapter 1 introduction (written by James Reinders) to High Performance Parallelism Pearls. Thanks to Morgan Kaufmann, James Reinders, and Jim Jeffers for giving permission so TechEnablement can make this available. After reading what James wrote, you will see that summarizing the introduction would simply have left out too much … [Read more...]
AI Researchers Talk Up Benefits of GPUs for Deep Learning
With the ability to deliver TF/s to PF/s of performance even on nonlinear problems, deep-learning researchers who participated in the ImageNet competition are espousing the charms of GPU computing technology. At the European Conference on Computer Vision (ECCV), held last week in Zurich, teams from Adobe, U.C. Berkeley, the National University of Singapore, Oxford University … [Read more...]
Deep-Learning Challenge – Google Chose Nevada Self-Driving Car Test Route and Conditions
With the advent of deep-learning, open and impartial validation of complex learning and adaptive systems is becoming ever more important. For example, drones and self-driving cars operate in true life-and-death situations where biases in a validation test can result in collisions with people, property, and other vehicles. In my Scientific Computing article, "Validation: … [Read more...]
Programming Deep-learning Neural Networks to Solve Tasks
Deep-learning neural networks can be programmed, or structured by a human, to perform one or more complex tasks. The key requirements are the ability to (1) design the network topology and (2) lock weights in the ANN (Artificial Neural Network) during training. A powerful example of structured deep-learning comes from the 1993 paper by Farber et al., "Identification of … [Read more...]
GPUs Power Over 90% of ImageNet Deep-Learning Visual Recognition Challenge Entries
Over 90 percent of the participating teams and three of the four winners in the prestigious 2014 ImageNet Large Scale Visual Recognition Challenge used GPUs to enable their deep learning work. Deep learning is a fast-growing segment of machine learning that involves the creation of sophisticated, multi-level or “deep” neural networks. These networks enable powerful … [Read more...]
Dongarra Gives Deep-Learning a Python Interface With RaPyDLI
A project called "Rapid Python Deep Learning Infrastructure", or RaPyDLI, has received nearly $1 million in NSF grants. The project, led by supercomputing luminaries Jack Dongarra (University of Tennessee) and Geoffrey Fox (Indiana University) along with Andrew Ng (Stanford, Baidu and Coursera), will allow users to program deep learning models in Python and port them to … [Read more...]
Deep-Learning Behind Microsoft Cross-Language Real-Time Skype Translator
Deep-learning lies at the heart of the Microsoft Skype translator, a near real-time speech-to-speech machine translation tool that enables voice conversations between individuals speaking different languages. The claim is that a beta version of Skype Translator will be released sometime in 2014. According to Microsoft, machine-learning of large datasets culled from social … [Read more...]
Robobrain.me
The Cornell Robo Brain (http://robobrain.me) is a large-scale learning system that attempts to learn from publicly available Internet resources, computer simulations, and real-life robot trials. The idea is to associate objects in images with text in order to correlate how they relate to human language, behavior and usage. Applications include prototyping for robotics … [Read more...]