The GTC 2015 keynote by NVIDIA CEO Jen-Hsun Huang showed an intense focus on deep learning through four topic areas: (1) the official Titan X GPU announcement, noting that the 7 TF/s single-precision and 0.2 TF/s double-precision device has an arithmetic performance mix well suited to deep learning; (2) the NVIDIA DIGITS (Deep Learning GPU Training System) software and a custom NVIDIA-built development box containing four Titan X GPUs to run DIGITS; (3) a re-messaged Tesla roadmap showing Pascal as delivering 10x the performance of Maxwell for deep learning; and (4) the announcement of Drive PX, a deep-learning development card for the automotive industry based on dual Tegra X1 SoCs that is claimed to be 3,000 times faster than a CPU-based solution for self-driving cars.
Energy efficiency is a key concern for deep-learning algorithms, as training sessions can require days to weeks of continuous runtime. The Titan X GPU has a TDP (Thermal Design Power) rating of 250 watts. Pascal's projected 10x performance increase over Maxwell would make NVIDIA GPUs competitive with Microsoft's claimed energy efficiencies for OpenCL-programmed Convolutional Neural Networks (CNNs). Part of that energy efficiency is achieved through the use of FP16 (16-bit floating-point) mixed-precision arithmetic; presumably FP16 support builds on some of the on-GPU texture unit hardware. Pascal will also provide 750 GB/s of memory bandwidth, which will be a boon for memory-bandwidth-limited applications.
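To illustrate the mixed-precision idea, the NumPy sketch below stores data in FP16 (halving memory footprint and bandwidth relative to FP32) while accumulating a dot product in FP32 to limit rounding error. This is an illustrative sketch of the general technique, not NVIDIA's implementation; all variable names are the author's own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights and activations stored in FP16: 2 bytes per value instead of 4,
# so half the memory traffic of an FP32 representation.
weights_fp16 = rng.standard_normal(4096).astype(np.float16)
inputs_fp16 = rng.standard_normal(4096).astype(np.float16)

# Mixed precision: FP16 storage, FP32 accumulation. Upcasting before the
# reduction keeps rounding error in the long sum under control.
acc_fp32 = np.dot(weights_fp16.astype(np.float32),
                  inputs_fp16.astype(np.float32))

# For contrast, accumulating entirely in FP16 rounds at every step.
acc_fp16 = np.float16(0)
for w, x in zip(weights_fp16, inputs_fp16):
    acc_fp16 = np.float16(acc_fp16 + np.float16(w * x))

print("FP32-accumulated result:", acc_fp32)
print("FP16-accumulated drift: ", abs(float(acc_fp16) - float(acc_fp32)))
```

The design choice mirrors what mixed-precision training does in general: keep the bulk data in the narrow format to save bandwidth, but widen for reductions where error accumulates.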
The approximately 20 watts (based on two Tegra X1 SoCs) drawn by the Drive PX card represents remarkable energy efficiency as well. The industry will likely see Tegra X1 training solutions in the near future, as machine-learning algorithms exhibit strong scaling behavior.
During a follow-on Q&A session, NVIDIA mentioned that it is working with OpenPOWER and IBM to bring NVIDIA deep learning to the enterprise. However, the company was not free to comment beyond noting that IBM recently purchased a deep-learning company, AlchemyAPI. Jen-Hsun Huang did not mention OpenPOWER during his keynote.
The Titan X GPU contains 12 gigabytes of memory and will retail for $999. During the keynote, Jen-Hsun noted that the original RIVA GPU contained 4 million transistors while the Titan X contains 8 billion transistors.
Among the interesting statistics provided during the keynote: NVIDIA currently provides 54 PF/s of floating-point performance in HPC systems, and as of 2015 NVIDIA has sold 450,000 Tesla GPUs.
Jen-Hsun also noted that “VR is making a huge difference in the future of gaming.” Gaming is a massive $100B market and growing.
To emphasize the performance of the Titan X gaming GPU, Jen-Hsun showed a video of its real-time rendering capabilities using the Epic demo covered in the TechEnablement article, “NVIDIA Titan X Powers Games and Virtual Reality,” which looked incredible on the big screen. Those who want greater double-precision performance might consider the Titan Z GPU.
As an aside, Jen-Hsun commented on the extensive use of NVIDIA GPUs in the animation industry and said, “I’ll make any technology to make Star Wars [movies] faster!” Like many of us, Jen-Hsun is a fan.