• Home
  • News
  • Tutorials
  • Analysis
  • About
  • Contact

TechEnablement

Education, Planning, Analysis, Code

  • CUDA
    • News
    • Tutorials
    • CUDA Study Guide
  • OpenACC
    • News
    • Tutorials
    • OpenACC Study Guide
  • Xeon Phi
    • News
    • Tutorials
    • Intel Xeon Phi Study Guide
  • OpenCL
    • News
    • Tutorials
    • OpenCL Study Guide
  • Web/Cloud
    • News
    • Tutorials
You are here: Home / Archives for CUDA / News

NVIDIA Tegra K1 Powered Shield Should Soon Be Available

June 10, 2014 by Rob Farber Leave a Comment

A revised "P2750" NVIDIA Shield gaming device has now appeared in an FCC filing. This suggests that suggests we will soon start seeing a number of NVIDIA Tegra K1 powered devices on store shelves.TechEnablement.com reported some early specifications and benchmark results for a K1-powered Shield that should perform well and can run android or be rooted to … [Read more...]

NVIDIA’s Women Who CUDA Campaign – May 30, 2014 Deadline!

May 19, 2014 by Rob Farber Leave a Comment

On May 8, 2014 NVIDIA launched the Women Who CUDA campaign to highlight the work of innovative women in the area of GPU computing. Winning entries in the CUDA Women survey (click here to enter) - that is open until May 30, 2014, will be published on the high-visibility, high-volume NVIDIA website. Tweets during the campaign will provide visibility in the GPU computing community … [Read more...]

May 2014 Current K1 Development Pathways

May 17, 2014 by Rob Farber Leave a Comment

NVIDIA Tegra K1 Jetson development kits are now available for purchase from Newegg or Microcenter. The NVIDIA Tegra K1 chip has generated much interest due to the CUDA programmability and power efficiency of the ARM/Kepler ceepee-geepee combination. Upcoming Tegra K1 devices include the Xiaomi MiPad, NVIDIA's reference design tablet, plus the K1 powered Shield 2 gaming device. … [Read more...]

The Missing Link in NVlink, or “Hello Pascal” bye-bye PCI bus limitations!

May 13, 2014 by Rob Farber Leave a Comment

Say hello to NVlink, a new technology by NVIDIA that is not constrained by PCIe bandwidth and latency limitations, but you will have to wait for the Pascal generation of 2016 GPUs to get it.  NVlink is NVIDIA's properitary "DRAM speed and latency" class  interface for CPU to GPU and GPU to GPU point-to-point communications. The basic building block for NVLink is a high-speed, … [Read more...]

The CUDA Thrust API Now Supports Streams and Concurrent Tasks

May 7, 2014 by Rob Farber Leave a Comment

The CUDA Thrust API now supports streams and concurrent kernels through the use of a new API called Bulk created by Jared Hoberock at NVIDIA. The design of Bulk is intended to extend the parallel execution policies described in the evolving Technical Specification for Parallel Extensions for C++ N3960. Note that bulk is not part of the CUDA 6.0 distribution and must be … [Read more...]

OpenCL + Java Acceleration on Mobile Promises 8x speedup with 3x Less Power

May 6, 2014 by Rob Farber Leave a Comment

In what will certainly become a flood of papers about GPU acceleration of Java applications on mobile devices, a masters theses by Iype P. Joseph at the University of Ottawa claims 8x performance gains and 3x reductions in power consumption through the use of Java binding with OpenCL 1.1 on a a Freescale i.MX6Q SabreLite board. With NVIDIA entering the programmable mobile GPU … [Read more...]

GTC 2014 Presentations Now Available Online to All

May 5, 2014 by Rob Farber Leave a Comment

The NVIDIA GTC presentations are now available for all to view at http://www.gputechconf.com/gtcnew/on-demand-gtc.php. Of-course, I recommend my 30 minute presentation, "S4178: Killer-app Fundamentals: Massively-parallel data structures, Performance to 13 PF/s, Portability, Transparency, and more " [pdf][video]. My talk covers: Deep-learning to 13 PF/s on the ORNL … [Read more...]

ExaFMM: An Exascale-capable, TF/s per GPU or Xeon Phi, Long-Range Force Library for Particle Simulations

May 1, 2014 by Rob Farber Leave a Comment

Rio Yokota has implemented exaFMM, a Fast Multipole Method library to speed applications that must quickly calculate the effects of long-range forces such as gravity or magnetism on discrete particles in a simulation. Based on work he performed as a post-doc with Lorena Barba, the open-source FMM library runs on GPUs, multicore, and Intel Xeon Phi plus most of the … [Read more...]

Opportunities to Run on Jetson, the Latest Tegras, and ORNL Titan

April 30, 2014 by Rob Farber Leave a Comment

Following Jen-Hsun's strategy to enable those who wish to use NVIDIA chips, developers can win a Jetson K1, get free access to the latest Tegra GPUs. Also those with big computations can submit INCITE proposals to run on the ORNL Titan supercomputer. Ends today (4/30/14) to possibly win a Jetson K1 (link) merely by submitting an idea via … [Read more...]

Run CUDA without Recompilation on x86, AMD GPUs, and Intel Xeon Phi with gpuOcelot

April 28, 2014 by Rob Farber Leave a Comment

Various pathways exist to run CUDA on a variety of different architectures. The freely available gpuOcelot project is unique in that it currently allows CUDA binaries to run on NVIDIA GPUs, AMD GPUs, x86 and Intel Xeon Phi at full speed without recompilation. It works by dynamically analyzing  and recompiling the PTX instructions of the CUDA kernels so they can run on the … [Read more...]

« Previous Page
Next Page »

Tell us you were here

Recent Posts

Farewell to a Familiar HPC Friend

May 27, 2020 By Rob Farber Leave a Comment

TechEnablement Blog Sunset or Sunrise?

February 12, 2020 By admin Leave a Comment

The cornerstone is laid – NVIDIA acquires ARM

September 13, 2020 By Rob Farber Leave a Comment

Third-Party Use Cases Illustrate the Success of CPU-based Visualization

April 14, 2018 By admin Leave a Comment

More Tutorials

Learn how to program IBM’s ‘Deep-Learning’ SyNAPSE chip

February 5, 2016 By Rob Farber Leave a Comment

Free Intermediate-Level Deep-Learning Course by Google

January 27, 2016 By Rob Farber Leave a Comment

Intel tutorial shows how to view OpenCL assembly code

January 25, 2016 By Rob Farber Leave a Comment

More Posts from this Category

Top Posts & Pages

  • Code Modernization Speeds Python and Other Machine Learning Packages
  • IWOCL 2014 (International Workshop on OpenCL) presentations are now online
  • 42 PF/s Trinity Supercomputer to Use Intel Knights Landing
  • Rob Farber
  • Sony Demos Project Morpheus VR Headset on the PS4

Archives

© 2026 · techenablement.com