
TechEnablement

Education, Planning, Analysis, Code


A Try-Before-You-Code Linear Regression Method Claims 32% Error Predicting GPU Perf

August 23, 2014 by Rob Farber Leave a Comment

The paper, “Estimating GPU Speedups for Programs Without Writing a Single Line of GPU Code” by Newsha Ardalani, Karthikeyan Sankaralingam, and Xiaojin Zhu at the University of Wisconsin–Madison, claims a linear regression model can deliver a robust “automated tool that programmers can use to estimate potential GPU speedup before writing any GPU code”. According to the study, the model predicted GPU speedups with an average weighted error of 32% on a cross-validation set of test data selected randomly from the Rodinia, Parboil, Lonestar, and Parsec benchmark suites (speedup range 5.9× to 276×). The model does not account for the overhead of data transfers across the PCIe bus, and it is unreliable when the application uses specialized graphics-related hardware such as interpolation in texture memory. The authors believe this work can be extended to predicting power utilization and to investigating performance improvements on new architectures.
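To make the general idea concrete, here is a minimal sketch (not the authors' tool) of predicting speedup from program features with ordinary least squares and scoring it by leave-one-out cross-validation. The feature count, weights, and data below are entirely synthetic and illustrative; the paper's actual feature set, training corpus, and weighted-error metric differ.

```python
import numpy as np

# Synthetic stand-ins for CPU-measurable program features (e.g. arithmetic
# intensity or branch-divergence proxies) -- illustrative values only.
rng = np.random.default_rng(0)
n_programs, n_features = 40, 2
X = rng.uniform(0.1, 1.0, size=(n_programs, n_features))

# Synthetic "measured" speedups: a linear function of the features plus
# noise, standing in for real CPU-vs-GPU timings from a benchmark suite.
true_w = np.array([50.0, 120.0])
y = X @ true_w + 5.0 + rng.normal(0.0, 4.0, size=n_programs)

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

# Leave-one-out cross-validation, reporting mean relative error --
# a simplified stand-in for the paper's average weighted error.
errors = []
for i in range(n_programs):
    mask = np.arange(n_programs) != i
    pred = fit_predict(X[mask], y[mask], X[i:i + 1])
    errors.append(abs(pred[0] - y[i]) / y[i])

print(f"mean relative error over {n_programs} programs: {np.mean(errors):.1%}")
```

Because the synthetic speedups really are linear in the features, the held-out error here is small; on real programs the fit is far noisier, which is why the paper reports a 32% average error.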

The challenges with the approach, as the authors note, are subtle and include:

  • Preparing a reasonably representative training set requires writing both CPU and GPU versions of the code used to train the model.
  • The important features are not known a priori, and the speedup function depends on many variables.
  • From a practical perspective it is not feasible to build training sets with 20× as many programs as features, the ratio generally considered acceptable for a good linear regression model.

The take-away (and YMMV) is that, for presumably well-written code on massively parallel devices like GPUs, speedup over a CPU appears to track measurable program properties closely enough — roughly linearly, or via a low-order polynomial — for regression-based prediction to be useful.

Filed Under: News Tagged With: GPU, HPC
