TechEnablement

Education, Planning, Analysis, Code

NP-Complete Parallel Thread Placement Addressed in Milliseconds via MIT “Best Paper” Heuristic

February 23, 2015 by Rob Farber

Jointly allocating computations and data is a known NP-hard problem. MIT researchers Nathan Beckmann, Po-An Tsai, and Daniel Sanchez recently won the best-paper award at the IEEE Symposium on High-Performance Computer Architecture for a place-and-route heuristic that runs in milliseconds. For a 64-core chip, it finds a solution that is more than 99 percent as efficient as that produced by standard place-and-route algorithms, which take hours. Their paper, “Scaling Distributed Cache Hierarchies through Computation and Data Co-Scheduling,” reports that on a simulated 64-core chip the approach increased computational speed by 46% and reduced power consumption by 36%.

Overview of the periodic reconfiguration procedure (image courtesy IEEE)

While the paper proposes a hardware solution, it is conceivable that the MIT algorithm could also be adapted to help software developers place threads. Previous reports in the literature describe similar, significant power savings from other thread placement algorithms, savings that could potentially amount to megawatts on Intel Xeon Phi powered supercomputers.
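To make the idea of software thread placement concrete, here is a minimal Python sketch of a greedy placement heuristic on a 2D mesh of tiles. It is not the MIT algorithm, whose details are not given in this article; it only illustrates the general trade-off of putting each thread close to the data it touches most. The names `greedy_place`, `total_cost`, and `thread_data`, the mesh model, and the Manhattan-distance cost are all assumptions made for illustration.

```python
from itertools import product


def manhattan(a, b):
    """Hop count between two (x, y) tiles on a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def thread_cost(tile, accesses):
    """Access-weighted distance from a candidate tile to a thread's data.

    accesses is a list of (data_tile, access_count) pairs (hypothetical model).
    """
    return sum(count * manhattan(tile, data_tile) for data_tile, count in accesses)


def greedy_place(thread_data, mesh_w, mesh_h):
    """Assign each thread to a distinct tile, heaviest threads first.

    This is a simple greedy heuristic, not an optimal solver: each thread
    takes the free tile with the lowest weighted distance to its data.
    """
    free = set(product(range(mesh_w), range(mesh_h)))
    if len(thread_data) > len(free):
        raise ValueError("more threads than tiles")

    # Threads with the most accesses get first pick of tiles.
    order = sorted(thread_data, key=lambda t: -sum(c for _, c in thread_data[t]))

    placement = {}
    for thread in order:
        # sorted(free) makes tie-breaking deterministic.
        best = min(sorted(free), key=lambda tile: thread_cost(tile, thread_data[thread]))
        placement[thread] = best
        free.remove(best)
    return placement


def total_cost(placement, thread_data):
    """Sum of access-weighted distances over all placed threads."""
    return sum(thread_cost(placement[t], thread_data[t]) for t in placement)


if __name__ == "__main__":
    # Hypothetical workload: threads A and B share data homed at tile (0, 0).
    threads = {
        "A": [((0, 0), 10)],
        "B": [((0, 0), 5)],
        "C": [((3, 3), 1)],
    }
    placement = greedy_place(threads, mesh_w=4, mesh_h=4)
    print(placement, total_cost(placement, threads))
```

A greedy pass like this runs quickly but can miss better global arrangements, which is the usual trade-off between a fast heuristic and a slow exhaustive search.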

“There was a big National Academy study and a DARPA-sponsored [information science and technology] study on the importance of communication dominating computation,” says David Wood, a professor of computer science at the University of Wisconsin at Madison. “What you can see in some of these studies is that there is an order of magnitude more energy consumed moving operands around to the computation than in the actual computation itself. In some cases, it’s two orders of magnitude. What that means is that you need to not do that.”

The MIT researchers “have a proposal that appears to work on practical problems and can get some pretty spectacular results,” Wood says. “It’s an important problem, and the results look very promising.”

(Source: MIT)

 

 
