TechEnablement

Education, Planning, Analysis, Code

ARM64 with CUDA Early Access Boards Now Available

June 22, 2014 by Rob Farber

The X-Gene™ ARM64 and CUDA Development Platform for High Performance Computing (HPC) is now available to order from Cirrascale, the US integration partner of Applied Micro (APM). The board also gives ARM64 + CUDA an entry point into the enterprise markets.

ARMv8

The X-Gene™ board features a custom, high-performance ARMv8 processor based on an advanced 64-bit ARM architecture, plus network and storage offload engines and integrated Ethernet.

ARM64 + CUDA

The XC-1™ development kit includes APM’s X-Gene™ ARM64 CPU and a CUDA-capable GPU.

CUDA Specs:

  • Includes Dynamic Parallelism and Hyper-Q for boosted computing performance, higher power efficiency, and record application speeds
  • Number and Type of GPU – 1 Kepler GK110
  • Peak double precision floating point performance – 1.17 Tflops
  • Peak single precision floating point performance – 3.52 Tflops
  • Memory bandwidth (ECC off) – 208 GB/sec
  • Memory size (GDDR5) – 5 GB
  • PCIe – this appears to be a PCIe Gen3 x8 solution: see the SoftIron motherboard announcement and the X-Gene driver documentation, which notes “X-Gene PCIe controller supports maximum up to 8 lanes and GEN3 speed”.
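
To put the x8 link in context, here is a minimal back-of-the-envelope sketch in Python. It assumes the link really is PCIe Gen3 x8 (an inference from the driver documentation, not a confirmed spec) and uses the standard Gen3 signalling figures of 8 GT/s per lane with 128b/130b encoding. The result is a theoretical ceiling that ignores packet and protocol overhead, so real transfers will be slower. The function and constant names are illustrative only.

```python
# Figures quoted in the CUDA spec list above.
PEAK_DP_TFLOPS = 1.17
PEAK_SP_TFLOPS = 3.52
MEM_BANDWIDTH_GB_S = 208.0  # GDDR5, ECC off

# PCIe Gen3 signalling constants.
GEN3_GT_PER_S = 8.0        # giga-transfers per second, per lane
GEN3_ENCODING = 128 / 130  # 128b/130b line-code efficiency


def pcie_gen3_bandwidth_gb_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, ignoring protocol overhead."""
    return lanes * GEN3_GT_PER_S * GEN3_ENCODING / 8  # 8 bits per byte


x8 = pcie_gen3_bandwidth_gb_s(8)    # assumed link width on the X-Gene board
x16 = pcie_gen3_bandwidth_gb_s(16)  # full-width slot, for comparison

print(f"PCIe Gen3 x8 : {x8:.2f} GB/s per direction")
print(f"PCIe Gen3 x16: {x16:.2f} GB/s per direction")
print(f"GPU memory vs. x8 link: {MEM_BANDWIDTH_GB_S / x8:.1f}x")
print(f"DP/SP peak ratio: {PEAK_DP_TFLOPS / PEAK_SP_TFLOPS:.2f}")
```

Under these assumptions the x8 link tops out a little under 8 GB/s per direction, while the GPU's on-board memory is more than 25 times faster. Applications that keep data resident on the GPU should therefore be less sensitive to the narrower link than those that stream data across PCIe.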

 

Availability Next Month

The first GPU-accelerated ARM64 development platforms will be available in July from Cirrascale Corp. and E4 Computer Engineering, with production systems expected to ship later this year. The Eurotech Group also plans to ship production systems later this year. System details include:

  • Cirrascale RM1905D – High-density two-in-one 1U server with two Tesla K20 GPU accelerators; provides high performance and low total cost of ownership for private cloud, public cloud, HPC, and enterprise applications.
  • E4 EK003 – Production-ready, low-power 3U, dual-motherboard server appliance with two Tesla K20 GPU accelerators, designed for seismic, signal and image processing, video analytics, track analysis, web applications and MapReduce processing.
  • Eurotech – Ultra-high density, energy efficient and modular Aurora HPC server configuration, based on proprietary Brick Technology and featuring direct hot liquid cooling.

NVIDIA is demonstrating new ARM development systems at the International Supercomputing Conference, June 23-26, in booth 230.

According to ZDNet, both Applied Micro and Canonical claim the first production software deployment on a 64-bit ARM server, and plan to demo an OpenStack cloud running Ubuntu Linux on an X-Gene server.

Other options

  • Those interested in 32-bit ARM processing with CUDA should check out the Jetson board.
  • Intel has announced a single-chip Xeon + FPGA combination for high-performance low-power servers.
  • Use OpenCL to compare FPGA and GPU power/performance.
  • Cavium provides a 48-core ARM64 solution (note the Gigabyte server announcement). Cavium is sharing mainly high-level details of the Thunder SoCs.
  • AMD’s ARM-based Opteron chips, codenamed “Seattle”, started sampling in March. 
  • SoftIron ARM64 motherboard based on X-Gene.

 

Filed Under: CUDA, Featured news, News, OpenACC, OpenCL, Web/Cloud
Tagged With: ARM, CUDA, GPU, HPC, NVIDIA, Tegra
