
Rob Farber

Rob Farber (CEO/Publisher)

A recognized global technology leader (US, Europe, Middle East, Asia-Pacific) who is highly rated, highly visible, and in demand as a technology/scientific advisor, teacher, and collaborator. Rob demonstrated a strong creative ability in helping to establish the field of machine learning while a scientist in the Theoretical Division at Los Alamos National Laboratory, and has since worked as an author, senior scientist, principal investigator, corporate leader, technical architect, and media consultant. He has contracts with Fortune 100 companies (Intel, NVIDIA, AMD, and others) and research organizations (DARPA, ONR, Cold Spring Harbor, and others), plus numerous small companies. He co-founded a computational drug discovery and computer manufacturing company that achieved liquidity events.

Rob also has an extensive research background (the Theoretical Division at LANL, external faculty at the Santa Fe Institute, NERSC in Berkeley, and PNNL in Washington state). He is widely read and referenced in both the media and the scientific literature, with a substantial record encompassing hundreds of publications across a range of fields, including peer-reviewed scientific research, machine learning, commercial development, and the design and implementation of advanced computer software and hardware.

Notable projects include 13 PF/s average sustained “deep-learning” performance on the ORNL Titan GPU-based supercomputer; the design and analysis of the NERSC global unified file system architecture; planning metrics for supercomputer procurements; massively parallel online social media analysis; a commercial drug-design supercomputer and software system; a linearly scalable enterprise distributed file system with quality-of-service guarantees for streaming media (video and audio); a real-time 10^15 byte/s pattern-recognition particle tracker for the SSC; a six-billion-agent epidemic modeling system; and a highly efficient neural network compiler and nonlinear optimization system. He has also been an active participant in leading-edge technology, from producing an early virtual-memory microcomputer to working with today’s latest-generation massively parallel processors and storage technology.

Books

The English version of my book, “CUDA Application Design and Development,” can be purchased from booksellers or online from vendors around the world. Here is an Amazon link. A Chinese version of my book can also be purchased (Amazon link).


My newest book is also available for purchase in English and Chinese versions.


Publications

Invited Chapters

Chapter 25, “Power Analysis for Applications and Data Centers,” in “High Performance Parallelism Pearls, Volume Two,” Morgan Kaufmann; 1st edition (August 6, 2015)

Editor, “High Performance Parallel Programming,” Morgan Kaufmann; 1st edition (November 17, 2014)

Chapter 7, “Deep-learning and Numerical Optimization,” in “High Performance Parallel Programming,” Morgan Kaufmann; 1st edition (November 17, 2014)

Scientific Editor, “GPU Computing Gems Emerald Edition (Applications of GPU Computing Series)”, Morgan Kaufmann, (2011), ISBN-13: 978-0123849885.

“Bioinformatics: High Performance Parallel Computer Architectures“, CRC Press

“Handbook of Research on Computational Science and Engineering: Theory and Practice“, IGI Global

“Multi-Threaded Architectures: Evolution, Costs, Opportunities“, IGI Global

“Topical perspective on massive threading and parallelism“, The Journal of Molecular Graphics and Modelling

Peer-Reviewed Publications

“Sociolect-Based Community Detection”, William N. Reynolds, William J. Salter, Robert M. Farber, Courtney Corley, Chase P. Dowling, William O. Beeman, Lynn Smith-Lovin and Joon Nak Choi, Proceedings of the IEEE International Conference on Intelligence and Security Informatics, 2013.

“CUDA Application Design and Development: Methods and Best Practices” (Chinese Edition), 2013, ISBN-13: 978-7111404460, R. Farber.

“Thought Leaders During Crises in Massive Social Networks”, Courtney D. Corley, Robert M. Farber, William N. Reynolds, in Statistical Analysis and Data Mining edited by Joseph Verducci, Wiley Periodicals, Inc., 2012.

“CUDA Application Design and Development”, Morgan Kaufmann, (2011), ISBN-13: 978-0123884268, R. Farber.

“Multi-Threaded Architectures: Evolution, Costs, Opportunities” in “Computational Science and Engineering: Theory and Practice”, IGI Press 2011, I. Girotto and R. Farber, ISBN13: 9781613501160.

Scientific Editor, “GPU Computing Gems Emerald Edition (Applications of GPU Computing Series)”, Morgan Kaufmann, (2011), ISBN-13: 978-0123849885.

“Topical perspective on massive threading and parallelism”, J Mol Graph Model. 2011 Sep;30:82-9. doi: 10.1016/j.jmgm.2011.06.007. Epub 2011 Jun 29, Farber RM.

“Massive Social Network Analysis: Mining Twitter for Social Good”, Ediger et al., Proceedings of the 39th International Conference on Parallel Processing, 2010.

“Social Media and Social Reality: Theory, Evidence and Validation”, Farber et al., Proceedings of the IEEE International Conference on Intelligence and Security Informatics, 2010.

“Experimental Comparison of Emulated Lock-free vs. Fine-grain Locked Data Structures on the Cray XMT”, Rob Farber and David W. Mizell, MTAAP’10, April, 2010

“An Introduction to GPGPUs and Massively-threaded Programming”, Farber, R., in “Emerging Parallel Architectures and Programming Models,” edited by Bertil Schmidt, Taylor and Francis, LLC, July 2010, ISBN-10: 1439814880.

“Massively Parallel Near-Linear Scalability Algorithms with Application to Unstructured Video Analysis”, Robert Farber and Harold Trease, Proceedings of TeraGrid 2008, June 2008.

“Unstructured Data Analysis of Streaming Video Using Parallel, High-Throughput Algorithms”, Trease HE, T Carlson, R Mooney, R Farber, and LL Trease, Proceedings of the Ninth IASTED International Conference on Signal and Image Processing, (2007) pp. 305-310, Acta Press, Anaheim, CA.

“Balancing Computation and Experiment”, Farber, R., Innovation: America’s Journal of Technology Commercialization, vol. 5 no. 24, (April/May 2007), p. 24+.

“Exploring Protein Sequence Space Using Knowledge Based Potentials”, Aderonke Babajide, Robert Farber, Ivo L. Hofacker, Jeff Inman, Alan S. Lapedes, and Peter F. Stadler, Journal of Theoretical Biology 212(1), (Sept. 7, 2001), pp. 35-46.

“The Geometry of Shape Space: Application to Influenza”, Lapedes, A; Farber, R., Journal of Theoretical Biology (Sep 7 2001), pp. 57-69.

“Covariation of Mutations in H3N2 (HA1) Influenza Sequences”, Holly Tao, Robert Farber, Alan Lapedes, LANL internal paper, (1999).

“Following Influenza’s Jet Stream” 10/15/98 Chicago Tribune article referencing influenza work. Follow-up articles appeared in The New Mexican and The Albuquerque Journal.

“A Mutual Information Analysis of tRNA sequence and modification patterns distinctive of species and phylogenic domain”, Francisco M. De La Vega, Carlos Cerpa, Gabriel Guarneros and Robert M. Farber, “Biocomputing: Proceedings of the 1996 Pacific Symposium,” edited by Lawrence Hunter and Teri Klein, World Scientific Publishing Co, Singapore, 1996.

“Use of Adaptive Networks to Define Highly Predictable Protein Secondary-Structure Classes”, Alan S. Lapedes, Evan Steeg, Robert Farber, Machine Learning, 21, 103-124, 1995, Kluwer Academic Publishers, Boston.

“Neural Net Representations of Empirical Protein Potentials”, Tal Grossman, Alan Lapedes, Robert Farber, Evan Steeg, Santa Fe Institute Working Papers (96-05-029), 1995.

“Neural Network Definitions of Highly Predictable Protein Secondary Structure Classes”, Alan Lapedes, Robert Farber, Evan Steeg, LA‑UR 94‑110, 1995.

“Learning Affinity Landscapes: Prediction of Novel Peptides”, Alan Lapedes and Robert Farber, Los Alamos National Laboratory Technical Report LA-UR-94-4391 (1994).

“Global Bifurcations in Rayleigh‑Benard Convection: Experiments, Empirical Maps and Numerical Bifurcation Analysis“, I. G. Kevrekidis, R. Rico‑Martinez, R. E. Ecke, R. M. Farber and A. S. Lapedes, LA‑UR 92‑4200, Physica D, 1993.

“Identification of Continuous‑Time Dynamical Systems: Neural Network Based Algorithms and Parallel Implementation“, R. M. Farber, A. S. Lapedes, R. Rico‑Martinez and I. G. Kevrekidis, Proceedings of the 6th SIAM Conference on Parallel Processing for Scientific Computing, Norfolk, Virginia, March 1993.

“Covariation of Mutations in the V3 Loop of HIV‑1: An Information Theoretic Analysis”, Bette T.M. Korber, Robert M. Farber, David H. Wolpert, and Alan S. Lapedes, P.N.A.S. 1993.

“Efficiently Modeling Neural Networks on Massively Parallel Computers”, Los Alamos National Laboratory Technical Report LA‑UR‑92‑3568.

“A Parallel Non‑Neural Trigger Tracker for HEP Colliders”, R.M. Farber, W. Kinnison, and A.S. Lapedes, Proceedings of the Erice Conference on Pattern Recognition for High Energy Physics, Erice, Sicily (1992).

“Determination of Eukaryotic Protein Coding Regions Using Neural Networks and Information Theory”, R.M. Farber, Alan Lapedes, and Karl Sirotkin, J. Mol. Biology (1992) 226, 471‑479.

“A Parallel Non‑Neural Trigger Tracker for the SSC”, R.M. Farber, W. Kinnison, and A.S. Lapedes, IJCNN‑91‑Seattle Conference, (1991). (also: LANL technical report LA‑UR‑91‑607 and a 1995 Santa Fe Institute working paper 95-02-012).

“Efficiently Modeling Neural Networks on Massively Parallel Computers”, NASA workshop on parallel computing, (November 1991).

“Use of Neural Nets for DNA Sequence Analysis: Results on Combined Donor/Acceptor Splice Site Recognition”, K. Sirotkin, R. Farber, and A. Lapedes, Symposium on Artificial Intelligence and Molecular Biology (Stanford University) (1990).

“Neural Networks as Statistics”, A. Lapedes, R.M. Farber, Proceedings of the 10th International Biophysical Congress (Vancouver, Canada) (1990).

“Nonlinear Signal Processing and System Identification: Applications To Time Series From Electrochemical Reactions”, R.A. Adomaitis, R.M. Farber, J.L. Hudson, I.G. Kevrekidis, M. Kube, A.S. Lapedes, Chemical Engineering Science, ISCRE‑11, (1990).

“Application of Neural Nets to System Identification and Bifurcation Analysis of Real World Experimental Data“, R.A. Adomaitis, R.M. Farber, J.L. Hudson, I.G. Kevrekidis, M. Kube, A.S. Lapedes, International Conference on Neural Networks Proceedings, Lyons France (1990).

“Applications of Neural Net and Other Machine Learning Algorithms to DNA Sequence Analysis”, A.S. Lapedes, C. Barnes, C. Burks, R.M. Farber, K. Sirotkin, Computers and DNA, SFI Studies in the Sciences of Complexity, vol. VII, Eds. G. Bell and T. Marr, Addison‑Wesley, (1989).

“How Neural Nets Work,” A.S. Lapedes, R.M. Farber, in Neural Information Processing Systems (Proceedings of the IEEE 1987 Denver Conference on Neural Networks), D.Z. Anderson, editor, (1988).

“How Neural Nets Work“, A.S. Lapedes, R.M. Farber, reprinted in Evolution, Learning, Cognition, and Advanced Architectures, World Scientific Publishing Co., (1987).

“Nonlinear Signal Processing Using Neural Networks, Prediction and System Modeling”, A.S. Lapedes, R.M. Farber, LANL Technical Report, LA‑UR‑87‑2662, (1987).

“A Theory of Stochastic Neural Activity,” J.D. Cowan, R.M. Farber, A.S. Lapedes, D. H. Sharp, Rev. Mod. Phys.

“Programming a Massively Parallel System: Static Behavior,” A.S. Lapedes, R.M. Farber, Proceedings of the First Snowbird Conference on Neural Nets and Computation, September 1986, A.I.P. Press.

“Collective Arithmetic”, A.S. Lapedes, R.M. Farber, Los Alamos Technical Report.

“A Self-Optimizing Neural Net for Content Addressable Memory and Pattern Recognition,” A.S. Lapedes, R.M. Farber, Physica D, 247 (1986).

“32-bit ‘Megamicro’ exploits hardware virtual memory and ‘RAM disk’”, Stan Metcalf and Robert M. Farber, Mini-Micro Systems, October (1983).

Recent presentations and interviews:

  • Supercomputing 2014, “From ‘Hello World’ to Exascale Using x86, GPUs and Intel Xeon Phi Coprocessors” (tut106s1)
  • GTC 2014 “S4178 – Killer-App Fundamentals: Massively-Parallel Data Structures, Performance to 13 PF/s, Portability, Transparency, and More”

  • Keynote: Accelerate Insights on the Future of Big Data

  • WEP 2014: “Portable Supercomputing for the Masses, from ‘Hello World’ to Exascale”

  • Rob Farber on the GTC 2014 Keynote: “NVLink, Machine Learning, Pascal, Jetson”

  • GTC 2013 “Simplifying Portable Killer Apps with OpenACC and CUDA-5 Concisely and Efficiently” (video, pdf)

  • GTC 2013 “Clicking GPUs into a Portable, Persistent and Scalable Massive Data Framework” (video, pdf)

  • Rob Farber on OpenACC

  • Rob Farber on the Far-Reaching Implications of GTC 2013

  • “Farber Book on CUDA Serves up Easy Teraflops”

An Intel Xeon Phi tutorial series on Dr. Dobb’s Journal

  • Programming Intel Xeon Phi: A Jumpstart Introduction

  • Intel Xeon Phi for CUDA Programmers

  • Getting to 1 Teraflop on the Intel Phi Coprocessor

  • Numerical and Computational Optimization on the Intel Phi

An OpenACC tutorial series, “Pragmatic Parallelism,” on Dr. Dobb’s Journal (a minimal directive sketch follows the list)

  • Easy GPU Parallelism with OpenACC

  • The OpenACC Execution Model

  • Creating and Using Libraries with OpenACC
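
The series teaches a directive-based style: annotate an ordinary loop and let the compiler generate the GPU offload. As a flavor of what the articles cover, here is a minimal, hypothetical vector-add sketch (written for this page, not taken from the series; it assumes an OpenACC-capable compiler such as nvc++ or pgc++ invoked with -acc):

    // Minimal OpenACC sketch: offload one loop with a single pragma.
    // Hypothetical example; compile with, e.g., nvc++ -acc vecadd.cpp
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 1 << 20;
        std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n);
        float *pa = a.data(), *pb = b.data(), *pc = c.data();

        // The pragma requests parallel offload with explicit data movement;
        // without -acc the loop simply runs serially on the host, unchanged.
        #pragma acc parallel loop copyin(pa[0:n], pb[0:n]) copyout(pc[0:n])
        for (int i = 0; i < n; ++i)
            pc[i] = pa[i] + pb[i];

        printf("c[0] = %f\n", pc[0]); // expect 3.0
        return 0;
    }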

A popular 28-part Dr. Dobb’s Journal GPU tutorial series.

  • CUDA, Supercomputing for the Masses: Part 28

    A Massively Parallel Stack for Data Allocation

  • CUDA, Supercomputing for the Masses: Part 27

    A Robust Histogram for Massive Parallelism

  • CUDA, Supercomputing for the Masses: Part 26

    CUDA: Unifying Host/Device Interactions with a Single C++ Macro

  • CUDA, Supercomputing for the Masses: Part 25

    Atomic Operations and Low-Wait Algorithms in CUDA

  • CUDA, Supercomputing for the Masses: Part 24

    Intel’s 50+ core MIC architecture: HPC on a Card or Massive Co-Processor?

  • CUDA, Supercomputing for the Masses: Part 23

    Hot-Rodding Windows and Linux App Performance with CUDA-based Plugins

  • CUDA, Supercomputing for the Masses: Part 22

    Running CUDA Code Natively on x86 Processors

  • CUDA, Supercomputing for the Masses: Part 21

    The Fermi architecture and CUDA

  • CUDA, Supercomputing for the Masses: Part 20

    Parallel Nsight Part 2: Using the Parallel Nsight Analysis capabilities

  • CUDA, Supercomputing for the Masses: Part 19

    Parallel Nsight Part 1: Configuring and Debugging Applications

  • CUDA, Supercomputing for the Masses: Part 18

    Using Vertex Buffer Objects with CUDA and OpenGL

  • CUDA, Supercomputing for the Masses: Part 17

    CUDA 3.0 provides expanded capabilities and makes development easier (2)

  • CUDA, Supercomputing for the Masses: Part 16

    CUDA 3.0 provides expanded capabilities (1)

  • CUDA, Supercomputing for the Masses: Part 15

    Using Pixel Buffer Objects with CUDA and OpenGL

  • CUDA, Supercomputing for the Masses: Part 14

    Debugging CUDA and using CUDA-GDB

  • CUDA, Supercomputing for the Masses: Part 13

    Using texture memory in CUDA

  • CUDA, Supercomputing for the Masses: Part 12

    CUDA 2.2 Changes the Data Movement Paradigm

  • CUDA, Supercomputing for the Masses: Part 11

    Revisiting CUDA memory spaces

  • CUDA, Supercomputing for the Masses: Part 10

    CUDPP, a powerful data-parallel CUDA library  

  • CUDA, Supercomputing for the Masses: Part 9

    Extending High-level Languages with CUDA

  • CUDA, Supercomputing for the Masses: Part 8

    Using libraries with CUDA

  • CUDA, Supercomputing for the Masses: Part 7

    Double the fun with next-generation CUDA hardware

  • CUDA, Supercomputing for the Masses: Part 6

    Global memory and the CUDA profiler

  • CUDA, Supercomputing for the Masses: Part 5

    Understanding and using shared memory (2)

  • CUDA, Supercomputing for the Masses: Part 4

    Understanding and using shared memory (1)

  • CUDA, Supercomputing for the Masses: Part 3

    Error handling and global memory performance limitations

  • CUDA, Supercomputing for the Masses: Part 2

    A first kernel

  • CUDA, Supercomputing for the Masses: Part 1

    CUDA lets you work with familiar programming concepts while developing software that can run on a GPU
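
As a concrete illustration of that claim, here is a minimal “first kernel” sketch in the spirit of Part 2 (a hypothetical example written for this page, not code from the series; it uses the standard CUDA runtime API and compiles with nvcc):

    // Minimal CUDA sketch: a SAXPY kernel launched across many GPU threads.
    // Hypothetical example; compile with: nvcc saxpy.cu
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x; // one index per thread
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float)); // unified memory avoids
        cudaMallocManaged(&y, n * sizeof(float)); // explicit host/device copies
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y); // 256-thread blocks
        cudaDeviceSynchronize(); // wait for the GPU before reading results

        printf("y[0] = %f\n", y[0]); // expect 2*1 + 2 = 4.0
        cudaFree(x); cudaFree(y);
        return 0;
    }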

An OpenCL tutorial series on The Code Project

  • Part 9: OpenCL Extensions and Device Fission

  • Part 8: Heterogeneous Workflows Using OpenCL

  • Part 7: OpenCL Plugins

  • Part 6: Primitive Restart and OpenGL Interoperability

  • Part 5: OpenCL Buffers and Memory Affinity

  • Part 4: Coordinating Computations with OpenCL Queues

  • Part 3: Work-Groups and Synchronization

  • Part 2: OpenCL Memory Spaces

  • Part 1: OpenCL Portable Parallelism

Regular contributor to Scientific Computing; print and online articles (in chronological order):

  • Create Mobile to HPC Applications using a Single Source Tree

    Basic tools are now available to test efficacy — it’s worth taking a look

  • Preserving Sanity in the Face of Rampant Technology Change

    Portability is the only rational path to follow

  • Mobile Tech between a Rock and a Hard Place

    All signs indicate a healthy continuing demand for technology that can support ever more demanding eye-candy and apps on very high resolution display devices

  • Big Data Requires Scalable Storage Bandwidth

    A perfect storm of opportunities defines what is possible using GPU computing

  • Scalable Storage Solutions for Applied Big Data

    The challenge lies in fitting all the pieces together so they work reliably, at the scale required, while maximizing the potential for future expansion

  • A Terabyte on Your Keychain
    SSDs generate new products, double-digit growth and corporate acquisitions

  • Power Optimization in HPC, Enterprise and Mobile Computing
    Amazing juxtaposition of interests spurs marvelous increase in power efficiency, performance

  • Positioning x86 Petascale Performance with MIC Architecture
    Intel’s reputation is now on the line to demonstrate that Many Integrated Core architecture can compete against GPUs

  • The GPU Performance Revolution
    Intel’s entry into the massively parallel chip market adds fuel to survival-of-the-fittest product evolution

  • Developing a Technology Roadmap for Data-intensive Computing
    The role of group psychology in the transition to massively parallel computing

  • Maximizing MultiGPU Machines
    Multiple GPU and hybrid CPU+GPU performance is heavily dependent upon vendor implementation of the PCIe bus

  • Competing with C++ and Java
    Everyone’s a winner in the race for a common application language that can support both x86 and massively parallel hardware

  • Caching in on Solid-state Storage

    Intelligent use remains the best way to exploit speed and maintain the highest possible ROI

  • Primitive Restart Makes GPGPU Tech Sparkle
    Exploiting the full computational power of the GPGPU to render high-performance, high-quality graphics

  • Seeking Wisdom in the Clouds
    Cloud computing can offer a convenient, “faster, cheaper, greener” option than hardware ownership

  • Multicores and Manycores and GPGPUs, Oh My!
    A fresh look at alternative processor strategies

  • Is GPU Domination Inevitable?
    Predictions for global HPC

  • Redefining what is possible

    The great strength of scalable systems is their ability to make the aggregate bandwidth available so parallel applications can achieve very high performance

  • Racing to Perform World Class Research

    While HPC technology improvements have been spectacular, conventional power-generating capabilities have lagged behind

  • Realizing the Benefits of Affordable Tflop-capable Hardware
    Exciting things are happening with this technology in the hands of the masses

  • Scalable Software for Successful Research

  • It’s Not Easy Being Green
    Conventional programming models must adapt to meet the needs of both low-power and highly-scalable hardware

  • Cloud Computing: Pie in the sky?
    Infrastructure offers potentially big changes

  • Opening Minds: The Greatest Architectural Challenge
    Several computer architectural trends provide significant performance benefits

  • Numerical Precision: How Much is Enough?
    As we approach ever-larger and more complex problems, scientists will need to consider this question

  • Validation: Assessing the Legitimacy of Computational Results
    Evaluating the truth and justification of scientific beliefs is an essential part of computation-based science

  • Probing OER’s Huge Potential
    The world needs good teachers — maybe you can help 

  • People Make Petaflop Computing Possible
    The heart of high performance computing technology still resides in the human component

  • HPC – What’s in a Name?
    Making the right Supercomputing Investment

  • HPC’s Future
    What will things be like in 20 years?

  • Back to the Future
    The return of massively parallel systems

  • Storage in Transition
    The one-two technology punch of solid-state memory and RAM can greatly increase usability

  • GPGPUs: Neat Idea or Disruptive Technology?
    General purpose graphics processing units can perform amazingly well when used effectively

  • The Future Looks Bright for Teraflop Computing
    Amazing power in the lab is feasible right now — and for a bargain price — but programming is required

  • Avoid that Bus!
    Multi-core processors drive adoption of new processor interconnect standards

  • Keeping “Performance” in HPC
    A look at the impact of virtualization and many-core processors

  • The Cure for HPC Neurosis: Multiple, Virtual Personalities!
    Virtualization will almost certainly play an important role as we scale out to ever larger clusters

  • The HPC Brick Wall
    Power and cooling in a Moore’s Law world

  • Will Your Next Supercomputer Come from Costco?
    A leading-edge architecture for just $600

  • The Victorian-era Child of the 21st Century
    As data management challenges continue to grow, organizations are working to develop new solutions

  • HPC Balance and Common Sense
    Maintain ratios that work and improve on those that don’t

