HOPE is an open-source, specialized, method-at-a-time JIT compiler written in Python. It translates Python source code into C++ and compiles the generated code at runtime to achieve a 2x – 120x performance speedup over interpreted Python code. HOPE is published under the GPLv3 license and can be downloaded from its GitHub repository. It was written by Joel Akeret, Lukas Gamper, Adam Amara, and Alexandre Refregier (ETH Zurich). See the paper HOPE: A Python Just-In-Time compiler for astrophysical computations for more information.
HOPE is able to translate commonly used unary, binary, and comparison operators as well as augmented assignment statements. Currently the supported native built-in data types are bool, int, and float, along with their corresponding NumPy data types (e.g. int32, int64, float32, float64, etc.). Both scalar values and NumPy arrays with these types are supported. Several features of NumPy and mathematical functions such as sin, cos, and exp are also supported. HOPE, for instance, allows operations on NumPy arrays with the common slicing syntax (e.g. a[5:, 5:] = x). A set of examples is provided on GitHub. The package can be used with Python 2.7 as well as with Python 3.3+. HOPE has been tested on Linux and Mac OS X with the gcc and clang compilers.
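The method-at-a-time approach means a function is specialized per argument-type signature at call time. The sketch below illustrates only the dispatch structure of that idea in pure Python; it is not HOPE's implementation, which instead translates the function's AST to C++ and compiles it (HOPE's documented entry point is the `hope.jit` decorator):

```python
import functools

def jit(fn):
    """Minimal sketch of method-at-a-time specialization: cache one
    specialization per argument-type signature seen at call time.
    HOPE would generate and compile C++ at this point; this sketch
    merely records the original Python function per signature."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args):
        sig = tuple(type(a) for a in args)  # type signature of this call
        if sig not in cache:
            # Real HOPE: translate fn to C++, compile, load the module.
            cache[sig] = fn
        return cache[sig](*args)

    wrapper._cache = cache  # exposed for inspection in this sketch
    return wrapper

@jit
def add(a, b):
    return a + b
```

Calling `add(1, 2)` and then `add(1.0, 2.0)` produces two cached specializations, one per signature, which is the granularity at which HOPE emits compiled code.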
Phy-node.com also noted other projects that speed up Python:
- NumPyPy – an attempt to reimplement all of NumPy in Python and then let PyPy do its meta-tracing magic.
- Numba – one of several projects Travis Oliphant has been cooking up since he started Continuum Analytics. For the most part, Numba’s main purpose is to unbox numeric values and make looping fast in Python. They’re adding support for general-purpose Python constructs, but relying on the traditional Python runtime to implement anything non-numeric, which serializes their runtime due to the Global Interpreter Lock.
- Blaze – another Travis Oliphant creation, though this one is even more ambitious than Numba. Whereas NumPy is a good abstraction for dense in-memory arrays with varying layouts, Blaze is intended to work with more complex data types and “is designed to handle out-of-core computations on large datasets that exceed the system memory capacity, as well as on distributed and streaming data”. The underlying abstractions are to a large degree inspired by the Haskell library Repa 3. One key difference between Blaze and NumPy (aside from the much richer array type) is that Blaze delays array computations and then compiles them on demand.
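The delay-then-compile strategy can be sketched in a few lines: operations build an expression graph instead of computing immediately, and evaluation happens in one pass when explicitly requested. This is a toy illustration of the idea, not Blaze's actual API (whose real array types and compute machinery are far richer):

```python
class Deferred:
    """Toy deferred expression: arithmetic builds a graph; compute()
    evaluates the whole expression at once. Blaze's real machinery
    would compile the delayed graph on demand instead."""
    def __init__(self, op, *args):
        self.op, self.args = op, args

    def __add__(self, other):
        return Deferred('add', self, other)

    def __mul__(self, other):
        return Deferred('mul', self, other)

    def compute(self):
        if self.op == 'leaf':
            return self.args[0]
        left, right = (a.compute() for a in self.args)
        if self.op == 'add':
            return [x + y for x, y in zip(left, right)]
        return [x * y for x, y in zip(left, right)]

def array(data):
    """Wrap concrete data as a leaf of the deferred graph."""
    return Deferred('leaf', list(data))

# Builds a graph only; nothing is evaluated until compute() is called.
expr = array([1, 2, 3]) + array([4, 5, 6]) * array([2, 2, 2])
```

Because the whole expression is visible before evaluation, a backend can fuse the add and multiply into one loop rather than materializing intermediates, which is the payoff of delaying computation.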
- Copperhead – Copperhead takes the direct route to parallelism by forcing you to write your code using data-parallel operators that have clear compilation schemes onto multicore and GPU targets. To further simplify the compiler’s job, Copperhead forces your code to be purely functional, which goes against the grain of idiomatic Python. In exchange for these semantic handcuffs, you get some pretty speedy parallel programs. It is unclear if Copperhead is still being developed.
- Theano – Theano takes code that looks like Python but executes it under different assumptions/semantics. Succinctly, the programmer has to explicitly piece together symbolic expressions representing the computation. In other words, the programmer is writing Theano syntax in Python. As a result, Theano can group and reorganize matrix multiplications, reorder floating-point operations for stability, and compute gradients using automatic differentiation. The Theano backend has some preliminary support for CUDA and should eventually add multi-core and SIMD code generation.
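The symbolic-graph idea can be seen in miniature: the user composes expression objects, and gradients fall out of the recorded graph by reverse-mode automatic differentiation. This is a deliberately tiny sketch, not Theano's API (which uses `theano.tensor` variables, `theano.function`, and `theano.grad`, and compiles the graph rather than evaluating eagerly), and its gradient sweep skips topological ordering, so it is only adequate for small graphs like this one:

```python
class Sym:
    """Tiny symbolic value: arithmetic records (parent, local-gradient)
    edges while also computing the value eagerly for simplicity."""
    def __init__(self, name=None, value=0.0):
        self.name, self.value = name, value
        self.parents = []  # list of (parent_node, local_gradient)

    def __add__(self, other):
        out = Sym(value=self.value + other.value)
        out.parents = [(self, 1.0), (other, 1.0)]
        return out

    def __mul__(self, other):
        out = Sym(value=self.value * other.value)
        out.parents = [(self, other.value), (other, self.value)]
        return out

def grad(output, wrt):
    """Reverse-mode sweep over the recorded graph. NOTE: no topological
    ordering, so this is a sketch adequate only for small expressions."""
    grads = {id(output): 1.0}
    stack = [output]
    while stack:
        node = stack.pop()
        g = grads.get(id(node), 0.0)
        for parent, local in node.parents:
            grads[id(parent)] = grads.get(id(parent), 0.0) + g * local
            stack.append(parent)
    return grads.get(id(wrt), 0.0)

x = Sym('x', 3.0)
y = Sym('y', 4.0)
z = x * y + x  # builds the graph: d z/d x = y + 1, d z/d y = x
```

Because the full graph is available before execution, a system like Theano can also rewrite it, e.g. fusing operations or reordering floating-point math for stability, before generating code.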