Get Started

Get Started

Accelerate is a language for data-parallel array computations embedded within the programming language Haskell. More specifically, it is a deeply embedded language. This means that when you write programs with Accelerate, you are writing a Haskell program using operations from the Accelerate library, but the method by which the program runs is different from a conventional Haskell program. A program written in Accelerate is actually a Haskell program that generates, optimises, and compiles code for the GPU or CPU on-the-fly at program runtime.

To get started you will need to set up a Haskell environment as well as a few external libraries.

1. Setup Haskell & LLVM

Selected operating system: macOS

1.1 GHC

Download and install GHC. The minimal binary distribution contains GHC-8.0 plus build tools such as cabal, and is the generally recommended option over the full install (which additionally includes some pre-installed libraries). Instructions for installing via the Homebrew or MacPorts package managers are also available on that page.

1.2 LLVM

Executing an Accelerate program differs from that of regular Haskell programs. Programs written in Accelerate require both the Accelerate library, which contains the operations of the language we use to write programs, as well as a (or several) backend(s) which will compile and execute the program for a particular target architecture, such as CPUs or GPUs.

The two primary Accelerate backends are currently based on LLVM, a mature optimising compiler targeting several architectures. LLVM is available through both the Homebrew and MacPorts package managers, or can be compiled manually from the source releases found here. If compiling from source be sure to build LLVM with the libLLVM shared library.¹

Example of installing LLVM-4.0 via Homebrew:

brew install libffi
brew install llvm-hs/homebrew-llvm/llvm-4.0

1.3 CUDA (optional)

If you have a CUDA capable NVIDIA GPU (see the list of supported devices) and would like to run Accelerate programs on the GPU, you will need to download and install the CUDA toolkit available here.

2. Install Accelerate

The Haskell ecosystem has two tools to help with building and installing packages: cabal (the default) which installs packages to a global location, and stack, which has a more project-centric focus.

2.1 Using cabal

We can now install the core Accelerate library:

cabal install accelerate

This will install the current stable release of Accelerate from Hackage. If you would like to instead install the latest in-development version, see how to install from GitHub.

This is sufficient to write programs in Accelerate as well as execute them using the included interpreter backend.² For good performance however we also need to install one (or both) of the LLVM backends, which will compile Accelerate programs to native code.

Install a version of the llvm-hs package suitable for the version of LLVM installed in step 1.2. The first two numbers of the version of LLVM and the llvm-hs package must match. We must also install with shared library support so that we can use llvm-hs from within ghci and Template Haskell. Continuing the example above where we installed LLVM-4.0:

cabal install llvm-hs -fshared-llvm --constraint="llvm-hs==4.0.*"

Install the Accelerate LLVM backend for multicore CPUs:

cabal install accelerate-llvm-native

(Optional) If you have a CUDA capable GPU and installed the CUDA toolkit in step 1.3, you can also install the Accelerate backend for NVIDIA GPUs:

cabal install accelerate-llvm-ptx

2.2 Using stack

You can use Accelerate in a stack-based workflow by including the following (or similar) into the stack.yaml file of your project:

resolver: lts-9.0
extra-deps:
- 'accelerate-llvm-1.0.0.0'
- 'accelerate-llvm-native-1.0.0.0'
- 'accelerate-llvm-ptx-1.0.0.0'
- 'cuda-0.7.5.3'
- 'llvm-hs-4.0.1.0'
- 'llvm-hs-pure-4.0.0.0'
flags:
  llvm-hs:
    shared-llvm: true

3. Run an Accelerate program

Copy the following content into a file called Dotp.hs. This simple example computes the dot product of two vectors of single-precision floating-point numbers. If you installed the GPU backend in step 2, you can uncomment the third line (delete the leading --) to enable both the CPU and GPU backends.

import Data.Array.Accelerate              as A
import Data.Array.Accelerate.LLVM.Native  as CPU
-- import Data.Array.Accelerate.LLVM.PTX     as GPU

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

Open up a terminal and load the file into the Haskell interpreter with ghci Dotp.hs.

Create some arrays to feed into the computation. See the documentation for more information, as well as additional ways to get data into the program.
```
ghci> let xs = fromList (Z:.10) [0..]   :: Vector Float
ghci> let ys = fromList (Z:.10) [1,3..] :: Vector Float
```
Run the computation:
```
ghci> CPU.run $ dotp (use xs) (use ys)
Scalar Z [615.0]
```
This will convert the Accelerate program into LLVM code, optimise, compile, and execute it on the CPU. If your computer has multiple CPU cores, you can execute using multiple CPU cores by launching ghci (or running a compiled program) with the additional command line options +RTS -Nx -RTS, to use x CPU cores (or omit x to use as many cores as your machine has).
(Optional) If you installed the accelerate-llvm-ptx backend, you can also execute the computation on the GPU simply by:
```
ghci> GPU.run $ dotp (use xs) (use ys)
Scalar Z [615.0]
```
This will instead convert the Accelerate program into LLVM code suitable for the GPU, optimise, compile, and execute it on the GPU, as well as copy the input arrays into GPU memory and copy the result back into CPU memory.

4. Next steps

Congratulations, you are set up to use Accelerate! Now you are ready to:

Include the build options -DLLVM_BUILD_LLVM_DYLIB=True and -DLLVM_LINK_LLVM_DYLIB=True.↩
Although the core accelerate package includes an interpreter that can be used to run Accelerate programs, its performance is fairly poor as it is designed as a reference implementation of the language semantics, rather than for performance.↩