List of parallel programming patterns
CS 300 (PDC)
Matt Wolff, Georgia Tech and Oak Ridge National Laboratory:
Problem parallelism, e.g., must sort lots of data
Algorithm/Software parallelism, e.g., merge sort, doing pieces in parallel
System parallelism, e.g., OpenMP in C++ or MPI in Fortran
Hardware/Platform parallelism, e.g., multicore laptop
"It's only when you harmonize across those abstractions that you get performance, portability, robustness, and all of those other software engineering goodies. You can use MPI to implement a shared memory algorithm like Quicksort on top of a distributed memory machine, or pthreads to implement software on a uniprocessor... it's just not going to be pretty." (Wolff)
This "harmonization" is why surface acquaintance with parallelism isn't enough: broader and deeper study of parallelism is necessary to perceive how to harmonize all the layers.
"Harmonization" is also needed for sequential programming, but more tools have been developed.
For example, languages like
php
and frameworks like Ruby on Rails
are designed for building websites; FORTRAN, C, C++ have become
standard for scientific computing; Mathematica for mathematical
computing. Each has features, libraries, etc., to serve its primary
constituencies.
Influential book in OOP: Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software (1995)
Creational patterns, e.g., factories
Structural patterns, e.g., Adapter (also known as "wrapper")
Behavioral patterns, e.g., Template Method
Increasingly, the parallel computing community is searching for patterns.
Mattson et al., Patterns for Parallel Programming (2005). "A design pattern describes a good solution to a recurring problem in a particular context."
Finding concurrency design space
Algorithm structure design space
Supporting structures design space
Implementation mechanisms design space
Compare to Wolff's comments...
Finding concurrency
Decomposition patterns: Task Decomposition; Data Decomposition
Dependency analysis patterns: Group Tasks; Order Tasks; Data Sharing
Design Evaluation pattern
Algorithm Structure
Organize by tasks patterns: Task Parallelism; Divide and Conquer
Organize by data decomposition patterns: Geometric Decomposition; Recursive Data
Organize by flow of data patterns: Pipeline; Event-based Coordination
Supporting structures
Program structures patterns: SPMD; Master/Worker; Loop Parallelism; Fork/Join
Data structures patterns: Shared Data; Shared Queue; Distributed Array
Implementation mechanisms
UE (unit of execution) management: Thread creation/destruction; process creation/destruction; etc.
Synchronization: Memory synchronization and fences; barriers; mutual exclusion; etc.
Communication: Message passing; collective communication; other
We have seen many of these patterns in our work.
Drawn from Mattson et al.
Note: Each of the patterns below represents a design question. For example, there may be numerous approaches to Data Decomposition for a given problem and data set.
A computational problem typically comprises a stream of instructions and a collection of data. This suggests two dimensions along which to parallelize the problem, each with its own efficiency considerations.
Task Decomposition
Decomposing the stream of instructions into tasks that can be executed simultaneously (and are relatively independent of each other)
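For instance, a minimal C++ sketch of task decomposition, assuming two computations (a checksum and an even-count, both invented for illustration) that are independent of each other and can therefore run as separate tasks:

    #include <future>
    #include <iostream>
    #include <numeric>
    #include <vector>

    // Two independent computations (hypothetical): neither needs the other's result.
    long checksum(const std::vector<int>& v) {
        return std::accumulate(v.begin(), v.end(), 0L);
    }
    long count_even(const std::vector<int>& v) {
        long n = 0;
        for (int x : v) if (x % 2 == 0) ++n;
        return n;
    }

    int main() {
        std::vector<int> data(1000000, 3);
        // Task decomposition: the two independent tasks run concurrently.
        auto t1 = std::async(std::launch::async, checksum, std::cref(data));
        auto t2 = std::async(std::launch::async, count_even, std::cref(data));
        std::cout << t1.get() << ' ' << t2.get() << '\n';
    }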
______
Data Decomposition
Splitting the data into chunks (that can be operated on relatively independently)
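A sketch of data decomposition in C++: the array is split into contiguous chunks, one per thread, and each thread writes only its own partial result (the thread count of 4 is an arbitrary choice):

    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        const int P = 4;                          // number of worker threads (assumed)
        std::vector<double> data(1000000, 0.5);
        std::vector<double> partial(P, 0.0);      // one slot per thread: no sharing
        std::vector<std::thread> workers;

        // Data decomposition: each thread owns one contiguous chunk of the array.
        const size_t chunk = data.size() / P;
        for (int p = 0; p < P; ++p) {
            size_t lo = p * chunk;
            size_t hi = (p == P - 1) ? data.size() : lo + chunk;
            workers.emplace_back([&, p, lo, hi] {
                partial[p] = std::accumulate(data.begin() + lo, data.begin() + hi, 0.0);
            });
        }
        for (auto& t : workers) t.join();
        std::cout << std::accumulate(partial.begin(), partial.end(), 0.0) << '\n';
    }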
______
Group Tasks
Group tasks in order to simplify the management of dependencies
______
Order Tasks
Satisfying constraints among those groups of tasks
______
Data Sharing
Sharing data among the tasks
______
Design Evaluation
Assessing the decomposition and dependency analysis
______
Task Parallelism
Given a task decomposition, exploit the parallelism efficiently
______
Divide and Conquer
Given a sequential divide-and-conquer strategy, exploit the parallelism efficiently
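A sketch of a parallel divide-and-conquer sum in C++: split the range in half, conquer one half in a new task and the other in the current one, then combine. The sequential cutoff of 100,000 elements is an assumed tuning knob, not a recommendation:

    #include <future>
    #include <iostream>
    #include <numeric>
    #include <vector>

    // Parallel divide-and-conquer sum: split until the range is small, then
    // fall back to a sequential base case (the cutoff is an assumed value).
    long psum(const std::vector<int>& v, size_t lo, size_t hi) {
        if (hi - lo < 100000)                       // base case: sequential
            return std::accumulate(v.begin() + lo, v.begin() + hi, 0L);
        size_t mid = lo + (hi - lo) / 2;
        auto left = std::async(std::launch::async, psum, std::cref(v), lo, mid);
        long right = psum(v, mid, hi);              // conquer the other half here
        return left.get() + right;                  // combine
    }

    int main() {
        std::vector<int> v(1000000, 1);
        std::cout << psum(v, 0, v.size()) << '\n';  // prints 1000000
    }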
______
Geometric Decomposition
Given a data decomposition into chunks, organize the algorithm around updating those chunks concurrently
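A sketch of geometric decomposition in C++, assuming a 1-D smoothing (stencil) step: each thread owns a slab of the grid and reads neighbors from the old copy, so double buffering removes the need for locks within a step:

    #include <iostream>
    #include <thread>
    #include <vector>

    // Geometric decomposition of a 1-D smoothing step; slab count is assumed.
    int main() {
        const int P = 4;
        std::vector<double> old_u(1000, 1.0), new_u(1000, 0.0);
        old_u[500] = 100.0;                          // a bump to smooth out

        std::vector<std::thread> slabs;
        const size_t n = old_u.size(), chunk = n / P;
        for (int p = 0; p < P; ++p) {
            size_t lo = (p == 0) ? 1 : p * chunk;
            size_t hi = (p == P - 1) ? n - 1 : (p + 1) * chunk;
            slabs.emplace_back([&, lo, hi] {
                for (size_t i = lo; i < hi; ++i)     // interior points of this slab
                    new_u[i] = (old_u[i - 1] + old_u[i] + old_u[i + 1]) / 3.0;
            });
        }
        for (auto& t : slabs) t.join();
        std::cout << new_u[500] << '\n';
    }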
______
Recursive Data
Given a problem involving operations on recursive data structures, perform those operations efficiently in parallel
______
Pipeline
Given a multi-stage computation, exploit the parallelism efficiently
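A minimal two-stage pipeline sketch in C++: stage 1 produces squares, stage 2 prints them. The stages are connected by a mutex/condition-variable queue, with an empty optional as the end-of-stream marker (a convention chosen here, not part of the pattern):

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <optional>
    #include <queue>
    #include <thread>

    int main() {
        std::queue<std::optional<int>> q;            // connects the two stages
        std::mutex m;
        std::condition_variable cv;

        std::thread stage2([&] {
            while (true) {
                std::unique_lock<std::mutex> lk(m);
                cv.wait(lk, [&] { return !q.empty(); });
                auto item = q.front(); q.pop();
                lk.unlock();
                if (!item) break;               // end-of-stream marker
                std::cout << *item << '\n';     // stage 2: consume
            }
        });

        for (int i = 1; i <= 5; ++i) {          // stage 1: produce
            { std::lock_guard<std::mutex> lk(m); q.push(i * i); }
            cv.notify_one();
        }
        { std::lock_guard<std::mutex> lk(m); q.push(std::nullopt); }
        cv.notify_one();

        stage2.join();
    }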
______
Event-based Coordination
Given an event-driven computation with ordering constraints among the events, exploit the parallelism efficiently
______
SPMD (Single Program, Multiple Data)
Structure the UEs (units of execution) of a parallel computation for better manageability and easier integration with the core computation
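An SPMD sketch using the MPI C API (the rank-cyclic split and the sum-of-0..99 task are illustrative choices): every process runs the same program and branches only on its rank, with a collective reduction combining the per-rank results. Typically compiled with an MPI wrapper such as mpicxx and launched with mpirun:

    #include <mpi.h>
    #include <cstdio>

    // SPMD: every process runs this same program; behavior differs only by rank.
    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        long local = 0;
        for (int i = rank; i < 100; i += size)   // cyclic split of the index range
            local += i;

        long total = 0;                          // collective communication
        MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) std::printf("sum = %ld\n", total);  // only the root prints

        MPI_Finalize();
    }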
______
Master/Worker
Organize a computation when the work must be balanced dynamically among a set of UEs
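A master/worker sketch in C++ where the "bag of tasks" is simply a shared atomic counter (task count, worker count, and the uneven fake workloads are all invented for illustration); fast workers claim more tasks, so the load balances dynamically:

    #include <atomic>
    #include <iostream>
    #include <thread>
    #include <vector>

    int main() {
        const int n_tasks = 20, n_workers = 4;
        std::atomic<int> next{0};                // the shared "bag of tasks"
        std::vector<long> result(n_tasks, 0);

        auto worker = [&] {
            while (true) {
                int t = next.fetch_add(1);       // atomically claim a task
                if (t >= n_tasks) return;        // bag is empty
                long s = 0;                      // fake task: cost grows with t
                for (long i = 0; i < (t + 1) * 100000L; ++i) s += i % 7;
                result[t] = s;
            }
        };

        std::vector<std::thread> pool;
        for (int w = 0; w < n_workers; ++w) pool.emplace_back(worker);
        for (auto& th : pool) th.join();
        std::cout << "task 19 -> " << result[19] << '\n';
    }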
______
Loop Parallelism
Parallelizing computationally intensive loops
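A loop-parallelism sketch using OpenMP in C++ (compile with -fopenmp or your compiler's equivalent): the pragma splits the iterations among threads, and the reduction clause gives each thread a private partial sum that is combined at the end:

    #include <cstdio>

    int main() {
        const int n = 10000000;
        double sum = 0.0;
        // The pragma divides the iteration space among threads; reduction(+:sum)
        // gives each thread a private sum, combined when the loop finishes.
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += 1.0 / (i + 1.0);
        std::printf("harmonic(%d) = %f\n", n, sum);
    }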
______
Fork/Join
Constructing a program in which potentially interrelated tasks come and go dynamically
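A bare fork/join sketch in C++: the main thread forks a group of child threads, each computes its piece, and the program joins them all before reading the results (four tasks is an arbitrary choice):

    #include <iostream>
    #include <thread>
    #include <vector>

    int main() {
        const int n = 4;
        std::vector<long> out(n, 0);
        std::vector<std::thread> group;

        for (int i = 0; i < n; ++i)              // fork
            group.emplace_back([&out, i] { out[i] = (long)i * i; });
        for (auto& t : group) t.join();          // join: wait for all children

        for (long v : out) std::cout << v << ' ';
        std::cout << '\n';
    }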
______
Shared Data
Explicitly and correctly managing data shared among parallel tasks
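A shared-data sketch in C++, assuming a histogram updated by two threads: the mutex makes each update atomic; without it, concurrent increments of the same bin would be a data race and updates could be lost:

    #include <iostream>
    #include <mutex>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<int> hist(10, 0);            // shared data
        std::mutex m;

        auto work = [&](int seed) {
            for (int i = 0; i < 100000; ++i) {
                int bin = (seed * 7 + i) % 10;
                std::lock_guard<std::mutex> lk(m);   // critical section
                ++hist[bin];
            }
        };

        std::thread a(work, 1), b(work, 2);
        a.join(); b.join();
        int total = 0;
        for (int c : hist) total += c;
        std::cout << total << '\n';              // 200000 updates, none lost
    }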
______
Shared Queue
Correctly sharing a queue data structure among concurrent UEs
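A minimal thread-safe queue sketch in C++ (the class name SharedQueue is invented for illustration): all access goes through a mutex, and pop blocks on a condition variable until an item arrives:

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    template <typename T>
    class SharedQueue {
        std::queue<T> q_;
        std::mutex m_;
        std::condition_variable cv_;
    public:
        void push(T v) {
            { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(v)); }
            cv_.notify_one();
        }
        T pop() {                                // blocks until non-empty
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return !q_.empty(); });
            T v = std::move(q_.front());
            q_.pop();
            return v;
        }
    };

    int main() {
        SharedQueue<int> q;
        std::thread producer([&] { for (int i = 0; i < 5; ++i) q.push(i); });
        int sum = 0;
        for (int i = 0; i < 5; ++i) sum += q.pop();   // consumer side
        producer.join();
        std::cout << sum << '\n';                     // 0+1+2+3+4 = 10
    }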
______
Distributed Array
Creating readable and efficient programs that partition arrays among multiple UEs
______
UE management: Thread creation/destruction; process creation/destruction
______
Synchronization: Memory synchronization and fences; barriers; mutual exclusion
______
Communication: Message passing; collective communication; other
______