Introduction
CS 300 (PDC)
What is parallel computing?
Some general terms
- process
- The execution of a program
- processing element
- A unit of circuitry for carrying out computation.
- memory
- "Local" storage. Memory hierarchy.
- interconnection network
- Connections between processing elements and memories.
- host
- A computer in a distributed system.
- degree of parallelism
- Number of processing elements used in a parallel algorithm.
- remote
- Separated by a computer network, e.g., www.gmail.com is a remote host from a St. Olaf host.
- local
- Nearby; on the host you're using (when contrasted with remote).
- resource
- In general: memory/storage, processors, network. In a particular context: anything needed for a computation (e.g., a piece of information, a variable, a network connection, ...).
- protocol
- Rules for correct communication in a computation. Examples: programmer-defined; HTTP, FTP, SSH, TCP/IP, Ethernet
- sequential computation
- A sequential computation takes place on a single processing element.
- parallel computation
- Uses multiple processing elements to carry out a task.
- scale
- Relative size, thought of as orders of magnitude. A major concern: does a (protocol, algorithm, system) "scale up"?
- robust
- A robust system keeps working in the face of challenges.
- fault-tolerant
- Capacity for a system or algorithm to continue working correctly, even when there are failures.
- ______
- ______
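To make the terms sequential computation, parallel computation, and degree of parallelism concrete, here is a minimal Python sketch. The function names and the sum-of-squares task are made up for illustration. A thread pool stands in for a set of processing elements; in CPython, threads show the structure of the computation but may not speed up CPU-bound work, in which case `ProcessPoolExecutor` could be swapped in.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(chunk):
    """Work done by one worker on its share of the data."""
    return sum(x * x for x in chunk)

def sequential_sum(data):
    """Sequential computation: a single worker handles all the data."""
    return sum_of_squares(data)

def parallel_sum(data, degree):
    """Parallel computation: split the data among `degree` workers."""
    # Strided slices give each worker roughly len(data) / degree items.
    chunks = [data[i::degree] for i in range(degree)]
    with ThreadPoolExecutor(max_workers=degree) as pool:
        partial_sums = pool.map(sum_of_squares, chunks)
        # Combine the partial results into the final answer.
        return sum(partial_sums)

if __name__ == "__main__":
    data = list(range(1000))
    print(sequential_sum(data), parallel_sum(data, degree=4))
```

Both functions compute the same result; only the number of workers sharing the task (the degree of parallelism) differs.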
Standalone computing means computing on a single computer, as perceived by the user. In the strictest sense, no network connections would be used in standalone computing, although we will consider a computation as "standalone" if the only networking used is a connection to a single computer located elsewhere, on which a task's computation is performed.
A lot of processor-level and computer-level parallel computing takes place under the surface when one uses standalone computing. However, the user or programmer need not explicitly direct that parallelism in order to use or program the system.
Levels of parallelism: some general categories.
Processor-level, i.e., within a processor, among microarchitecture components
Computer-level, i.e., within a computer enclosure, among processors
Distributed, i.e., within a computer network, among computers
Interactive, batch
Example applications
client-server applications: One host, the server (which is typically a remote computer), acts as a source for computing resource(s); other hosts obtain those resources over a network. Example resources: files via the NFS protocol (a file server); computation, such as CPET providing Scheme computation to the wiki; web pages via the HTTP protocol (web server; a browser is a web client).
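A minimal sketch of the client-server pattern, using Python's standard `socket` module. The resource names and the one-line request/reply protocol are hypothetical (a programmer-defined protocol, not NFS or HTTP). For convenience the server and client run in one process on the local host; in a real deployment the server would typically be a remote host.

```python
import socket
import threading

# Hypothetical resources offered by the server; names are made up.
RESOURCES = {"greeting": "hello from the server", "course": "CS 300"}

def serve_one_request(listener):
    """Server side: accept one client, look up the requested resource, reply."""
    conn, _addr = listener.accept()
    with conn:
        name = conn.recv(1024).decode("utf-8").strip()
        reply = RESOURCES.get(name, "ERROR: no such resource")
        conn.sendall(reply.encode("utf-8"))

def request_resource(port, name):
    """Client side: connect, send a resource name, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(name.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        return sock.recv(1024).decode("utf-8")

def demo(name):
    """Run a one-shot server and a client together so the sketch is self-contained."""
    # Port 0 asks the operating system to pick any free port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_one_request, args=(listener,))
        server.start()
        reply = request_resource(port, name)
        server.join()
    return reply

if __name__ == "__main__":
    print(demo("greeting"))
```

The server owns the resources; the client only knows how to ask for them by name, which is the essence of the pattern.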
parallel BLAST: The BLAST algorithm efficiently compares a given genetic sequence against a collection of known genetic sequences, identifying common subsequences and assessing how many changes would be required to transform the given sequence to each of the known sequences. Parallel BLAST performs these comparisons in parallel.
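The idea of comparing one sequence against many known sequences in parallel can be sketched as follows. This is not the BLAST algorithm: it substitutes plain edit distance (the number of single-character insertions, deletions, and substitutions) as a simple stand-in for "how many changes would be required", and the function names are made up. A thread pool stands in for the parallel workers.

```python
from concurrent.futures import ThreadPoolExecutor

def edit_distance(a, b):
    """Minimum number of single-character edits needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def compare_all(query, known, workers=4):
    """Compare the query against every known sequence, one task per sequence."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        distances = list(pool.map(lambda seq: edit_distance(query, seq), known))
    return dict(zip(known, distances))

if __name__ == "__main__":
    print(compare_all("GATTACA", ["GATTACA", "GATTTCA", "GAT"]))
```

Each comparison is independent of the others, which is what makes this kind of workload easy to spread across processing elements.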
map-reduce: A three-phase framework for efficient, robust, scalable parallel computation with large data sets, adapted by Google for web computations from an earlier LISP (cf. Scheme) problem-solving strategy.
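A small word-count sketch of the map-reduce idea, assuming the three phases are the commonly described map, shuffle (group by key), and reduce. It runs on one machine with a thread pool and made-up function names; it is only an illustration of the structure, not Google's framework.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_phase(document):
    """Map: turn one document into a list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(mapped_lists):
    """Shuffle: gather all values that share the same key."""
    groups = defaultdict(list)
    for pairs in mapped_lists:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(item):
    """Reduce: combine the values for one key into a single result."""
    key, values = item
    return key, sum(values)

def word_count(documents, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, documents))   # one map task per document
        groups = shuffle_phase(mapped)
        return dict(pool.map(reduce_phase, groups.items()))  # one reduce task per key

if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog", "The end"]))
```

The map tasks are independent of one another, and so are the reduce tasks, so each phase can be spread over many workers.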
High-performance computing (HPC) focuses on delivering large amounts of computing for relatively short amounts of time, e.g., minutes or days.
High-throughput computing (HTC) also focuses on delivering large amounts of computing, but for longer amounts of time, e.g., months or years [Condor project].
Forms of parallelism
Processor level: pipeline, superscalar, array, vector, etc. (HD course)
Computer level: shared memory multiprocessor, pipeline, "supercomputer"
Distributed level: client-server, remote procedure call, streaming (cf. pipeline), distributed operating system, cluster
cycle scavenging; SETI@home
Beowulf clusters
A Beowulf cluster is a networked collection of off-the-shelf computers that are used together to solve (large-scale) computing problems.
Nodes, head node, worker node.
dedicated clusters, rack-mounted clusters
St. Olaf's Beowulf clusters
(Up to) three clusters:
Helios, MistRider, Castaway
Admin nodes: helios.public.stolaf.edu, mist.public.stolaf.edu
Documentation: Beowiki (http://www.cs.stolaf.edu/projects/bw/)
Some history...