Introduction
CS 300 (PDC)
Some general terms
For example, www.gmail.com is a remote host from a St. Olaf host.

Standalone computing means computing on a single computer, as perceived by the user. In the strictest sense, standalone computing would use no network connections, although we will consider a computation "standalone" if the only networking used is a connection to a single computer located elsewhere, on which that task's computation is performed.
A great deal of processor-level and computer-level parallel computing takes place under the surface when one uses standalone computing. However, the user or programmer need not explicitly direct that parallelism in order to use or program the system.
Levels of parallelism: some general categories.
Processor-level, i.e., within a processor, among microarchitecture components
Computer-level, i.e., within a computer enclosure, among processors
Distributed, i.e., within a computer network, among computers
Interactive, batch
Example applications
client-server applications: One host, the server (which is typically a remote computer), acts as a source for computing resource(s); other hosts obtain those resources over a network. Example resources: files via the NFS protocol (a file server); computation, such as CPET providing Scheme computation to the wiki; web pages via the HTTP protocol (web server; a browser is a web client).
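The client-server pattern above can be sketched in a few lines of Python. This is a toy illustration, not the NFS or HTTP protocols mentioned: the server offers a "resource" (here, upper-casing text) and a client obtains it over a socket connection on the loopback interface.

```python
# Minimal client-server sketch: one server thread provides a resource
# (upper-cased text); the client requests it over a network socket.
import socket
import threading

def serve_once(server_sock):
    """Server side: accept one client and return its text upper-cased."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def request(port, text):
    """Client side: connect to the server and fetch the resource."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(text.encode())
        return sock.recv(1024).decode()

server = socket.socket()              # defaults: IPv4, TCP stream
server.bind(("127.0.0.1", 0))         # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=serve_once, args=(server,))
t.start()
reply = request(port, "hello from the client")
t.join()
server.close()
print(reply)                          # HELLO FROM THE CLIENT
```

A real server would loop over `accept()` and serve many clients concurrently; the single-request version keeps the roles (one host supplies a resource, another obtains it) easy to see.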
parallel BLAST: The BLAST algorithm efficiently compares a given genetic sequence against a collection of known genetic sequences, identifying common subsequences and assessing how many changes would be required to transform the given sequence into each of the known sequences. Parallel BLAST performs these comparisons in parallel.
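The structure of that computation can be sketched with a toy stand-in: BLAST itself uses fast heuristics, but plain edit distance also counts the changes (insertions, deletions, substitutions) needed to transform the query into each known sequence, and the independent comparisons can run in parallel. The sequences below are made-up examples.

```python
# Toy parallel sequence comparison: one edit-distance computation per
# known sequence, with the comparisons dispatched to a worker pool.
from concurrent.futures import ThreadPoolExecutor

def edit_distance(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete
                           cur[j - 1] + 1,              # insert
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

query = "GATTACA"
known = {"seq1": "GATTTACA", "seq2": "GCATGCA", "seq3": "GATTACA"}

# Each comparison is independent, so the pool can run them in parallel;
# map() returns results in the order the sequences were submitted.
with ThreadPoolExecutor() as pool:
    dists = dict(zip(known,
                     pool.map(lambda s: edit_distance(query, s),
                              known.values())))
print(dists)   # {'seq1': 1, 'seq2': 3, 'seq3': 0}
```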
map-reduce: A three-phase framework for efficient, robust, scalable parallel computation with large data sets, adapted by Google for web computations from an earlier LISP (cf. Scheme) problem-solving strategy.
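The three phases can be sketched with a word count, the standard map-reduce example. This is a single-machine toy, not Google's distributed implementation: map emits (key, value) pairs, a shuffle phase groups pairs by key, and reduce combines each group's values.

```python
# Three-phase map-reduce sketch: map -> shuffle (group by key) -> reduce.
from collections import defaultdict
from functools import reduce

def map_phase(line):
    """Map: emit a (word, 1) pair for each word in the input line."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(values):
    """Reduce: combine one key's values (here, by summing counts)."""
    return reduce(lambda a, b: a + b, values)

lines = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = {word: reduce_phase(vals) for word, vals in shuffle(pairs).items()}
print(counts["the"], counts["fox"])   # 3 2
```

The appeal for large data sets is that the map calls are independent (one per input record) and the reduce calls are independent (one per key), so both phases parallelize across many machines, with the shuffle as the only global communication step.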
High-performance computing (HPC) focuses on delivering large amounts of computing for relatively short amounts of time, e.g., minutes or days.
High-throughput computing (HTC) also focuses on delivering large amounts of computing, but for longer amounts of time, e.g., months or years [Condor project].
Processor level: pipeline, superscalar, array, vector, etc. (HD course)
Computer level: shared memory multiprocessor, pipeline, "supercomputer"
Distributed level: client-server, remote procedure call, streaming (cf. pipeline), distributed operating system, cluster
cycle scavenging; SETI@home
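The "streaming (cf. pipeline)" entry above can be sketched with Python generators: each stage lazily consumes items from the previous stage, so data flows through all stages one item at a time rather than being materialized between them, much as instructions flow through a hardware pipeline.

```python
# Generator pipeline sketch of streaming: three chained stages, each
# pulling items lazily from the stage before it.
def produce(n):
    """Stage 1: generate the input stream 0, 1, ..., n-1."""
    for i in range(n):
        yield i

def square(items):
    """Stage 2: transform each item as it flows past."""
    for x in items:
        yield x * x

def keep_even(items):
    """Stage 3: filter the stream, passing only even values along."""
    for x in items:
        if x % 2 == 0:
            yield x

pipeline = keep_even(square(produce(10)))
result = list(pipeline)
print(result)   # [0, 4, 16, 36, 64]
```

In a distributed setting the stages would be separate processes or hosts connected by network streams, but the shape of the computation is the same.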
A Beowulf cluster is a networked collection of off-the-shelf computers that are used together to solve (large-scale) computing problems.
Nodes, head node, worker node.
dedicated clusters, rack-mounted clusters
(Up to) three clusters: Helios, MistRider, Castaway
Admin nodes: helios.public.stolaf.edu, mist.public.stolaf.edu
Documentation: Beowiki (http://www.cs.stolaf.edu/projects/bw/)
Some history...