      CS 631      
      Advanced Parallel Computing      
      Materials      
There are many books on parallel computing, covering a wide variety of
topics. Most of them are dated, however, except for the parallel
programming books.
-
Michael McCool, Arch D. Robison, James Reinders,
Structured Parallel Programming,
Morgan Kaufmann, ISBN: 978-0-124-15993-8, 2012.
-
James Reinders, Jim Jeffers,
High Performance Parallelism Pearls,
Morgan Kaufmann, ISBN: 978-0-128-02118-7, 2014.
-
James Reinders, Jim Jeffers,
High Performance Parallelism Pearls, Volume Two,
Morgan Kaufmann, ISBN: 978-0-12-803819-2, 2015.
-
Georg Hager, Gerhard Wellein,
Introduction to High Performance Computing for Scientists and Engineers,
CRC Press, Computational Science Series, 2010.
-
I. Foster,
Designing and Building Parallel Programs,
Addison Wesley, ISBN 0-201-57594-9, 1995.
The book is online!!!
-
C. Lin, L. Snyder,
Principles of Parallel Programming,
Pearson/Addison Wesley, 2009.
-
K. Hwang, G. Fox, J. Dongarra,
Distributed and Cloud Computing: From Parallel Processing to the Internet of Things,
Morgan Kaufmann, ISBN: 9780123858801, 2011.
-
J. Dongarra, I. Foster, G. Fox, W. Gropp, K. Kennedy, L. Torczon, A. White,
The Sourcebook of Parallel Computing,
Morgan Kaufmann, First Edition, ISBN 1-55860-871-0, 2003.
The entire book is online!!!
-
D. Culler, J. Singh, A. Gupta,
Parallel Computer Architecture: A Hardware/Software Approach,
Morgan Kaufmann, ISBN 1-55860-343-3, 1998.
-
G. Fox, R. Williams, P. Messina,
Parallel Computing Works,
Morgan Kaufmann, ISBN 1-55860-253-4, 1994. The entire book is online!!!
-
A. Koniges,
Industrial Strength Parallel Computing,
Morgan Kaufmann, ISBN 1-55860-540-1, 1999.
-
P. Pacheco,
An Introduction to Parallel Programming,
Morgan Kaufmann, ISBN: 9780123742605, 2011.
-
D. Kirk, W.-M. Hwu,
Programming Massively Parallel Processors: A Hands-on Approach,
Morgan Kaufmann, 2nd Edition, ISBN: 9780124159921, 2012.
-
M. Sottile, T. Mattson, C. Rasmussen,
Introduction to Concurrency in Programming Languages,
Chapman and Hall/CRC Press, ISBN: 1420072137, 2009.
Book materials (e.g., errata, links, reviews, examples) are online.
-
B. Chapman, G. Jost, R. van der Pas,
Using OpenMP: Portable Shared Memory Parallel Programming,
The MIT Press, ISBN-10: 0-262-53302-2, ISBN-13: 978-0-262-53302-7, 2007.
-
R. Chandra, R. Menon, L. Dagum, D. Kohr, D. Maydan, J. McDonald,
Parallel Programming in OpenMP,
Morgan Kaufmann, ISBN 1-55860-671-8, 2000.
-
R. Farber,
Parallel Programming with OpenACC,
Morgan Kaufmann, ISBN: 9780124103979, 2016.
-
S. Chandrasekaran, G. Juckeland,
OpenACC for Programmers: Concepts and Strategies,
Addison-Wesley, ISBN-13: 9780134694283, 2017.
-
M. Snir, S. Otto, S. Huss-Lederman, D. Walker, J. Dongarra,
MPI: The Complete Reference,
MIT Press, ISBN 95-80471, 1995.
The book is online!!!
-
W. Gropp, S. Huss-Lederman, A. Lumsdaine, E. Lusk, B. Nitzberg,
W. Saphir, M. Snir,
MPI: The Complete Reference (Volume 2 - MPI 2 Extensions),
MIT Press, ISBN 0-262-57123-4, 1998.
-
W. Gropp, E. Lusk, A. Skjellum,
Using MPI: Portable Parallel Programming with the Message Passing Interface,
MIT Press, ISBN 0-262-57133-1, 1999.
-
W. Gropp, E. Lusk, R. Thakur,
Using MPI-2: Advanced Features of the Message Passing Interface,
MIT Press, ISBN 0-262-57132-3, 1999.
-
W. Gropp, T. Hoefler, R. Thakur, E. Lusk,
Using Advanced MPI: Modern Features of the Message-Passing Interface,
MIT Press, ISBN: 9780262527637, 2014.
-
P. Pacheco,
Parallel Programming with MPI,
Morgan Kaufmann, ISBN 1-55860-339-5, 1996.
The linked website contains additional information on MPI.
-
T. Mattson, Y. He, A. Koniges,
The OpenMP Common Core: Making OpenMP Simple Again,
MIT Press, 2019.
Please pass along links to other reference texts that you find interesting.
Several leading institutions with active parallel computing research
offer parallel computing courses and post their course materials online.
Here are a few.
-
Parallel Computing, COMP 422/534, J. Mellor-Crummey,
Rice University, Spring 2020.
-
Parallel Computing, CS 525, Ananth Grama, Purdue University, Spring 2021.
-
Parallel Computing, CS 484, L. Rauchwerger, University of Illinois
at Urbana-Champaign, Spring 2020.
-
Parallel Processors, CS 258, D. Culler, University of California,
Berkeley, Spring 1999.
This course uses Culler's book and has lecture slides online.
Please pass along links to other resources that you find interesting.
-
Writing Parallel Libraries with MPI -- The Good, the Bad, and the Ugly,
Torsten Hoefler, Keynote talk at EuroMPI 2011, Sept 21st 2011, Santorini, Greece.
-
Unified Parallel C (UPC), University of California, Berkeley.
See also UPC at George Washington University.
-
J. Nieplocha, R. Harrison,
"Shared Memory Programming in Metacomputing Environments: The Global Array Approach,"
The Journal of Supercomputing, Vol. 11, pp: 119-136, 1997.
Global Arrays at PNNL.
Wikipedia.
-
D. Ozog, A. Kamil, Y. Zheng, P. Hargrove, J. Hammond, A. Malony, W. de Jong, K. Yelick,
"A Hartree-Fock Application using UPC++ and the New DArray Library,"
IEEE International Parallel and Distributed Processing Symposium,
pp: 453-462, 2016.
-
K. Yelick, D. Bonachea, W.Y. Chen, P. Colella, K. Datta, J. Duell, S.L. Graham, et al.,
"Productivity and performance using partitioned global address space languages,"
International Conference on Symbolic and Algebraic Computation,
Vol. 27, pp: 24-32, 2007.
-
F. Cantonnet, Y. Yao, M. Zahran, T. El-Ghazawi,
"Productivity analysis of the UPC language,"
18th International Parallel and Distributed Processing Symposium, 2004.
-
M. de Wael, S. Marr, Bruno de Fraine, Tom van Cutsem, Wolfgang de Meuter,
"Partitioned Global Address Space Languages,"
ACM Computing Surveys, Vol. 47, Issue 4, July 2015.
-
J. Protic, M. Tomasevic, V. Milutinovic,
"Distributed shared memory: concepts and systems,"
IEEE Parallel and Distributed Technology: Systems and Applications,
Vol. 4, Issue 2, pp: 63-71, Summer 1996.
-
B. Nitzberg, V. Lo,
"Distributed Shared Memory: A Survey of Issues and Algorithms,"
Computer, Vol. 24, Issue 8, pp: 52-60, August 1991.
-
K. Li, P. Hudak,
"Memory coherence in shared virtual memory systems,"
Fifth ACM Symposium on Principles of Distributed Computing (PODC 1986),
pp: 229-239, August 1986.
-
V. Kumar, Y. Zheng, et al.,
"HabaneroUPC++: a Compiler-free PGAS Library,"
8th International Conference on Partitioned Global Address Space Programming Models (PGAS 2014),
October 2014.
-
K. Yelick,
"Supporting Irregular Applications with Partitioned Global Address Space Languages: UPC and UPC++,"
talk at the Argonne Training Program on Extreme-Scale Computing, 2014.
-
Charm++, Parallel Programming Laboratory, University of Illinois at
Urbana-Champaign.
-
K. Wheeler, R. Murphy, D. Thain,
"Qthreads: An API for programming with millions of lightweight threads,"
IEEE International Symposium on Parallel and Distributed Processing (IPDPS), April 2008.
S. Olivier,
"Qthreads: A Library for Lightweight Threading," talk at Sandia National Laboratories, January 2016.
-
Multi-Processor Computing (MPC) Framework,
overview presented at MPI Forum, 2017.
-
P. MacArthur, Q. Liu, R. Russell, F. Mizero, M. Veeraraghavan, J. Dennis,
"An Integrated Tutorial on InfiniBand, Verbs, and MPI,"
IEEE Communications Surveys and Tutorials, Vol. 19, No. 4, Fourth Quarter 2017.
-
J. Willcock, T. Hoefler, N. Edmonds, A. Lumsdaine,
"AM++: a generalized active message framework,"
19th International Conference on Parallel Architectures and Compilation Techniques (PACT 2010),
pp: 401-410, September 2010.
Talk on AM++.
-
T. von Eicken, D. Culler, S. Goldstein, K. Schauser,
"Active messages: a mechanism for integrating communication and computation,"
In 25 years of the International Symposia on Computer Architecture, pp: 430–440, 1998.
-
S. Atchley, D. Dillow, G. Shipman, P. Geoffray, J. Squyres, G. Bosilca, R. Minnich,
"The Common Communication Interface (CCI),"
IEEE 19th Annual Symposium on High Performance Interconnects (HOTI),
August 2011.
-
T. Hoefler, S. Di Girolamo, K. Taranov, R. Grant, R. Brightwell,
"sPIN: High-performance streaming Processing in the Network,"
International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2017),
Article No. 59, November 2017.
-
M. Schliephake, X. Aguilar, E. Laure,
"Design and Implementation of a Runtime System for Parallel Numerical Simulations on Large-Scale Clusters,"
Procedia Computer Science, Vol. 4, pp: 2105-2114, 2011.
-
K. Huck, A. Malony, S. Shende, D. Jacobsen,
"Integrated Measurement for Cross-Platform OpenMP Performance Analysis,"
International Workshop on OpenMP (IWOMP):
Using and Improving OpenMP for Devices, Tasks, and More, pp: 146-160, 2014.
-
A. Qawasmeh, A. Malik, B. Chapman, K. Huck, A. Malony,
"Open Source Task Profiling by Extending the OpenMP Runtime API,"
International Workshop on OpenMP (IWOMP):
OpenMP in the Era of Low Power Devices and Accelerators, pp: 186-199, 2013.
-
M. Bari, N. Chaimov, A. Malik, K. Huck, B. Chapman, A. Malony, O. Sarood,
"ARCS: Adaptive Runtime Configuration Selection for Power-Constrained OpenMP Applications,"
IEEE International Conference on Cluster Computing (CLUSTER), pp: 461-470, 2016.
-
C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier,
"StarPU: a unified platform for task scheduling on heterogeneous multicore architectures,"
Concurrency and Computation: Practice & Experience,
Vol. 23, Issue 2, pp: 187-198, February 2011.
Slides.
-
R. Blumofe, C. Joerg, B. Kuszmaul, C. Leiserson, K. Randall, Y. Zhou,
"Cilk: an efficient multithreaded runtime system,"
Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP 1995),
pp: 207-216, July 1995.
-
G. Bosilca, A. Bouteiller, A. Danalis,
"PaRSEC: Exploiting Heterogeneity to Enhance Scalability,"
Computing in Science and Engineering, Vol. 15, Issue 6,
pp: 36-45, November 2013.
-
B. Chamberlain, D. Callahan, H. Zima,
"Parallel programmability and the Chapel language,"
International Journal of High Performance Computing Applications,
Vol. 21, Issue 3, pp: 291-312, August 2007.
-
P. Charles, C. Grothoff, V. Saraswat, C. Donawa, A. Kielstra, K. Ebcioglu, C. von Praun, V. Sarkar,
"X10: an object-oriented approach to non-uniform cluster computing,"
20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2005),
pp: 519-538, October 2005.
-
L. Dagum, R. Menon,
"OpenMP: an industry standard API for shared-memory programming,"
IEEE Computational Science and Engineering, Vol. 5, Issue 1,
pp: 46-55, January-March 1998.
-
H. Kaiser, T. Heller, B. Adelstein-Lelbach, A. Serio, D. Fey,
"HPX: A Task Based Programming Model in a Global Address Space,"
8th International Conference on Partitioned Global Address Space Programming Models (PGAS 2014),
Article No. 6, October 2014.
-
S. Seo, et al.,
"Argobots: A Lightweight Low-Level Threading and Tasking Framework,"
IEEE Transactions on Parallel and Distributed Systems,
Vol. 29, Issue 3, pp: 512-526, October 2017.
-
C. Simmendinger,
"The GASPI API: a failure tolerant PGAS API for asynchronous dataflow on heterogeneous architectures,"
Sustained Simulation Performance, pp: 17–32, 2015.
-
P. Thoman, K. Dichev, T. Heller, R. Iakymchuk, X. Aguilar, K. Hasanov, P. Gschwandtner,
P. Lemarinier, S. Markidis, H. Jordan, T. Fahringer, K. Katrinis, E. Laure,
D. Nikolopoulos,
"A taxonomy of task-based parallel programming technologies for high-performance computing,"
Journal of Supercomputing, Vol. 74, pp: 1422–1434, 2018.
-
Intel Threading Building Blocks (Intel TBB).
This is the main site for TBB.
-
Intel Parallel Building Blocks (PBB).
-
CUDA Zone, NVIDIA.
-
OpenCL, Khronos Group.
OpenCL on Wikipedia.
-
Parallel Research Kernels repository.
Please pass along additional links to other resources that you find interesting.
-
TAU Performance System, Performance Research Laboratory, University
of Oregon.
-
Scalasca, Research Centre Juelich, Germany.
-
Vampir, Technische Universität Dresden.
-
HPCToolkit, Rice University.
-
Paradyn/Dyninst, University of Wisconsin, Madison and
University of Maryland.
-
Open|SpeedShop, a DOE Office of Science and NNSA project.
-
TotalView, TotalView Technologies, A Rogue Wave Software Company.
-
MPI Performance Topics, Lawrence Livermore National Laboratory (LLNL).
-
Performance Analysis Tools, Lawrence Livermore National Laboratory (LLNL).
-
Virtual Institute -- High-Productivity Supercomputing (VI-HPS), contains
training materials from various tutorials offered by VI-HPS partners,
including the University of Oregon.
Please pass along additional links to tools that you find interesting.
-
K. Asanovic, et al.,
"A View of the Parallel Computing Landscape,"
Communications of the ACM, Vol. 52, Issue 10, pp: 56-67,
October 2009.
EECS technical report.
The Landscape of Parallel Computing Research: A View From
Berkeley. This is a Wiki with a bunch of material.
-
D. Khaldi, P. Jouvelot, C. Ancourt, F. Irigoin,
"Task Parallelism and Data Distribution: An Overview of Explicit Parallel Programming Languages,"
in H. Kasahara, K. Kimura (Eds.), Languages and Compilers for Parallel Computing,
Lecture Notes in Computer Science, Vol. 7760, Springer, 2013.
-
C.A.R. Hoare,
"Communicating Sequential Processes,"
Communications of the ACM, Vol. 21, No. 8, 1978.
-
A. Roscoe, C.A.R. Hoare,
"The laws of OCCAM programming,"
Theoretical Computer Science,
Volume 60, Issue 2, pp: 177-229, September 1988.
-
Joe Armstrong,
"Making reliable distributed systems in the presence of software errors,"
A Dissertation submitted to the Royal Institute of Technology,
December 2003.
-
V. Sunderam,
"PVM: a framework for parallel distributed computing,"
Concurrency: Practice and Experience, Vol. 2, Issue 4,
pp: 315-339, December 1990.
-
Top500 Supercomputer Sites.
-
A. Malony, B. Mohr, P. Beckman, D. Gannon, S. Yang, F. Bodin,
"Performance analysis of pC++: a portable data-parallel programming system for scalable parallel computers,"
Eighth International Parallel Processing Symposium, April 1994.
-
HPCwire, website on high-performance computing.
-
Wikipedia, Keywords: parallel computing. This site contains quite a
nice collection of information on parallel computing.
-
K. Asanovic, et al.,
The Parallel Computing Laboratory at U.C. Berkeley: A Research Agenda Based on the Berkeley View,
UCB/EECS-2008-23, EECS Department, University of California, Berkeley, March 2008.
Many of the reference texts have excellent and extensive bibliographies.
In addition, there are significant resources online for finding research
articles, books, and other references.
-
Parallel Numerical Algorithms: An Introduction, David Keyes.
This is the introductory chapter of his book of the same title.
-
MPF: A portable message passing facility for shared memory multiprocessors,
A. Malony, D. Reed, P. McGuire, ICPP, pp. 739-741, 1987.
-
Amdahl's Law (Wikipedia), includes link to Amdahl's original article.
-
Gustafson's Law (Wikipedia), includes link to Gustafson's original article.
-
Confessions of an Accidental Greenie: From Green Destiny
to the Green500, W. Feng, E2SC keynote talk, 2017.
-
K. Chandy, C. Kesselman,
"Compositional C++: Compositional parallel programming,"
International Workshop on Languages and Compilers for Parallel Computing,
pp: 124-144, 1992.
-
D. Loveman,
"High Performance Fortran,"
IEEE Parallel & Distributed Technology: Systems and Applications,
Vol. 1, Issue 1, pp: 25-42, February 1993.
-
Long and Winding Road Towards Efficient High-Performance Computing,
W. Jalby, D. Kuck, A. Malony, M. Masella, A. Mazouz, M. Popov, Proceedings of
the IEEE, Vol. 106, No. 11, pp: 1985-2003, November 2018.
-
There’s plenty of room at the Top: What will drive computer performance
after Moore’s law?
C. Leiserson, N. Thompson, J. Emer, B. Kuszmaul, B. Lampson, D. Sanchez, T. Schardl,
Science, Vol. 368, Issue 6495, 2020.
-
The Future of Computing Beyond Moore’s Law,
John Shalf,
Philosophical Transactions, Royal Society, A 378: 20190061, January 2020.
-
Exascale Applications: Skin in the Game,
F. Alexander, et al.,
Philosophical Transactions, Royal Society, A 378: 20190056, January 2020.
-
Numerical algorithms for high-performance computational science,
J. Dongarra, L. Grigori, N. Higham,
Philosophical Transactions, Royal Society, A 378: 20190066, January 2020.
-
The Parallelism Motifs of Genomic Data Analysis,
K. Yelick, et al.,
Philosophical Transactions, Royal Society, A 378: 20190394, January 2020.
-
B. de Supinski, T. Scogland, A. Duran, M. Klemm, S. Bellido,
S. Olivier, C. Terboven, T. Mattson,
The Ongoing Evolution of OpenMP,
Proceedings of the IEEE, Vol. 106, No. 11, 2018.
-
T. Mattson, T. Anderson, G. Georgakoudis,
PyOMP: Multithreaded Parallel Programming in Python,
Computing in Science and Engineering, Vol. 23, No. 6, pp. 77-80, November 2021.
PyOMP: Parallel Multithreading that is fast AND Pythonic,
SC'21 Booth talk.
-
T. Anderson, T. Mattson,
Multithreaded parallel Python through OpenMP support in Numba,
20th Python in Science Conference, pp. 140-147, 2021.
Please pass along additional links to other resources that you find interesting.