1 783 kr
Special order item. Ships within 7-10 business days. Free shipping on orders over 249 kr.
Description
Parallel Scientific Computing

Scientific computing has become an indispensable tool in numerous fields, such as physics, mechanics, biology, finance and industry. For example, thanks to efficient algorithms adapted to current computers, it enables us to simulate, without the help of models or experiments, the deflection of beams in bending, the sound level in a theater or the flow of a fluid around an aircraft wing.

This book presents the scientific computing techniques applied to parallel computing for the numerical simulation of large-scale problems; these problems result from systems modeled by partial differential equations. Computing concepts will be tackled via examples. Implementation and programming techniques resulting from the finite element method will be presented for direct solvers, iterative solvers and domain decomposition methods, along with an introduction to MPI and OpenMP.
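To give a flavor of the two programming models mentioned above, here is a minimal illustrative sketch in C, not taken from the book, of a distributed dot product combining OpenMP threading within each process with an MPI collective across processes. The slice size n_local and the vector contents are arbitrary example values.

```c
/* Illustrative sketch only (not from the book): a distributed dot product
 * combining OpenMP threading within each process and MPI across processes.
 * Each rank is assumed to own a local slice of n_local vector entries. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n_local = 1000;                 /* local slice size (example value) */
    double *x = malloc(n_local * sizeof(double));
    double *y = malloc(n_local * sizeof(double));
    for (int i = 0; i < n_local; i++) { x[i] = 1.0; y[i] = 2.0; }

    /* Thread-parallel local contribution (OpenMP reduction). */
    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < n_local; i++)
        local += x[i] * y[i];

    /* Combine the per-process contributions (MPI collective). */
    double global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("dot product = %f over %d processes\n", global, size);

    free(x);
    free(y);
    MPI_Finalize();
    return 0;
}
```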
Product information
- Publication date: 2015-12-15
- Dimensions: 165 x 241 x 25 mm
- Weight: 694 g
- Format: Hardcover
- Language: English
- Number of pages: 384
- Publisher: ISTE Ltd and John Wiley & Sons Inc
- ISBN: 9781848215818
More about the authors
Frédéric Magoulès is Professor at Ecole Centrale Paris in France and Honorary Professor at the University of Pecs in Hungary. His research focuses on parallel computing and numerical linear algebra.

François-Xavier Roux is Professor at Université Pierre et Marie Curie and an Engineer at ONERA, in France. His research focuses on parallel computing and numerical analysis.

Guillaume Houzeaux is Team Leader at the Barcelona Supercomputing Center in Spain. His research focuses on high performance computational mechanics.
Table of contents
- Preface
- Introduction
- Chapter 1. Computer Architectures
  - 1.1. Different types of parallelism
  - 1.1.1. Overlap, concurrency and parallelism
  - 1.1.2. Temporal and spatial parallelism for arithmetic logic units
  - 1.1.3. Parallelism and memory
  - 1.2. Memory architecture
  - 1.2.1. Interleaved multi-bank memory
  - 1.2.2. Memory hierarchy
  - 1.2.3. Distributed memory
  - 1.3. Hybrid architecture
  - 1.3.1. Graphics-type accelerators
  - 1.3.2. Hybrid computers
- Chapter 2. Parallelization and Programming Models
  - 2.1. Parallelization
  - 2.2. Performance criteria
  - 2.2.1. Degree of parallelism
  - 2.2.2. Load balancing
  - 2.2.3. Granularity
  - 2.2.4. Scalability
  - 2.3. Data parallelism
  - 2.3.1. Loop tasks
  - 2.3.2. Dependencies
  - 2.3.3. Examples of dependence
  - 2.3.4. Reduction operations
  - 2.3.5. Nested loops
  - 2.3.6. OpenMP
  - 2.4. Vectorization: a case study
  - 2.4.1. Vector computers and vectorization
  - 2.4.2. Dependence
  - 2.4.3. Reduction operations
  - 2.4.4. Pipeline operations
  - 2.5. Message-passing
  - 2.5.1. Message-passing programming
  - 2.5.2. Parallel environment management
  - 2.5.3. Point-to-point communications
  - 2.5.4. Collective communications
  - 2.6. Performance analysis
- Chapter 3. Parallel Algorithm Concepts
  - 3.1. Parallel algorithms for recurrences
  - 3.1.1. The principles of reduction methods
  - 3.1.2. Overhead and stability of reduction methods
  - 3.1.3. Cyclic reduction
  - 3.2. Data locality and distribution: product of matrices
  - 3.2.1. Row and column algorithms
  - 3.2.2. Block algorithms
  - 3.2.3. Distributed algorithms
  - 3.2.4. Implementation
- Chapter 4. Basics of Numerical Matrix Analysis
  - 4.1. Review of basic notions of linear algebra
  - 4.1.1. Vector spaces, scalar products and orthogonal projection
  - 4.1.2. Linear applications and matrices
  - 4.2. Properties of matrices
  - 4.2.1. Matrices, eigenvalues and eigenvectors
  - 4.2.2. Norms of a matrix
  - 4.2.3. Basis change
  - 4.2.4. Conditioning of a matrix
- Chapter 5. Sparse Matrices
  - 5.1. Origins of sparse matrices
  - 5.2. Parallel formation of sparse matrices: shared memory
  - 5.3. Parallel formation by block of sparse matrices: distributed memory
  - 5.3.1. Parallelization by sets of vertices
  - 5.3.2. Parallelization by sets of elements
  - 5.3.3. Comparison: sets of vertices and elements
- Chapter 6. Solving Linear Systems
  - 6.1. Direct methods
  - 6.2. Iterative methods
- Chapter 7. LU Methods for Solving Linear Systems
  - 7.1. Principle of LU decomposition
  - 7.2. Gauss factorization
  - 7.3. Gauss–Jordan factorization
  - 7.3.1. Row pivoting
  - 7.4. Crout and Cholesky factorizations for symmetric matrices
- Chapter 8. Parallelization of LU Methods for Dense Matrices
  - 8.1. Block factorization
  - 8.2. Implementation of block factorization in a message-passing environment
  - 8.3. Parallelization of forward and backward substitutions
- Chapter 9. LU Methods for Sparse Matrices
  - 9.1. Structure of factorized matrices
  - 9.2. Symbolic factorization and renumbering
  - 9.3. Elimination trees
  - 9.4. Elimination trees and dependencies
  - 9.5. Nested dissections
  - 9.6. Forward and backward substitutions
- Chapter 10. Basics of Krylov Subspaces
  - 10.1. Krylov subspaces
  - 10.2. Construction of the Arnoldi basis
- Chapter 11. Methods with Complete Orthogonalization for Symmetric Positive Definite Matrices
  - 11.1. Construction of the Lanczos basis for symmetric matrices
  - 11.2. The Lanczos method
  - 11.3. The conjugate gradient method
  - 11.4. Comparison with the gradient method
  - 11.5. Principle of preconditioning for symmetric positive definite matrices
- Chapter 12. Exact Orthogonalization Methods for Arbitrary Matrices
  - 12.1. The GMRES method
  - 12.2. The case of symmetric matrices: the MINRES method
  - 12.3. The ORTHODIR method
  - 12.4. Principle of preconditioning for non-symmetric matrices
- Chapter 13. Biorthogonalization Methods for Non-symmetric Matrices
  - 13.1. Lanczos biorthogonal basis for non-symmetric matrices
  - 13.2. The non-symmetric Lanczos method
  - 13.3. The biconjugate gradient method: BiCG
  - 13.4. The quasi-minimal residual method: QMR
  - 13.5. The BiCGSTAB
- Chapter 14. Parallelization of Krylov Methods
  - 14.1. Parallelization of dense matrix-vector product
  - 14.2. Parallelization of sparse matrix-vector product based on node sets
  - 14.3. Parallelization of sparse matrix-vector product based on element sets
  - 14.3.1. Review of the principles of domain decomposition
  - 14.3.2. Matrix-vector product
  - 14.3.3. Interface exchanges
  - 14.3.4. Asynchronous matrix-vector product with non-blocking communications
  - 14.3.5. Comparison: parallelization based on node and element sets
  - 14.4. Parallelization of the scalar product
  - 14.4.1. By weight
  - 14.4.2. By distributivity
  - 14.4.3. By ownership
  - 14.5. Summary of the parallelization of Krylov methods
- Chapter 15. Parallel Preconditioning Methods
  - 15.1. Diagonal
  - 15.2. Incomplete factorization methods
  - 15.2.1. Principle
  - 15.2.2. Parallelization
  - 15.3. Schur complement method
  - 15.3.1. Optimal local preconditioning
  - 15.3.2. Principle of the Schur complement method
  - 15.3.3. Properties of the Schur complement method
  - 15.4. Algebraic multigrid
  - 15.4.1. Preconditioning using projection
  - 15.4.2. Algebraic construction of a coarse grid
  - 15.4.3. Algebraic multigrid methods
  - 15.5. The Schwarz additive method of preconditioning
  - 15.5.1. Principle of the overlap
  - 15.5.2. Multiplicative versus additive Schwarz methods
  - 15.5.3. Additive Schwarz preconditioning
  - 15.5.4. Restricted additive Schwarz: parallel implementation
  - 15.6. Preconditioners based on the physics
  - 15.6.1. Gauss–Seidel method
  - 15.6.2. Linelet method
- Appendices
  - Appendix 1
  - Appendix 2
  - Appendix 3
- Bibliography
- Index
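As an illustration of the themes in Chapters 5 and 14 of the table of contents above (sparse matrices and the parallelization of the matrix-vector product), here is a minimal sketch in C, not taken from the book, of a sparse matrix-vector product in compressed sparse row (CSR) format with the row loop shared among OpenMP threads. The CSR arrays row_ptr, col_idx and val are assumed to be already assembled.

```c
/* Illustrative sketch only (not the book's code): CSR matrix-vector product
 * y = A*x, with the outer row loop parallelized using OpenMP. */
#include <omp.h>

void csr_matvec(int n_rows, const int *row_ptr, const int *col_idx,
                const double *val, const double *x, double *y)
{
    /* Each row's partial dot product is independent, so rows can be
     * computed concurrently by different threads. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_rows; i++) {
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}
```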
More from the same author
Introduction to Grid Computing
Frederic Magoules, Jie Pan, Kiat-An Tan, Abhinit Kumar
Hardcover
1 081 kr
You might also be interested in
Data Mining and Machine Learning in Building Energy Analysis
Frédéric Magoules, Hai-Xiang Zhao
Hardcover
1 783 kr