Performance optimization of numerically intensive codes (Q2723187)

The book offers a comprehensive, tutorial-style, hands-on treatment, at introductory and intermediate level, of all the essential ingredients for achieving high performance in numerical computations on modern computers. The authors explain computer architectures, data traffic, and the performance issues that arise in optimizing serial and parallel code. They bridge the gap between the literature on system architecture, the literature on numerical methods, and the occasional descriptions of optimization topics in computer vendors' literature. It is a well-written textbook for scientists, engineers, and students interested in computational science and high-performance programming.

The book is structured as follows. Chapter 2 offers a guide to the basic notions of computer architecture, namely on-chip parallelism of superscalar architectures, the memory hierarchy of RISC architectures, mapping rules for caches, a taxonomy of cache misses, TLB misses, multilevel cache configurations, parallel architectures, shared memory, distributed memory, distributed shared memory, and vector architectures. Chapter 3 sets the stage by pointing out a few basic efficiency guidelines, e.g., selecting the best algorithm, using efficient libraries, choosing an optimal data layout, and using compiler optimizations. Chapter 4 discusses timing and profiling, which are the means of determining the performance of a code. Chapter 5 discusses the crucial issues of floating-point operations, which dominate number-crunching scientific code. Chapter 6 discusses issues pertinent to optimizing memory access, which is the major bottleneck on machines with a memory hierarchy. Chapter 7 offers brief discussions of further optimization techniques, e.g., balancing the load of the functional units. Parallel optimization is the topic of Chapter 8. The book closes with some serial and parallel case studies.

reviewed by

Paul Molitor
