Performance optimization of numerically intensive codes (Q2723187)
From MaRDI portal
scientific article; zbMATH DE number 1614134
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Performance optimization of numerically intensive codes | scientific article; zbMATH DE number 1614134 | |
Statements
3 July 2001
parallel optimization
computer architectures
data traffic
code optimization
high-performance programming
Performance optimization of numerically intensive codes (English)
The book offers a comprehensive, tutorial-style, hands-on treatment, at an introductory to intermediate level, of all essential ingredients for achieving high performance in numerical computations on modern computers. The authors explain computer architectures, data traffic, and performance issues in the optimization of serial and parallel code. They bridge the gap between the literature on system architecture, the literature on numerical methods, and the occasional descriptions of optimization topics in computer vendors' literature. It is a well-written textbook for scientists, engineers, and students interested in computational science and high-performance programming.

The book is structured as follows. Chapter 2 offers a guide through basic notions of computer architecture, namely on-chip parallelism of superscalar architectures, the memory hierarchy of RISC architectures, mapping rules for caches, a taxonomy of cache misses, TLB misses, multilevel cache configurations, parallel architectures, shared memory, distributed memory, distributed shared memory, and vector architectures. Chapter 3 sets the stage by pointing out a few basic efficiency guidelines, e.g., selection of the best algorithm, use of efficient libraries, optimal data layout, and use of compiler optimizations. Chapter 4 discusses timing and profiling, which are the means of determining the performance of a code. Chapter 5 discusses the crucial issues of floating-point operations, which dominate in number-crunching scientific code. Chapter 6 discusses the optimization of memory access, which is the major bottleneck on machines with a memory hierarchy. Chapter 7 offers brief discussions of further optimization techniques, e.g., balancing the load of the functional units. Parallel optimization is the topic of Chapter 8. The book closes with some serial and parallel case studies.