ScaLAPACK: A portable linear algebra library for distributed memory computers -- design issues and performance
From MaRDI portal
Publication:1294620
DOI: 10.1016/0010-4655(96)00017-3
zbMath: 0926.65148
OpenAlex: W4239025233
MaRDI QID: Q1294620
Publication date: 30 November 1999
Published in: Computer Physics Communications
Full work available at URL: https://doi.org/10.1016/0010-4655(96)00017-3
Keywords: software; performance; parallel algorithms; parallel computers; ScaLAPACK; distributed memory computers; distributed linear algebra machine; linear algebra computations
Mathematics Subject Classification: Symbolic computation and algebraic computation (68W30); Parallel numerical computation (65Y05); Packaged methods for numerical algorithms (65Y15); Numerical linear algebra (65Fxx)
Related Items
- Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures
- An efficient parallel high-order compact scheme for the 3D incompressible Navier–Stokes equations
- Scaling Up Parallel Computation of Tiled QR Factorizations by a Distributed Scheduling Runtime System and Analytical Modeling
- High performance verified computing using C-XSC
- Parallel multivariate slice sampling
- An efficient approach to solve very large dense linear systems with verified computing on clusters
- A sparse nonsymmetric eigensolver for distributed memory architectures
- Order \(10^4\) speedup in global linear instability analysis using matrix formation
- Key concepts for parallel out-of-core LU factorization
- An efficient hybrid tridiagonal divide-and-conquer algorithm on distributed memory architectures
- A heterogeneous parallel LU factorization algorithm based on a basic column block uniform allocation strategy
- The Impact of Data Distribution in Accuracy and Performance of Parallel Linear Algebra Subroutines
- Solving Dense Interval Linear Systems with Verified Computing on Multicore Architectures
- ScaLAPACK
- Considerations on the Implementation and Use of Anderson Acceleration on Distributed Memory and GPU-based Parallel Computers
- Practical task-oriented parallelism for Gaussian elimination in distributed memory
- Transient growth analysis of hypersonic flow over an elliptic cone
Cites Work
- Parallel matrix transpose algorithms on distributed memory concurrent computers
- The NX message passing interface
- On the correctness of some bisection-like parallel eigenvalue algorithms in floating point arithmetic
- Algorithm 656: an extended set of basic linear algebra subprograms: model implementation and test programs
- Basic Linear Algebra Subprograms for Fortran Usage
- A set of level 3 basic linear algebra subprograms