Distributed Machine Learning for Computational Engineering using MPI
From MaRDI portal
Publication: Q6352860
arXiv: 2011.01349 · MaRDI QID: Q6352860
Eric Darve, Kailai Xu, Weiqiang Zhu
Publication date: 2 November 2020
Abstract: We propose a framework for training neural networks that are coupled with partial differential equations (PDEs) in a parallel computing environment. Unlike most distributed computing frameworks for deep neural networks, we focus on parallelizing both the numerical solvers and the deep neural networks in forward and adjoint computations. Our parallel computing model views data communication as a node in the computational graph for numerical simulations. The advantage of this model is that data communication and computing are cleanly separated, which provides better flexibility, modularity, and testability. We demonstrate on several large-scale problems that substantial acceleration can be achieved by using parallel PDE solvers when training deep neural networks that are coupled with PDEs.
Has companion code repository: https://github.com/kailaix/ADCME.jl
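The abstract's central idea, treating data communication itself as a node in the computational graph so that it participates in both the forward and the adjoint (reverse-mode) passes, can be sketched in a few lines. This is a minimal illustration in plain Python, not ADCME's actual API: the `ExchangeNode` stands in for an MPI-style swap of boundary data between two ranks, and its adjoint routes gradients back along the reversed communication pattern (the adjoint of a send is a receive). All class and variable names here are hypothetical.

```python
class ExchangeNode:
    """Hypothetical graph node modeling an MPI-style swap of boundary
    values between two ranks. Forward: (a, b) -> (b, a). Adjoint:
    gradients swap back the same way, since the adjoint of a send is a
    receive and vice versa."""
    def forward(self, a, b):
        return b, a

    def adjoint(self, grad_a_out, grad_b_out):
        return grad_b_out, grad_a_out


class SquareNode:
    """A local compute node: y = x**2, with adjoint rule 2*x*grad."""
    def forward(self, x):
        self.x = x
        return x * x

    def adjoint(self, grad_out):
        return 2.0 * self.x * grad_out


# Two "ranks" each hold a scalar; they exchange, then compute locally.
exch, sq_a, sq_b = ExchangeNode(), SquareNode(), SquareNode()
a, b = 3.0, 4.0
a2, b2 = exch.forward(a, b)                 # communication as a graph node
loss = sq_a.forward(a2) + sq_b.forward(b2)  # loss = b**2 + a**2 = 25.0

# Reverse pass: apply adjoints in the opposite order of the forward pass.
ga2, gb2 = sq_a.adjoint(1.0), sq_b.adjoint(1.0)
ga, gb = exch.adjoint(ga2, gb2)  # gradients routed back through the exchange
print(loss, ga, gb)              # 25.0 6.0 8.0  (d loss/da = 2a, d loss/db = 2b)
```

Because the exchange is an ordinary node with its own forward and adjoint rules, communication can be tested, composed, and differentiated exactly like a compute kernel, which is the separation of concerns the paper emphasizes.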