A Law of Data Separation in Deep Learning

From MaRDI portal
Publication:6415581

arXiv2210.17020MaRDI QIDQ6415581

Author name not available

Publication date: 30 October 2022

Abstract: Multilayer neural networks have achieved superhuman performance in many artificial intelligence applications. However, their black-box nature obscures the underlying mechanism for transforming input data into labels throughout all layers, thus hindering architecture design for new tasks and interpretation for high-stakes decision-making. We addressed this problem by introducing a precise law that governs how real-world deep neural networks separate data according to their class membership from the bottom layers to the top layers in classification problems. This law shows that each layer roughly improves a certain measure of data separation by an equal multiplicative factor. This law manifests in modern architectures such as AlexNet, VGGNet, and ResNet in the late phase of training. This law, together with the perspective of data separation, offers practical guidelines for designing network architectures, improving model robustness and out-of-sample performance during training, and interpreting deep learning predictions.
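The abstract refers to "a certain measure of data separation" evaluated layer by layer. As a hedged illustration (a simplified within-/between-class scatter ratio, not necessarily the authors' exact covariance-based definition), one can compute such a measure on the representations a layer produces; smaller values mean better-separated classes:

```python
import numpy as np

def separation_fuzziness(X, y):
    """Simplified class-separation measure: total within-class scatter
    divided by total between-class scatter (smaller = better separated).
    A stand-in for the paper's measure, for illustration only."""
    mu = X.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        within += ((Xc - mc) ** 2).sum()
        between += len(Xc) * ((mc - mu) ** 2).sum()
    return within / between

# Synthetic "layer outputs": the deeper layer pulls the two
# class means further apart, so its fuzziness should be lower.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
shallow = rng.normal(0.0, 1.0, (200, 8))
shallow[y == 1] += 0.5
deep = rng.normal(0.0, 1.0, (200, 8))
deep[y == 1] += 2.0

print(separation_fuzziness(shallow, y) > separation_fuzziness(deep, y))
```

Under the equi-separation law described in the abstract, plotting the logarithm of such a measure against layer index for a well-trained network would yield an approximately straight line, since each layer contributes an equal multiplicative improvement.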




Has companion code repository: https://github.com/hornhehhf/equi-separation








