Statistical learning on measures: an application to persistence diagrams

From MaRDI portal
Publication:6429641

arXiv: 2303.08456
MaRDI QID: Q6429641

Author name not available

Publication date: 15 March 2023

Abstract: We consider a binary supervised learning classification problem where, instead of having data in a finite-dimensional Euclidean space, we observe measures on a compact space $\mathcal{X}$. Formally, we observe data $D_N = (\mu_1, Y_1), \ldots, (\mu_N, Y_N)$, where $\mu_i$ is a measure on $\mathcal{X}$ and $Y_i$ is a label in $\{0, 1\}$. Given a set $\mathcal{F}$ of base classifiers on $\mathcal{X}$, we build corresponding classifiers in the space of measures. We provide upper and lower bounds on the Rademacher complexity of this new class of classifiers that can be expressed simply in terms of the corresponding quantities for the class $\mathcal{F}$. If the measures $\mu_i$ are uniform over a finite set, this classification task boils down to a multi-instance learning problem. However, our approach allows more flexibility and diversity in the input data we can deal with. While such a framework has many possible applications, this work places strong emphasis on classifying data via topological descriptors called persistence diagrams. These objects are discrete measures on $\mathbb{R}^2$, where the coordinates of each point correspond to the range of scales at which a topological feature exists. We present several classifiers on measures and show how they can, heuristically and theoretically, enable good classification performance in various settings in the case of persistence diagrams.
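To make the idea of lifting a base classifier from $\mathcal{X}$ to the space of measures concrete, here is a minimal Python sketch. It is only one plausible construction and is not taken from the abstract or the companion repository: it assumes the base classifier is the indicator of a ball in $\mathbb{R}^2$, and that a measure is labelled 1 when the mass it gives to that ball reaches a threshold. The names `ball_indicator`, `integrate`, and `measure_classifier`, as well as the toy diagrams, are hypothetical.

```python
import numpy as np


def ball_indicator(center, radius):
    """Hypothetical base classifier on R^2: 1 inside a closed ball, 0 outside."""
    center = np.asarray(center, dtype=float)

    def f(points):
        points = np.asarray(points, dtype=float).reshape(-1, 2)
        return (np.linalg.norm(points - center, axis=1) <= radius).astype(float)

    return f


def integrate(f, points, weights=None):
    """Integral of f against the discrete measure sum_j w_j * delta_{x_j}."""
    points = np.asarray(points, dtype=float).reshape(-1, 2)
    if weights is None:
        # Assumption: unit weight per point (counting measure on the diagram).
        weights = np.ones(len(points))
    return float(np.dot(weights, f(points)))


def measure_classifier(f, threshold):
    """Lift f to measures: label 1 when the integral of f reaches the threshold."""

    def classify(points, weights=None):
        return int(integrate(f, points, weights) >= threshold)

    return classify


# Toy persistence diagrams, each row a (birth, death) pair (made-up values).
diagram_a = np.array([[0.1, 0.9], [0.2, 0.8], [0.5, 0.55]])
diagram_b = np.array([[0.5, 0.52], [0.3, 0.33]])

base = ball_indicator(center=(0.15, 0.85), radius=0.2)
clf = measure_classifier(base, threshold=2.0)

label_a = clf(diagram_a)  # two points of diagram_a fall inside the ball
label_b = clf(diagram_b)  # no point of diagram_b falls inside the ball
```

In this sketch, passing normalized `weights` instead of unit weights would correspond to the uniform-measure setting that the abstract relates to multi-instance learning.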




Has companion code repository: https://github.com/olympioh/bba_measures_classification








