Audio classification in speech and music: A comparison between a statistical and a neural approach - MaRDI portal

Audio classification in speech and music: A comparison between a statistical and a neural approach (Q1607684)

From MaRDI portal





scientific article; zbMATH DE number 1779598

    Statements

    Audio classification in speech and music: A comparison between a statistical and a neural approach (English)
    14 October 2002
    Summary: We focus on the problem of audio classification into speech and music for multimedia applications. In particular, we compare two techniques for speech/music discrimination. The first method is based on the zero crossing rate and Bayesian classification. It is computationally very simple and gives good results for pure music or pure speech. However, the simulation results show some performance degradation when a music segment also contains superimposed speech or strong rhythmic components. To overcome these problems, we propose a second method that uses more features and is based on neural networks (specifically, a multi-layer perceptron). This method performs better, at the expense of a limited increase in computational complexity. In practice, the proposed neural network is simple to implement if a suitable polynomial is used as the activation function, and a real-time implementation is possible even on low-cost embedded systems.
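    The first method named in the summary combines a zero crossing rate feature with Bayesian classification. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' method. The frame layout, the single scalar feature, the Gaussian class model, and all function and parameter names are assumptions made for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len):
    """Zero crossing rate of each non-overlapping frame of frame_len samples."""
    return [
        zero_crossing_rate(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def gaussian_log_likelihood(x, mean, var):
    """Log density of a 1-D Gaussian (assumed class-conditional model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def classify(x, class_params, priors):
    """Return the class label with the largest log posterior for feature x.

    class_params maps label -> (mean, variance); priors maps label -> P(label).
    Both are hypothetical inputs that would be estimated from training data.
    """
    return max(
        class_params,
        key=lambda c: math.log(priors[c])
        + gaussian_log_likelihood(x, *class_params[c]),
    )
```

    In use, a statistic of the per-frame values returned by frame_zcr (for example their mean over a segment) would be passed to classify together with class parameters estimated from labelled speech and music recordings.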
    audio classification
    speech/music discrimination
    zero crossing rate
    Bayesian classification
    neural networks

    Identifiers