Paper Title
FEOSA: Fractional Ebola Optimization Search Algorithm based Speaker Segmentation and Speaker Diarization
Abstract
A speaker diarization model identifies the speaker-homogeneous regions in a set of audio recordings that contain
multiple speakers. It answers the question “who spoke when”. Speaker diarization databases generally include
meetings, reality shows, news, and recordings of various speakers. Speaker diarization has conventionally relied on
the clustering of speaker embeddings. However, clustering-based approaches are limited in minimizing diarization
errors directly and in handling speaker overlaps accurately. To counter these limitations, an effective model is
constructed for speaker segmentation and speaker diarization using the Fractional Ebola Optimization Search
Algorithm (FEOSA). First, spectral features are extracted, and speaker activity identification is then performed to
separate speech signals from non-speech signals. Thereafter, speaker segmentation is performed based on speaker
change detection, in which the constant thresholds are computed using the proposed FEOSA. Finally, speaker
diarization is conducted using the entropy weighted power k-means algorithm, where the weights are updated using the
same proposed FEOSA, which is developed by incorporating the Fractional Calculus (FC) concept into the Ebola
Optimization Search Algorithm (EOSA). The proposed FEOSA delivered a maximum testing accuracy of 0.913, a minimum
diarization error of 0.566, an FDR of 0.257, an FNR of 0.128, and an FPR of 0.104.
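The abstract does not define FDR, FNR, and FPR. A minimal sketch follows, assuming they are the standard confusion-matrix rates (false discovery rate, false negative rate, false positive rate) computed from per-frame decisions. The function names and the counts are hypothetical and are not taken from the paper.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_rates(tp, fp, tn, fn):
    """Compute accuracy, FDR, FNR and FPR from confusion-matrix counts."""
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "fdr": safe_ratio(fp, fp + tp),  # false discoveries among positive predictions
        "fnr": safe_ratio(fn, fn + tp),  # missed positives among actual positives
        "fpr": safe_ratio(fp, fp + tn),  # false alarms among actual negatives
    }


# Hypothetical frame counts, chosen only to exercise the function.
rates = detection_rates(tp=80, fp=20, tn=90, fn=10)
```

Under these assumptions, lower FDR, FNR, and FPR and higher accuracy indicate better performance, which matches how the abstract reports them.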
Keywords - Speech Signal, Speech Diarization, Fractional Calculus (FC), Deep Learning, Ebola Optimization Search
Algorithm (EOSA).
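The abstract does not give the FEOSA update equation. One common way to bring Fractional Calculus into an iterative optimizer is a truncated Grünwald-Letnikov expansion, which makes the next solution depend on a weighted memory of earlier solutions. The sketch below illustrates that general idea only. The function names, the fractional order `alpha`, the four-term truncation, and `step` (a stand-in for whatever displacement EOSA proposes) are all assumptions, not the authors' formulation.

```python
def gl_memory_weights(alpha, terms=4):
    """Weights on x(t), x(t-1), ... from a truncated Grunwald-Letnikov expansion."""
    weights = []
    coeff = 1.0  # c_0 = 1, with c_k = (-1)^k * binom(alpha, k)
    for k in range(1, terms + 1):
        coeff *= 1.0 - (alpha + 1.0) / k
        weights.append(-coeff)  # move the memory term to the right-hand side
    return weights


def fractional_update(history, step, alpha=0.5):
    """Return the next position from past positions plus a proposed step.

    history: list of past positions, most recent last (each a list of floats).
    step: displacement proposed by the base optimizer (hypothetical stand-in).
    """
    weights = gl_memory_weights(alpha, terms=min(4, len(history)))
    new_position = list(step)
    for k, weight in enumerate(weights, start=1):
        past = history[-k]
        for d in range(len(new_position)):
            new_position[d] += weight * past[d]
    return new_position
```

With `alpha = 1.0` the weights reduce to `[1, 0, 0, 0]`, so the update falls back to the ordinary rule of current position plus step. With `0 < alpha < 1`, earlier positions contribute with decaying weights.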