Speaker Recognition Research Papers

We present a method for speaker recognition that uses the duration patterns of speech units to aid speaker classification. The approach represents each word and/or phone by a feature vector comprised of either the durations of the... more

This paper applies deep neural networks to the classification of children with voice impairments from speech signals. In the analysis of speech signals, 6,373 static acoustic features are extracted from many kinds of... more

- by Thomas Huang
- 23
  Information Systems, Artificial Intelligence, Acoustics, Graph Theory

Most monitoring systems for elderly people include the detection of abnormal situations, in particular distress situations, as one of their main goals. To reach this objective, many solutions end up combining several modalities such... more

The paper describes a multisensorial person identification system: visual and acoustic cues are used jointly for person identification. A simple approach, based on the fusion of the lists of scores produced independently by a speaker... more

Recently satisfactory results have been obtained in NIST speaker recognition evaluations. These results are mainly due to accurate modeling of a very large development dataset provided by LDC. However, for many realistic scenarios the use... more

- by Shlomo Dubnov
- 17
  Engineering, Harmonic Analysis, Speaker Recognition, Speech Processing

Automatic verification of a person's identity from their voice is part of modern telecommunication services. To execute a verification task, a speech signal has to be transmitted to a remote server. So, a performance of the... more

In this paper we present an adapted UBM-GMM based privacy preserving speaker verification (PPSV) system, where the system is not able to observe the speech data provided by the user and the user does not observe the models trained by the... more

- by José Portêlo and +1
  I. Trancoso
- 11
  Approximation Theory, Privacy, Speaker Recognition, Support Vector Machines

This paper describes a new identity authentication technique by a synergetic use of lip-motion and speech. The lip-motion is defined as the distribution of apparent velocities in the movement of brightness patterns in an image and is... more

One characteristic that distinguishes speaker recognition (identification, verification, classification, tracking, etc.) from other biometrics is that it is designed to operate with devices and over channels that were created for other... more

In the meeting scenario, audio is often recorded using Multiple Distance Microphones (MDM) in a non-intrusive manner. Typically, beamforming is performed to obtain a single enhanced signal from the multiple channels. This... more

The project focuses on developing a speaker recognition system for executing voice commands. The system was implemented in Python and handles recognition of both the spoken language and the speaker. For... more

This article deals with a voice forgery technique that uses the ALISP (Automatic Language Independent Speech Processing) approach. Such a technique makes it possible to transform the voice of an arbitrary person (the impostor), forging the identity... more

- by Patrick Perrot and +1
  Gérard Chollet
- 6
  Speaker Recognition, Automatic Speaker Recognition, Speech Processing, Speaker Verification

Speaker recognition is the computing task of validating a user's claimed identity using characteristics extracted from their voice. Voice recognition is a combination of the two, where it uses learned aspects of a speaker’s voice to... more

Identification of non-native personnel is a critical piece of information for making crucial on-the-spot decisions for security purposes. Identification of a non-native speaker is often readily apparent in normal conversation with a... more

Device, language and environmental mismatch adversely affect speaker verification (SV) performance. We investigate such effects empirically based on the M3 (multibiometric, multilingual and multi-device) Corpus [1]. Device mismatch (among... more

- by Man-wai Mak
- 5
  Speaker Recognition, English, Biometrics, Speaker Verification

An audio-assisted system is investigated that detects whether a movie scene is a dialogue or not. The system is based on actor indicator functions, that is, functions that indicate whether an actor speaks at a given time instant. In particular,... more

Implementation of an Automatic Algorithm for Syllabic Division in Portuguese Language. A new algorithm for automatic syllabic splitting in the Portuguese language is proposed, which is based on the envelope of the speech signal of an... more

- by Lizandra Fernandes and +1
  h.m. de oliveira
- 6
  Speaker Recognition, Audio Engineering, Audio Signal Processing, Speech Processing

The availability of databases is a necessity in the speech processing field. Few publicly available databases exist for the Arabic language. In this paper we describe a rich database for the Arabic language. The database is rich in many... more

- by rahul kala
- 19
  Machine Learning, Speaker Recognition, Fuzzy Logic, Face Recognition

Feeling of knowing (or expressed confidence) reflects a speaker's certainty or commitment to a statement and can be associated with one's trustworthiness or persuasiveness in social interaction. We investigated the perceptual-acoustic... more

Pre-processing of Speech Signal serves various purposes in any speech processing application. It includes Noise Removal, Endpoint Detection, Pre-emphasis, Framing, Windowing, Echo Canceling etc. Out of these, silence/unvoiced portion... more
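Three of the steps named in this abstract (pre-emphasis, framing, windowing) can be illustrated with a minimal sketch. This is not the paper's implementation, since the abstract is truncated. The function names, the pre-emphasis coefficient of 0.97, the Hamming window, and the frame and hop sizes (400 and 160 samples, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate) are all illustrative assumptions.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(n):
    """Return an n-point symmetric Hamming window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def preprocess(signal, frame_len=400, hop=160, alpha=0.97):
    """Pre-emphasize, frame, and window a signal (assumed parameter values)."""
    emphasized = pre_emphasis(signal, alpha)
    window = hamming(frame_len)
    return [
        [sample * weight for sample, weight in zip(frame, window)]
        for frame in frame_signal(emphasized, frame_len, hop)
    ]


if __name__ == "__main__":
    # One second of a 200 Hz tone sampled at an assumed 16 kHz.
    tone = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(16000)]
    frames = preprocess(tone)
    print(len(frames), "frames of", len(frames[0]), "samples")
```

Silence removal and endpoint detection, which the abstract also lists, would typically operate on the frames produced here, for example by thresholding per-frame energy. That step is left out because the abstract does not say which method is used.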

Making no claim of being exhaustive, a review of the most popular MFCC (Mel Frequency Cepstral Coefficients) implementations is made. These differ mainly in the particular approximation of nonlinear human pitch perception, the... more
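For illustration, one widely used approximation of the mel scale is m = 2595 log10(1 + f/700). The sketch below uses it to place filter-bank edge frequencies evenly on the mel axis. Whether this particular formula is among those the paper reviews cannot be confirmed from the truncated abstract, and the function names and the choice of band limits are assumptions made here.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels using m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Invert hz_to_mel: f = 700 * (10 ** (m / 2595) - 1)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_edge_frequencies(n_filters, low_hz, high_hz):
    """Return n_filters + 2 edge frequencies in Hz, equally spaced in mels.

    Filter k of a triangular filter bank would span edges k, k + 1, k + 2.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(n_filters + 2)]


if __name__ == "__main__":
    # Edge frequencies for a hypothetical 10-filter bank over 0 to 8000 Hz.
    for edge in mel_edge_frequencies(10, 0.0, 8000.0):
        print(round(edge, 1))
```

Because the spacing is uniform in mels, the edges crowd together at low frequencies and spread out at high frequencies, which is the nonlinear behavior the different approximations try to capture.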

- by Nikos Fakotakis and +1
  Todor D Ganchev
- 3
  Speaker Recognition, Speech Processing, Feature Extraction

Reliable identity management must be built with an accurate user identity recognition method. This recognition usually is the core of the authentication method which is the essential part of any identity management system. The... more

"It's me!" This pronouncement is usually made over the telephone or at an entryway out of sight of the intended hearer. It embodies the expectation that the sound of one's voice is sufficient for the hearer to recognize the speaker. In... more

Recent studies show that Gaussian mixture model (GMM) weights carry less, yet complementary, information to GMM means for language and dialect recognition. However, state-of-the-art language recognition systems usually do not use this... more

In this study, we present a binaural scene analyzer that is able to simultaneously localize, detect and identify a known number of target speakers in the presence of spatially positioned noise sources and reverberation. In contrast to... more

Speaker recognition is the process of recognizing a speaker’s identity by his or her voice. Humans sound different, and there are features in our speaking voices that differentiate us from other people. In this paper, we show an... more

This paper presents the feature analysis and design of compensators for speaker recognition under stressed speech conditions. Any condition that causes a speaker to vary his or her speech production from normal or neutral condition is... more

Conventional multimodal biometric identification systems tend to have a larger memory footprint, slower processing speeds, and higher implementation and operational costs. In this paper we propose a state-of-the-art framework for multimodal... more

- by Réda Adjoudj and +1
  Youssef Elmir
- 12
  Speaker Recognition, Support Vector Machines, Databases, Speech

A standard speaker recognition system employs a pre-processed form of an acoustic signal, which provides information about the distribution of signal energy across time and frequency. However, different signal representations may be... more

- by Duangkaew Sawamiphakdi
- 7
  Engineering, Speaker Recognition, MFCC, Multilayer Perceptron

Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This technique makes it possible to use the speaker's voice to verify their identity and... more

- by sindu anju
- Speaker Recognition

SPEAKER RECOGNITION SYSTEM (SRS)

SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the research and development of neural speech processing technologies by being simple, flexible, user-friendly, and well-documented. This paper... more

- by Mirco Ravanelli and +3
  SUNG-LIN YEH
  François Grondin
  R. De-mori
- 10
  Machine Learning, Signal Processing, Speaker Recognition, Audio Signal Processing

Dedicated to my parents who sacrificed their today for our better tomorrow and my mentors who have guided me throughout my research and helped me to improve professionally and personally. ACKNOWLEDGMENTS I would like to gratefully... more

A simple yet complex approach to modern sophistication. In this project we used the MFCC approach to build an accurate coefficient extraction processor that extracts features from the voices stored in the database, then on the next... more

The paper discusses Derrida's concept of hospitality, which perfectly describes the experience of losing the sense of feeling at home and reveals the disintegrating entrance of the Otherness into a coherent home space. Jacques Derrida's... more

- by Andrzej Marzec
- 253
  Critical Theory, Languages, Biochemistry, Bioinformatics

ABSTRACT:
This study has two aims: to define the vowel system of standard Turkish using a wider database, in comparison with previous similar work, and to determine the degree of speaker-specific invariance.
Previous studies that determine the vowel system of standard Turkish from fundamental frequency and formants focus mainly on male speakers, use approximately 10 to 15 speakers, and generally analyse only the first three formants. In the current study, the fundamental frequency and the first four formants of speech from 40 male and 40 female speakers were analysed. Minimum and maximum values, averages and standard deviations were calculated and compared with the previous studies. In addition, a vowel quadrilateral, which shows the approximate positions of vowels inside the mouth during an utterance, was produced from these calculations for male and female speakers in an F1-F2 scatter plot.
To determine which variables are speaker specific and to what extent, the study used three kinds of variables: the frequency values, the skewness/kurtosis values calculated from the statistical distribution of these frequency values, and the ratios of energy levels at different frequency levels during the utterance of a vowel. Of all these variables, only the frequency values taken as a whole give statistically significant results and carry speaker-specific information. The study also shows which of these frequency values are speaker specific, to what extent, and in what types of situations.

SUMMARY:
This study has two aims: to determine the vowel system of standard Turkish using a larger database, in comparison with earlier studies, and to establish the degree of speaker-specific invariance.
Earlier studies on standard Turkish that determine vowels from fundamental frequency and formant frequency values concentrate mostly on male speakers, use around 10 to 15 speakers, and generally cover only the first three formant frequencies. In this study, recordings from 40 male and 40 female speakers were examined, covering the fundamental frequency and the first four formant frequencies. Minimum and maximum values, means and standard deviations were calculated for the vowels and compared with the values in other studies. In addition, the vowel quadrilateral, which roughly shows where vowels are articulated inside the mouth, was plotted from these measurements for male and female speakers.
To determine which variables are speaker specific and to what extent, the variables used were the frequency values, the skewness/kurtosis values calculated from the statistical distribution of those values, and the ratios between the energies at different frequency levels during the articulation of the vowels. Of all these variables, only the frequency values taken as a whole were found to give statistically significant results and to carry speaker-specific information. The study concludes by showing which of these frequency values are speaker specific, in which situations, and to what degree.

Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we... more

A wide variety of systems require reliable personal recognition schemes to either confirm or determine the identity of an individual requesting their services. The purpose of such schemes is to ensure that the rendered services are... more

- by Anil Jain
- 7
  Speaker Recognition, Face Recognition, Gesture Recognition, Data Privacy

This paper aims to situate Forensic Linguistics within the variegated field of the forensic sciences. After a deep and meticulous description of the state of the art of Forensic Linguistics from the 1960s until now, we propose all of the... more

Supervisor of the Education Program in Filipino and Mother Tongue Based-Multi Lingual Education of the Department of Education, City of Manila. DepEd Concept Paper Writer, Translator, Editor and National Lead Trainer. International Book... more

Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure. Some... more

- by Mirco Ravanelli and +1
  Santi Pdp
- 8
  Speaker Recognition, Automatic Speech Recognition, Speech Recognition, Unsupervised Learning Techniques

The QUT-NOISE-SRE protocol is designed to mix the large QUT-NOISE database, consisting of over 10 hours of background noise, collected across 10 unique locations covering 5 common noise scenarios, with commonly used speaker recognition... more

- by Ahilan Kanagasundaram and +1
  Md Hafizur Rahman
- 7
  Speaker Recognition, Speaker Verification, Speaker Identification, Noise

Tracking speakers in multiparty conversations constitutes a fundamental task for automatic meeting analysis. In this paper, we present a novel probabilistic approach to jointly track the location and speaking activity of multiple speakers... more
