SUPERVOICE is a text-independent speaker verification system that utilizes the ultrasound components of human speech. It can be integrated with any deep neural network (DNN) based speaker identification model. SUPERVOICE is highly accurate in the speaker recognition task, outperforming all existing speaker verification models. It can also differentiate between genuine human speech and replayed audio from attackers. With SUPERVOICE, we are exploring a new direction of human voice research by scrutinizing the unique characteristics of human speech in the ultrasound frequency band. Our study of human speech production indicates that the high-frequency ultrasound components (e.g., speech fricatives) from 20 to 48 kHz can significantly enhance the security and accuracy of speaker verification, so SUPERVOICE could significantly improve existing speaker verification systems. This website presents the dataset we used to evaluate SUPERVOICE.
SUPERVOICE was developed by the following team of academic researchers:
The SUPERVOICE dataset (Voice-1 and Voice-2) is the first voice dataset collected with recorders that use high sampling rates, namely 48 kHz and 192 kHz.
Voice-1 was collected with an Avisoft CM16 condenser microphone at a 192 kHz sampling rate. It contains voice data from 78 volunteers, totalling 7,800 utterances. Most of the 78 volunteers are college students; ages range from 18 to 56, and the group includes 38 males and 40 females. Voice-2 was constructed by recording 25 sentences from each of 50 participants using different smartphone models, yielding 1,250 utterances at a 48 kHz sampling rate.
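To make the two sampling rates concrete, here is a minimal sketch (not the SUPERVOICE pipeline) of how one might measure how much of a recording's energy lies in the 20 to 48 kHz band. The function names `nyquist_hz` and `band_energy_ratio` are hypothetical, and the signal is a synthetic two-tone stand-in rather than data from Voice-1 or Voice-2.

```python
import numpy as np

def nyquist_hz(sample_rate_hz):
    """Highest frequency a recording at this sampling rate can represent."""
    return sample_rate_hz / 2.0

def band_energy_ratio(signal, sample_rate_hz, low_hz, high_hz):
    """Fraction of total spectral energy that falls in [low_hz, high_hz]."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    total = spectrum.sum()
    return float(spectrum[in_band].sum() / total) if total > 0 else 0.0

# Synthetic stand-in for a 192 kHz utterance (assumption: NOT dataset audio):
# a 1 kHz audible tone plus a weaker 30 kHz ultrasound tone, 1 second long.
fs = 192_000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 30_000 * t)

ratio = band_energy_ratio(signal, fs, 20_000, 48_000)
print(f"Nyquist at 192 kHz: {nyquist_hz(192_000):.0f} Hz")
print(f"Nyquist at 48 kHz:  {nyquist_hz(48_000):.0f} Hz")
print(f"Energy share in 20-48 kHz: {ratio:.3f}")
```

By the Nyquist limit, a 192 kHz recording can represent frequencies up to 96 kHz and so covers the full 20 to 48 kHz band, while a 48 kHz recording reaches only 24 kHz and therefore captures just the lowest part of that band.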
The table below shows a comparison with other high-profile datasets.
77 speakers, 7,700 audio samples, 100 audio samples per speaker
The dataset includes four types of sentences.
We assign different sentences to different speakers to ensure that each sentence is read by different speakers.
Transcript: She had your dark suit in greasy wash water all year.
Transcript: Please identify me.
Transcript: Don't ask me to carry an oily rag like that.