SuperVoice: Text-Independent Speaker Verification Using Ultrasound Energy in Human Speech

05/28/2022
by Hanqing Guo, et al.

Voice-activated systems are integrated into a variety of desktop, mobile, and Internet-of-Things (IoT) devices. However, voice spoofing attacks, such as impersonation and replay attacks, in which malicious attackers synthesize the voice of a victim or simply replay it, have raised growing security concerns. Existing speaker verification techniques distinguish individual speakers via spectrographic features extracted from the audible frequency range of voice commands. However, they often have high error rates and/or long delays. In this paper, we explore a new direction of human voice research by scrutinizing the unique characteristics of human speech in the ultrasound frequency band. Our research indicates that the high-frequency ultrasound components (e.g., speech fricatives) from 20 to 48 kHz can significantly enhance the security and accuracy of speaker verification. We propose a speaker verification system, SUPERVOICE, that uses a two-stream DNN architecture with a feature fusion mechanism to generate distinctive speaker models. To test the system, we create a speech dataset with 12 hours of audio (8,950 voice samples) from 127 participants. In addition, we create a second, spoofed voice dataset to evaluate the system's security. To balance controlled recordings against real-world applications, the audio recordings are collected in two quiet rooms by 8 different recording devices, including 7 smartphones and an ultrasound microphone. Our evaluation shows that SUPERVOICE achieves a 0.58% equal error rate in the speaker verification task and takes only 120 ms to test an incoming utterance, outperforming all existing speaker verification systems. Moreover, within 91 ms of processing time, SUPERVOICE achieves a 0% equal error rate in detecting replay attacks launched by 5 different loudspeakers.
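To make the 20 to 48 kHz band concrete, the following is a minimal sketch of how one might measure the share of a signal's energy that falls in that range. It is not the feature extractor used by SUPERVOICE. The function names (`dft_power`, `band_energy_ratio`, `make_tone_mix`), the 192 kHz sample rate, and the synthetic two-tone test signal are all assumptions made for illustration; the only constraint taken from the physics is that the sample rate must exceed 96 kHz for content up to 48 kHz to be representable.

```python
import cmath
import math


def dft_power(samples):
    """Return the one-sided power spectrum |X[k]|^2 for k = 0 .. n/2.

    Uses a naive O(n^2) DFT so the sketch needs only the standard library;
    a real pipeline would use an FFT.
    """
    n = len(samples)
    # Precompute the n complex roots of unity once.
    twiddle = [cmath.exp(-2j * math.pi * i / n) for i in range(n)]
    powers = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * twiddle[(k * t) % n]
        powers.append(abs(acc) ** 2)
    return powers


def band_energy_ratio(samples, sample_rate, low_hz=20_000.0, high_hz=48_000.0):
    """Fraction of spectral energy between low_hz and high_hz (inclusive)."""
    n = len(samples)
    powers = dft_power(samples)
    total = sum(powers)
    if total == 0.0:
        return 0.0
    band = 0.0
    for k, p in enumerate(powers):
        freq = k * sample_rate / n  # center frequency of bin k
        if low_hz <= freq <= high_hz:
            band += p
    return band / total


def make_tone_mix(sample_rate, n, tones):
    """Hypothetical test signal: a sum of sines given as (frequency_hz, amplitude)."""
    return [
        sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in tones)
        for t in range(n)
    ]


if __name__ == "__main__":
    rate = 192_000  # assumed sample rate, above the 96 kHz minimum
    n = 960         # 5 ms window, 200 Hz bin spacing
    # One audible tone (1 kHz) plus a weaker ultrasound tone (30 kHz).
    signal = make_tone_mix(rate, n, [(1_000.0, 1.0), (30_000.0, 0.5)])
    print(round(band_energy_ratio(signal, rate), 3))
```

Because energy scales with the square of amplitude, the 30 kHz tone at half amplitude carries one fifth of the total energy in this synthetic signal. A recording captured at a typical 44.1 or 48 kHz sample rate cannot contain this band at all, which is why the abstract's data collection includes an ultrasound microphone.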


research
09/03/2019

Voice Spoofing Detection Corpus for Single and Multi-order Audio Replays

The evolution of modern voice controlled devices (VCDs) in recent years ...
research
07/12/2022

NEC: Speaker Selective Cancellation via Neural Enhanced Ultrasound Shadowing

In this paper, we propose NEC (Neural Enhanced Cancellation), a defense ...
research
04/06/2019

ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems

This paper introduces a new database of voice recordings with the goal o...
research
06/02/2021

A Continuous Liveness Detection System for Text-independent Speaker Verification

Voice authentication is drawing increasing attention and becomes an attr...
research
01/31/2019

Discriminate natural versus loudspeaker emitted speech

In this work, we address a novel, but potentially emerging, problem of d...
research
08/09/2020

Agricultural Knowledge Management Using Smart Voice Messaging Systems: Combination of Physical and Human Sensors

The use of the Internet of Things (IoT) in agricultural knowledge manage...
research
12/24/2021

SoK: A Study of the Security on Voice Processing Systems

As the use of Voice Processing Systems (VPS) continues to become more pr...
