We study the capabilities of speech processing systems trained simply to...
Due to privacy concerns, multi-party gradient tree boosting algorithms h...
Text embeddings are useful features in many applications such as semanti...
Large pre-trained models such as CLIP offer consistent accuracy across a...
Recently, there have been breakthroughs in computer vision ("CV") models...
State-of-the-art computer vision systems are trained to predict a fixed ...
Multi-party machine learning is a paradigm in which multiple participant...
We introduce Jukebox, a model that generates music with singing in the r...
A point-of-interest (POI) recommendation system plays an important role ...
Automatic music transcription is considered to be one of the hardest pro...
The recent success of raw audio waveform synthesis models like WaveNet m...
The task of estimating the fundamental frequency of a monophonic sound r...