Reducing the quantity of annotations required for supervised training is...
Reducing the quantity of annotations required for supervised training is...
We combine recent advancements in end-to-end speech recognition to
non-a...
Image captioning models generally lack the capability to take into accou...