
-
The implications of outcome truncation in reproductive medicine RCTs: a simulation platform for trialists and simulation study
Randomised controlled trials in reproductive medicine are often subject ...
-
PERF-Net: Pose Empowered RGB-Flow Net
In recent years, many works in the video action recognition literature h...
-
Utterance-level Intent Recognition from Keywords
This paper focuses on wake on intent (WOI) techniques for platforms with...
-
Compact Speaker Embedding: lrx-vector
Deep neural networks (DNN) have recently been widely used in speaker rec...
-
RetinaTrack: Online Single Stage Joint Detection and Tracking
Traditionally multi-object tracking and object detection are performed u...
-
Exploring Context, Attention and Audio Features for Audio Visual Scene-Aware Dialog
We are witnessing a confluence of vision, speech and dialog system techn...
-
Leveraging Topics and Audio Features with Multimodal Attention for Audio Visual Scene-Aware Dialog
With the recent advancements in Artificial Intelligence (AI), Intelligen...
-
Long Term Temporal Context for Per-Camera Object Detection
In static monitoring cameras, useful contextual information can stretch ...
-
Structural sparsification for Far-field Speaker Recognition with GNA
Recently, deep neural networks (DNN) have been widely used in speaker re...
-
Context, Attention and Audio Feature Explorations for Audio Visual Scene-Aware Dialog
With the recent advancements in AI, Intelligent Virtual Assistants (IVA)...
-
Uncertainty aware multimodal activity recognition with Bayesian inference
Deep neural networks (DNNs) provide state-of-the-art results for a multi...
-
Multimodal Relational Tensor Network for Sentiment and Emotion Classification
Understanding Affect from video segments has brought researchers from th...
-
Learning to Segment via Cut-and-Paste
This paper presents a weakly-supervised approach to object instance segm...
-
Rethinking Spatiotemporal Feature Learning For Video Understanding
In this paper we study 3D convolutional networks for video understanding...
-
Progressive Neural Architecture Search
We propose a method for learning CNN structures that is more efficient t...
-
Generative Models of Visually Grounded Imagination
It is easy for people to imagine what a man with pink hair looks like, e...
-
Motion Prediction Under Multimodality with Conditional Stochastic Networks
Given a visual history, multiple future outcomes for a video scene are e...
-
Spatially Adaptive Computation Time for Residual Networks
This paper proposes a deep learning architecture based on Residual Netwo...
-
Speed/accuracy trade-offs for modern convolutional object detectors
The goal of this paper is to serve as a guide for selecting a detection ...
-
Detecting events and key actors in multi-person videos
Multi-person event recognition is a challenging task, often with many pe...
-
Generation and Comprehension of Unambiguous Object Descriptions
We propose a method that can generate an unambiguous description (known ...
-
Deep Knowledge Tracing
Knowledge tracing---where a machine models the knowledge of a student as...
-
Learning Program Embeddings to Propagate Feedback on Student Code
Providing feedback, both assessing final work and giving hints to stuck ...
-
What's Cookin'? Interpreting Cooking Videos using Text, Speech and Vision
We present a novel method for aligning a sequence of instructions to a v...
-
Tuned Models of Peer Assessment in MOOCs
In massive open online courses (MOOCs), peer grading serves as a critica...
-
Efficient Probabilistic Inference with Partial Ranking Queries
Distributions over rankings are used to model data in various settings s...
-
Uncovering the Riffled Independence Structure of Rankings
Representing distributions over permutations can be a daunting task due ...