
-
Fair Attribute Classification through Latent Space De-biasing
Fairness in visual recognition is becoming a prominent and critical topi...
-
Point and Ask: Incorporating Pointing into Visual Question Answering
Visual Question Answering (VQA) has become one of the key benchmarks of ...
-
Towards Unique and Informative Captioning of Images
Despite considerable progress, state of the art image captioning models ...
-
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
The ability to perform effective planning is crucial for building an ins...
-
ViBE: A Tool for Measuring and Mitigating Bias in Image Datasets
Machine learning models are known to perpetuate the biases present in th...
-
Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation
In the Vision-and-Language Navigation (VLN) task, an agent with egocentr...
-
Towards Fairer Datasets: Filtering and Balancing the Distribution of the People Subtree in the ImageNet Hierarchy
Computer vision technology is being used by many but remains representat...
-
Compositional Temporal Visual Grounding of Natural Language Event Descriptions
Temporal grounding entails establishing a correspondence between natural...
-
Towards Fairness in Visual Recognition: Effective Strategies for Bias Mitigation
Computer vision models learn to perform a task by capturing relevant sta...
-
Human uncertainty makes classification more robust
The classification performance of deep neural networks has begun to asym...
-
SpatialSense: An Adversarially Crowdsourced Benchmark for Spatial Relation Recognition
Understanding the spatial relations between objects in images is a surpr...
-
CornerNet-Lite: Efficient Keypoint Based Object Detection
Keypoint-based methods are a relatively new paradigm in object detection...
-
What Actions are Needed for Understanding Human Actions in Videos?
What is the right way to reason about human activities? What directions ...
-
Learning to Learn from Noisy Web Videos
Understanding the simultaneously very diverse and intricately fine-grain...
-
What's in a Question: Using Visual Questions as a Form of Supervision
Collecting fully annotated image datasets is challenging and expensive. ...
-
Predictive-Corrective Networks for Action Detection
While deep feature learning has revolutionized techniques for static-ima...
-
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data...
-
Much Ado About Time: Exhaustive Annotation of Temporal Data
Large-scale annotated datasets allow AI systems to learn from and build ...
-
End-to-end Learning of Action Detection from Frame Glimpses in Videos
In this work we introduce a fully end-to-end approach for action detecti...
-
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
Every moment counts in action recognition. A comprehensive understanding...
-
What's the Point: Semantic Segmentation with Point Supervision
The semantic image segmentation task presents a trade-off between test t...
-
Joint calibration of Ensemble of Exemplar SVMs
We present a method for calibrating the Ensemble of Exemplar SVMs model....
-
ImageNet Large Scale Visual Recognition Challenge
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in ...