Karan Sikka

research

∙ 09/08/2023

Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models

Vision-language models (VLMs) have recently demonstrated strong efficacy...

0 Yangyi Chen, et al. ∙

research

∙ 06/04/2023

Predicting Information Pathways Across Online Communities

The problem of community-level information pathway prediction (CLIPP) ai...

0 Yiqiao Jin, et al. ∙

research

∙ 12/14/2021

Dual-Key Multimodal Backdoors for Visual Question Answering

The success of deep learning has enabled advances in multimodal tasks th...

2 Matthew Walmer, et al. ∙

research

∙ 10/22/2021

Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To Benchmark

We focus on Multimodal Machine Reading Comprehension (M3C) where a model...

20 Pritish Sahu, et al. ∙

research

∙ 04/20/2021

Towards Solving Multimodal Comprehension

This paper targets the problem of procedural multimodal machine comprehe...

16 Pritish Sahu, et al. ∙

research

∙ 03/29/2021

Online Defense of Trojaned Models using Misattributions

This paper proposes a new approach to detecting neural Trojans on Deep N...

15 Panagiota Kiourti, et al. ∙

research

∙ 12/03/2020

Detecting Trojaned DNNs Using Counterfactual Attributions

We target the problem of detecting Trojans or backdoors in DNNs. Such mo...

7 Karan Sikka, et al. ∙

research

∙ 11/21/2020

Zero-Shot Learning with Knowledge Enhanced Visual Semantic Embeddings

We improve zero-shot learning (ZSL) by incorporating common-sense knowle...

4 Karan Sikka, et al. ∙

research

∙ 09/12/2020

RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization

We study an important, yet largely unexplored problem of large-scale cro...

13 Niluthpol Chowdhury Mithun, et al. ∙

research

∙ 03/16/2020

Deep Adaptive Semantic Logic (DASL): Compiling Declarative Knowledge into Deep Neural Networks

We introduce Deep Adaptive Semantic Logic (DASL), a novel framework for ...

10 Karan Sikka, et al. ∙

research

∙ 09/10/2019

Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation

While models for Visual Question Answering (VQA) have steadily improved ...

0 Arijit Ray, et al. ∙

research

∙ 07/14/2019

FoodX-251: A Dataset for Fine-grained Food Classification

Food classification is a challenging problem due to the large number of ...

0 Parneet Kaur, et al. ∙

research

∙ 05/17/2019

Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks

There has been an explosion of multimodal content generated on social me...

0 Karan Sikka, et al. ∙

research

∙ 04/19/2019

Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts

Computing author intent from multimodal data like Instagram posts requir...

18 Julia Kruk, et al. ∙

research

∙ 03/27/2019

Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment

We address the problem of grounding free-form textual phrases by using w...

0 Samyak Datta, et al. ∙

research

∙ 12/08/2018

Semantically-Aware Attentive Neural Embeddings for Image-based Visual Localization

We present a novel method for fusing appearance and semantic information...

0 Zachary Seymour, et al. ∙

research

∙ 07/04/2018

Understanding Visual Ads by Aligning Symbols and Objects using Co-Attention

We tackle the problem of understanding visual ads where given an ad imag...

0 Karuna Ahuja, et al. ∙

research

∙ 04/12/2018

Zero-Shot Object Detection

We introduce and tackle the problem of zero-shot object detection (ZSD),...

0 Ankan Bansal, et al. ∙

research

∙ 12/23/2017

Combining Weakly and Webly Supervised Learning for Classifying Food Images

Food classification from images is a fine-grained classification problem...

0 Parneet Kaur, et al. ∙

research

∙ 11/24/2016

AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos

We propose a novel method for temporally pooling frames in a video for t...

0 Amlan Kar, et al. ∙

research

∙ 08/08/2016

Discriminatively Trained Latent Ordinal Model for Video Classification

We study the problem of video classification for facial analysis and hum...

0 Karan Sikka, et al. ∙

research

∙ 04/06/2016

LOMo: Latent Ordinal Model for Facial Analysis in Videos

We study the problem of facial analysis in videos. We propose a novel we...

0 Karan Sikka, et al. ∙

research

∙ 12/17/2015

Deep Active Object Recognition by Joint Label and Action Prediction

An active object recognition system has the advantage of being able to a...

0 Mohsen Malmir, et al. ∙

research

∙ 10/24/2013

Pseudo vs. True Defect Classification in Printed Circuits Boards using Wavelet Features

In recent years, Printed Circuit Boards (PCB) have become the backbone o...

0 Sahil Sikka, et al. ∙
