Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis

07/15/2022
by Shounak Naik, et al.

Representational Similarity Analysis is a method from cognitive neuroscience for comparing representations drawn from two different sources of data. In this paper, we propose using Representational Similarity Analysis to probe semantic grounding in language models of code. We probe representations from the CodeBERT model for semantic grounding using data from the IBM CodeNet dataset. Through our experiments, we show that current pre-training methods do not induce semantic grounding in language models of code and instead optimize for form-based patterns. We also show that even a small amount of fine-tuning on semantically relevant tasks increases semantic grounding in CodeBERT significantly. Our ablations over the input modality to the CodeBERT model show that bimodal inputs (code and natural language) yield better semantic grounding and sample efficiency during semantic fine-tuning than unimodal inputs (code only). Finally, our experiments with semantic perturbations in code reveal that CodeBERT can robustly distinguish between semantically correct and incorrect code.
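The core of RSA is a second-order comparison: for each representation space, build a representational dissimilarity matrix (RDM) of pairwise distances between stimuli, then correlate the two RDMs. The sketch below illustrates this idea in a minimal form; the variable names, distance metric, and toy data are illustrative assumptions, not the paper's exact setup with CodeBERT or CodeNet.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(representations):
    """Representational dissimilarity matrix over stimuli, returned in
    condensed (upper-triangle) form; one distance per stimulus pair."""
    return pdist(representations, metric="correlation")

def rsa_similarity(reps_a, reps_b):
    """RSA score: Spearman-correlate the RDMs of two representation
    spaces computed over the same ordered set of stimuli."""
    rho, _ = spearmanr(rdm(reps_a), rdm(reps_b))
    return rho

# Toy example: 10 "programs" embedded in two hypothetical spaces,
# e.g. model embeddings vs. a reference semantic space.
rng = np.random.default_rng(0)
reps_model = rng.normal(size=(10, 768))                      # stand-in for model embeddings
reps_sem = reps_model + 0.1 * rng.normal(size=(10, 768))     # a closely related space

print(rsa_similarity(reps_model, reps_sem))  # high rho: the spaces share geometry
```

Because the comparison happens between RDMs rather than raw vectors, the two spaces may have different dimensionalities, which is what makes RSA suitable for probing a model's representations against an external notion of semantics.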

