Type Prediction With Program Decomposition and Fill-in-the-Type Training

by Federico Cassano, et al.

TypeScript and Python are two programming languages that support optional type annotations, which are useful but tedious to introduce and maintain. This has motivated automated type prediction: given an untyped program, produce a well-typed output program. Large language models (LLMs) are promising for type prediction, but there are challenges: fill-in-the-middle performs poorly, programs may not fit into the context window, generated types may not type check, and it is difficult to measure how well-typed the output program is. We address these challenges by building OpenTau, a search-based approach for type prediction that leverages large language models. We propose a new metric for type prediction quality, give a tree-based program decomposition that searches a space of generated types, and present fill-in-the-type fine-tuning for LLMs. We evaluate our work with a new dataset for TypeScript type prediction, and show that 47.4% of files type check, with an overall rate of 3.3 type errors per file. All code, data, and models are available at: https://github.com/GammaTauAI/opentau.
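The evaluation above is reported as aggregate figures over a set of files, such as the rate of type errors per file. As a rough illustration only, the sketch below shows how such aggregates could be computed from per-file error counts. It is not the quality metric the paper proposes, which the abstract does not define; the `FileResult` shape, the `summarize` function, and the sample data are all hypothetical.

```typescript
// Hypothetical per-file result, e.g. collected by running a type checker
// on each predicted program. Names here are illustrative assumptions.
interface FileResult {
  path: string;
  typeErrors: number; // number of type errors reported for this file
}

interface Summary {
  percentTypeCheck: number; // share of files with zero type errors, in percent
  errorsPerFile: number;    // mean number of type errors per file
}

// Aggregate per-file error counts into the two summary figures.
function summarize(results: FileResult[]): Summary {
  if (results.length === 0) {
    // Assumption: an empty dataset yields zeros rather than NaN.
    return { percentTypeCheck: 0, errorsPerFile: 0 };
  }
  const clean = results.filter((r) => r.typeErrors === 0).length;
  const totalErrors = results.reduce((sum, r) => sum + r.typeErrors, 0);
  return {
    percentTypeCheck: (100 * clean) / results.length,
    errorsPerFile: totalErrors / results.length,
  };
}

// Made-up sample data, not results from the paper.
const example: FileResult[] = [
  { path: "a.ts", typeErrors: 0 },
  { path: "b.ts", typeErrors: 4 },
  { path: "c.ts", typeErrors: 0 },
  { path: "d.ts", typeErrors: 2 },
];

const summary = summarize(example);
console.log(summary);
```

Reporting both figures together is useful because a file counts as type checking only when it has zero errors, so the per-file error rate still distinguishes near misses from badly mistyped files.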




