Formalizing text editors in Coq

06/05/2020 ∙ by Boro Sitnikovski, et al.

Text editors are among the fundamental tools of writers such as software developers, book authors, and mathematicians. A text editor must work as intended so that its users can do their job. We start by introducing a small subset of a full-blown text editor, the line editor. Next, we give a concrete definition (specification) of what it means for a text editor to be complete. Afterward, we provide an implementation of a line editor in Coq and prove that it is a complete text editor.




1 Introduction

A line editor is a text editor that works in REPL mode. It accepts a small set of commands, each of which operates on one or more lines of text. The most popular line editor is Unix ed [1]. The following short session shows the editor being driven from a Unix shell:

$ ed example.txt
> i
> Hello World!
> Line two
> .
> n
2  Line two
> 1
> n
1  Hello World!
> d
> n
1  Line two
> w
> q

The session opens the file example.txt. The commands used are:

  • The command i enters insertion mode; the lines that follow are taken as the content to insert.

  • The command . exits the insertion mode.

  • The command n prints the current line number along with the contents of that line.

  • Inputting a number as a command will set the line pointer to that number.

  • The command d deletes the current line.

  • The command w writes the contents to the file.

  • The command q quits the editor.

A character editor is more general. Line editors, however, are much more convenient, especially in REPL mode: in a character editor the user would have to keep track of the position of every character in order to read, insert, or delete it.

Coq is a proof assistant: a functional programming language together with a logic for stating and proving properties of programs, designed with software correctness in mind [2].

2 Specification

Before we start formalizing editors, we will provide some definitions.

Definition 2.1.

A text editor is complete if it has the functionality to read, insert, and delete text at any position.

The next definition is one that we will rely on throughout. Coq's standard library already supports it.

Definition 2.2.

Strings (lists of characters) can be inserted (created), read, and changed.

We use the word "change" here, but nothing is mutated in Coq: every operation simply returns a new (updated) string.

Definition 2.3.

A line editor contains a buffer, that is, a list of strings.

Given these definitions, we can proceed to implement them in Coq. The implementation in this paper is a line editor, yet a single character in a line can still be changed by deleting that line and inserting a new line with the character changed. Thus, the editor that we implement is complete according to the specification.

2.1 Coq definitions

The editor has to be able to read a line (i.e. get n-th element of a list):

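(* Assumed preamble; the paper does not list its imports. *)
Require Import String List Arith. Open Scope string_scope. Open Scope list_scope.
Definition readLine {X : Type} (b : list X) (pos : nat) (d : X) : X :=
  nth pos b d.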

Further, the editor has to be able to insert a line (i.e. put an element in a list at a specific position):

Definition insertLine {X : Type} (b : list X) (pos : nat) (s : X) : (list X) :=
  firstn pos b ++ s :: nil ++ skipn pos b.

Finally, the editor needs to be able to delete a line (i.e. keep the first n elements of the list and append everything after the first n+1 elements):

Definition deleteLine {X : Type} (b : list X) (pos : nat) : (list X) :=
  firstn pos b ++ skipn (pos + 1) b.
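To make the behaviour of the three operations concrete, the sketch below evaluates them on a small buffer. It repeats the definitions from the text so that it checks on its own; the buffer `sample` and the example names are hypothetical, and natural numbers stand in for lines since the definitions are polymorphic.

```coq
Require Import Coq.Lists.List.
Import ListNotations.

(* Definitions repeated from the text so the block checks on its own. *)
Definition readLine {X : Type} (b : list X) (pos : nat) (d : X) : X :=
  nth pos b d.
Definition insertLine {X : Type} (b : list X) (pos : nat) (s : X) : list X :=
  firstn pos b ++ s :: nil ++ skipn pos b.
Definition deleteLine {X : Type} (b : list X) (pos : nat) : list X :=
  firstn pos b ++ skipn (pos + 1) b.

(* Hypothetical sample buffer; numbers stand in for lines. *)
Definition sample : list nat := [10; 20; 30].

(* Reading inside the buffer returns the stored element. *)
Example read_in_range : readLine sample 1 0 = 20.
Proof. reflexivity. Qed.

(* Reading past the end returns the default value d. *)
Example read_out_of_range : readLine sample 5 0 = 0.
Proof. reflexivity. Qed.

(* Inserting shifts the remaining elements one position to the right. *)
Example insert_middle : insertLine sample 1 15 = [10; 15; 20; 30].
Proof. reflexivity. Qed.

(* Inserting past the end appends, because firstn and skipn truncate. *)
Example insert_past_end : insertLine sample 7 40 = [10; 20; 30; 40].
Proof. reflexivity. Qed.

(* Deleting removes exactly the element at the given position. *)
Example delete_middle : deleteLine sample 1 = [10; 30].
Proof. reflexivity. Qed.
```

Positions are zero indexed, and an out-of-range position never fails: reads fall back to the default, inserts append, and deletes leave the buffer unchanged.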

We will wrap all of that in a single EditorEval to make it more convenient:

Inductive Command {X : Type} : Type :=
  | ReadLine : nat -> X -> Command
  | InsertLine : nat -> X -> Command
  | DeleteLine : nat -> X -> Command.

(* Returns the updated buffer together with the line that was read,
   inserted, or deleted. The X carried by ReadLine and DeleteLine is the
   default returned when the position is out of range. *)
Definition EditorEval {X : Type} (cmd : @Command X) (b : list X) : (list X * X) :=
  match cmd with
  | ReadLine   pos d => pair b (readLine b pos d)
  | InsertLine pos s => pair (insertLine b pos s) s
  | DeleteLine pos d => pair (deleteLine b pos) (readLine b pos d)
  end.
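As an illustration of how EditorEval might be driven, the sketch below replays the buffer changes of the ed session from the introduction. The helper `runAll` is hypothetical and not part of the paper; it threads the buffer through a list of commands and discards the returned lines. The scope commands at the top are an assumption about the environment.

```coq
Require Import Coq.Strings.String Coq.Lists.List.
Import ListNotations.
Open Scope string_scope.
Open Scope list_scope.

(* Definitions repeated from the text so the block checks on its own. *)
Definition readLine {X : Type} (b : list X) (pos : nat) (d : X) : X :=
  nth pos b d.
Definition insertLine {X : Type} (b : list X) (pos : nat) (s : X) : list X :=
  firstn pos b ++ s :: nil ++ skipn pos b.
Definition deleteLine {X : Type} (b : list X) (pos : nat) : list X :=
  firstn pos b ++ skipn (pos + 1) b.

Inductive Command {X : Type} : Type :=
  | ReadLine : nat -> X -> Command
  | InsertLine : nat -> X -> Command
  | DeleteLine : nat -> X -> Command.

Definition EditorEval {X : Type} (cmd : @Command X) (b : list X) : (list X * X) :=
  match cmd with
  | ReadLine   pos d => pair b (readLine b pos d)
  | InsertLine pos s => pair (insertLine b pos s) s
  | DeleteLine pos d => pair (deleteLine b pos) (readLine b pos d)
  end.

(* Hypothetical helper: run commands left to right, keeping only the buffer. *)
Definition runAll (cmds : list (@Command string)) (b : list string)
  : list string :=
  fold_left
    (fun (buf : list string) (cmd : @Command string) => fst (EditorEval cmd buf))
    cmds b.

(* Insert two lines into an empty buffer, then delete the first one. *)
Example ed_session :
  runAll [InsertLine 0 "Hello World!"; InsertLine 1 "Line two"; DeleteLine 0 ""] []
  = ["Line two"].
Proof. reflexivity. Qed.

(* A single DeleteLine also reports the line it removed. *)
Example delete_reports_line :
  EditorEval (DeleteLine 0 "") ["Hello World!"; "Line two"]
  = (["Line two"], "Hello World!").
Proof. reflexivity. Qed.
```

Unlike ed, this model has no current-line pointer: every command carries its own zero-indexed position.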

3 Formal proofs

3.1 Lemmas

In this subsection, we will provide the lemmas that will be used by our proofs.

The following lemma states that the length of the first $n$ elements of a list that contains at least $n$ elements is $n$.

Lemma lemma_1 : forall {X:Type} (l : list X) (n : nat),
  n <= length l -> length (firstn n l) = n.
Proof.
  exact firstn_length_le.
Qed.
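A concrete instance may help show why the hypothesis on the length is needed. The sketch below uses only the standard library; the example names are hypothetical.

```coq
Require Import Coq.Lists.List.
Import ListNotations.

(* Within bounds: taking 2 of 3 elements yields a list of length 2. *)
Example firstn_within : length (firstn 2 [1; 2; 3]) = 2.
Proof. reflexivity. Qed.

(* Past the end: firstn truncates, so the length is 3 rather than 5. *)
Example firstn_past_end : length (firstn 5 [1; 2; 3]) = 3.
Proof. reflexivity. Qed.

(* The same fact obtained from the library lemma; the side condition
   2 <= 3 is discharged with the constructors of <=. *)
Example firstn_via_lemma : length (firstn 2 [1; 2; 3]) = 2.
Proof.
  apply firstn_length_le. simpl. repeat constructor.
Qed.
```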

The next lemma states that whenever $n = m$, we can deduce $n \geq m$.

Lemma lemma_2 : forall n m, n = m -> n >= m.
Proof.
  intros n m H.
  induction H.
  unfold ge. apply le_n.
Qed.

Finally, lemma_3 states that when a list $l_1$ of length $n$ is concatenated with another list $l_2$ with an element $s$ in between, the $n$-th element of the concatenated list will be $s$ (zero indexed).

Lemma lemma_3 : forall {X:Type} n l1 l2 (s:X) d, length l1 = n -> s = nth n (l1 ++ s :: l2) d.
Proof.
  intros X n l1 l2 s d length_l1_eq_n.
  assert (n_eq_length_l1_to_gte_l1 := (lemma_2 n (length l1))).
  assert (n_gte_l1 := n_eq_length_l1_to_gte_l1 (eq_sym length_l1_eq_n)).
  rewrite (app_nth2 l1 (s::l2) d n_gte_l1).
  rewrite length_l1_eq_n.
  rewrite (Nat.sub_diag n).
  reflexivity.
Qed.

The theorem thm_1 is a combination of lemma_3 and lemma_1.

Theorem thm_1 : forall {X:Type} (n : nat) (l1 l2 : list X) (s : X) (d : X), n <= length l1 -> s = nth n (firstn n l1 ++ s :: l2) d.
Proof.
  intros x n l1 l2 s d n_lt_length_l.
  apply lemma_3. exact (lemma_1 l1 n n_lt_length_l).
Qed.
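The statements of lemma_3 and thm_1 can be checked on small lists. The sketch below is only an illustration with hypothetical example names; it evaluates both patterns and shows how the conclusion can fail once the bound on $n$ is dropped.

```coq
Require Import Coq.Lists.List.
Import ListNotations.

(* lemma_3 pattern: l1 has length 2, so index 2 of l1 ++ s :: l2 is s. *)
Example nth_after_prefix : nth 2 ([10; 20] ++ 99 :: [30; 40]) 0 = 99.
Proof. reflexivity. Qed.

(* thm_1 pattern: l1 is first truncated to its first n elements. *)
Example nth_after_firstn :
  nth 1 (firstn 1 [10; 20; 30] ++ 99 :: [40]) 0 = 99.
Proof. reflexivity. Qed.

(* Without n <= length l1 the claim can fail: firstn 5 keeps only 3
   elements, so index 5 is past the end and nth returns the default 0. *)
Example nth_without_bound :
  nth 5 (firstn 5 [10; 20; 30] ++ 99 :: nil) 0 = 0.
Proof. reflexivity. Qed.
```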

3.2 Proofs

The line editor can insert any text: for every string $s$ and position $n$, there exists a buffer $b$ such that inserting $s$ into $b$ at position $n$ yields a buffer that contains $s$.

Theorem can_insert_text : forall (s : string) (n : nat), exists (b : list string), fst (EditorEval (InsertLine n s) b) = s :: nil.
Proof.
  intros s n.
  unfold insertLine.
  exists nil. simpl.
  case n.
    - simpl. reflexivity.
    - intros. simpl. reflexivity.
Qed.

Next, we will prove that the line editor can read any text: for all strings $s$, positions $n$, and buffers $b$ whose length is at least $n$, inserting $s$ at position $n$ and then reading position $n$ returns $s$.

Theorem can_read_text : forall (s : string) (n : nat) (b : list string), n <= List.length b -> snd (EditorEval (ReadLine n "") (fst (EditorEval (InsertLine n s) b))) = s.
Proof.
  intros s n b n_lt_buffer.
  unfold insertLine. unfold readLine.
  assert (a := thm_1 n b (nil ++ skipn n b) s "").
  exact (a n_lt_buffer).
Qed.
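The role of the hypothesis that $n$ does not exceed the buffer length can be seen on a two-line buffer. The sketch below calls readLine and insertLine directly rather than going through EditorEval, which is a simplification; the example names are hypothetical and the scope commands are an assumption about the environment.

```coq
Require Import Coq.Strings.String Coq.Lists.List.
Import ListNotations.
Open Scope string_scope.
Open Scope list_scope.

(* Definitions repeated from the text so the block checks on its own. *)
Definition readLine {X : Type} (b : list X) (pos : nat) (d : X) : X :=
  nth pos b d.
Definition insertLine {X : Type} (b : list X) (pos : nat) (s : X) : list X :=
  firstn pos b ++ s :: nil ++ skipn pos b.

(* In range: the inserted line is read back at the same position. *)
Example read_back_in_range :
  readLine (insertLine ["a"; "b"] 1 "x") 1 "" = "x".
Proof. reflexivity. Qed.

(* Boundary case n = length b: the line is appended and still read back. *)
Example read_back_at_end :
  readLine (insertLine ["a"; "b"] 2 "x") 2 "" = "x".
Proof. reflexivity. Qed.

(* Out of range: the line lands at index 2, so reading index 5 yields
   the default "" instead of "x". *)
Example read_back_out_of_range :
  readLine (insertLine ["a"; "b"] 5 "x") 5 "" = "".
Proof. reflexivity. Qed.
```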

Finally, we prove that the line editor can change any text. That is, there exists a function $f$ that "changes" the value read at a given line from $s_1$ to $s_2$.

In the code, $f$ is defined as a combination of deletion and insertion.

Theorem can_change_text : forall (s1 s2 : string) (n : nat) (b : list string), n <= List.length b -> s1 = snd (EditorEval (ReadLine n "") b) -> s2 = (snd (EditorEval (ReadLine n "") (fst (EditorEval (InsertLine n s2) (fst (EditorEval (DeleteLine n "") b)))))).
Proof.
  intros s1 s2 n b n_lt_b. simpl. intros s1_smth.
  unfold readLine. unfold insertLine.
  assert (a := thm_1 n (deleteLine b n) (skipn n (deleteLine b n)) s2 "").
  case a.
    - unfold deleteLine. simpl.
      rewrite (app_length (firstn n b) (skipn (n + 1) b)).
      rewrite (lemma_1 b n n_lt_b).
      exact (Nat.le_add_r n (List.length (skipn (n + 1) b))).
    - reflexivity.
Qed.
4 Conclusion

We showed how to formally prove the functionality of text editors. We used line editors, but the same idea can be applied generally to text editors. We defined what a complete text editor means, and mapped those functionalities to Coq definitions. Most (if not all) text editors will use the same specifications. Further work can be focused on formalizing full-blown text editors, such as vim [3], or more specifically VimScript.


References

  • [1] Brian W. Kernighan. A Tutorial Introduction to the UNIX Text Editor. Bell Laboratories, New Jersey, 1997.
  • [2] Bruno Barras, Samuel Boutin, Cristina Cornes, Judicaël Courant, Jean-Christophe Filliâtre, et al. The Coq Proof Assistant Reference Manual: Version 6.1. Research Report RT-0203, INRIA, 1997.
  • [3] Bram Moolenaar. VIM-Vi IMproved. 2006.