A Formalization of SQL with Nulls

03/25/2020
by   Wilmer Ricciotti, et al.

SQL is the world's most popular declarative language, forming the basis of the multi-billion-dollar database industry. Although SQL has been standardized, the full standard is based on ambiguous natural language rather than formal specification. Commercial SQL implementations interpret the standard in different ways, so that, given the same input data, the same query can yield different results depending on the SQL system it is run on. Even for a particular system, mechanically checked formalization of all widely used features of SQL remains an open problem. The lack of a well-understood formal semantics makes it very difficult to validate the soundness of database implementations. Although formal semantics for fragments of SQL were designed in the past, they usually did not support set and bag operations, nested subqueries, and, crucially, null values. Null values complicate SQL's semantics in profound ways, analogous to null pointers or side effects in other programming languages. Since certain SQL queries are equivalent in the absence of null values but produce different results when applied to tables containing incomplete data, semantics that ignore null values can prove query equivalences that are unsound in realistic databases. A formal semantics of SQL supporting all the aforementioned features was only proposed recently. In this paper, we report on our mechanization of SQL semantics covering set/bag operations, nested subqueries, and nulls, written in the Coq proof assistant, and describe the validation of key metatheoretic properties.
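As an illustration of the claim that some queries agree without nulls but diverge once nulls are present, here is a minimal sketch using Python's standard `sqlite3` module. The tables `r` and `s`, their contents, and the two queries are hypothetical examples chosen for this note and are not taken from the paper. SQLite is used only as one convenient engine; as the abstract notes, other systems may behave differently.

```python
import sqlite3

# Two queries that are often treated as interchangeable.
# Both the schema and the queries are illustrative assumptions.
NOT_IN = "SELECT a FROM r WHERE a NOT IN (SELECT b FROM s) ORDER BY a"
NOT_EXISTS = (
    "SELECT a FROM r "
    "WHERE NOT EXISTS (SELECT 1 FROM s WHERE s.b = r.a) ORDER BY a"
)


def run_both(s_values):
    """Load r = {1, 2} and s = s_values, then run both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (a INTEGER)")
    conn.execute("CREATE TABLE s (b INTEGER)")
    conn.executemany("INSERT INTO r VALUES (?)", [(1,), (2,)])
    # A Python None is stored as SQL NULL.
    conn.executemany("INSERT INTO s VALUES (?)", [(v,) for v in s_values])
    not_in = [row[0] for row in conn.execute(NOT_IN)]
    not_exists = [row[0] for row in conn.execute(NOT_EXISTS)]
    conn.close()
    return not_in, not_exists


# Without nulls the two queries return the same rows.
without_nulls = run_both([2])
# With a NULL in s, "1 NOT IN (2, NULL)" evaluates to unknown,
# so the NOT IN query drops the row that NOT EXISTS keeps.
with_nulls = run_both([2, None])

print("no nulls:  ", without_nulls)
print("with nulls:", with_nulls)
```

Under SQLite, the two result lists coincide when `s` holds no nulls and differ once a null is inserted, so a rewrite from one form to the other is only sound under a semantics that accounts for three-valued logic.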

research
03/16/2022

Translating Canonical SQL to Imperative Code in Coq

SQL is by far the most widely used and implemented query language. Yet, ...
research
05/28/2019

One SQL to Rule Them All

Real-time data analysis and management are increasingly critical for tod...
research
06/06/2018

Extended Diffix

A longstanding open problem is that of how to get high quality statistic...
research
10/25/2019

Rumble: data independence when data is in a mess

This paper introduces Rumble, an engine that executes JSONiq queries on ...
research
10/04/2021

Prolog as a Querying Language for MongoDB

Today's database systems have shown to be capable of supporting AI appli...
research
07/23/2021

Comprehending nulls

The Nested Relational Calculus (NRC) has been an influential high-level ...
research
12/12/2021

Graph Pattern Matching in GQL and SQL/PGQ

As graph databases become widespread, JTC1 – the committee in joint char...
