Antidote SQL: Relaxed When Possible, Strict When Necessary

02/10/2019
by   Pedro Lopes, et al.

Geo-replication poses an inherent trade-off between low latency, high availability and strong consistency. While NoSQL databases favor low latency and high availability, relaxing consistency, more recent cloud databases favor strong consistency and ease of programming, while still providing high scalability. In this paper, we present Antidote SQL, a database system that allows application developers to relax SQL consistency when possible. Unlike NoSQL databases, our approach enforces primary key, foreign key and check SQL constraints even under relaxed consistency, which is sufficient for guaranteeing the correctness of many applications. To this end, we define concurrency semantics for SQL constraints under relaxed consistency and show how to implement such semantics efficiently. For applications that require strict SQL consistency, Antidote SQL provides support for such semantics at the cost of requiring coordination among replicas.

1 Introduction

SQL databases have been the de facto standard for storing and managing data for many years. With the advent of cloud computing, and the need to scale applications to millions of users worldwide, new storage systems were designed that offered improved latency, availability and scalability over traditional SQL databases, giving rise to the NoSQL movement [15, 11].

To provide such properties, these NoSQL systems exhibit some weaknesses: (i) they only provide weak forms of consistency, which makes it difficult to ensure database integrity and application correctness; (ii) many of these systems only provide a key-value interface, which makes it difficult to model and query data efficiently.

These issues have led to a renewed interest in SQL, with the proposal of new designs that provide SQL semantics – Spanner [10], Aurora [28], CosmosDB [2] and VoltDB [3] are some recent examples of such databases. These systems offer practical high availability and scalability, but they are unable to ensure low latency at a global scale, as they rely on some form of consensus [16] to ensure consistency across sites. To achieve low latency and high availability, it remains necessary to resort to weak consistency.

In this paper, we propose to allow programmers to relax SQL consistency when possible, while keeping stricter consistency when necessary. Some systems [18, 26] provide an API with operations that run under weak or strong consistency, which could be used for this purpose. However, it has been shown that it is difficult to identify which operations need to execute under each consistency model, with several methodologies and tools being proposed to help programmers in this process [18, 17, 5, 13, 24].

We adopt a different approach: use the database schema to specify the degree of concurrency allowed. With our concurrency-aware database schema, programmers identify which data items can be modified concurrently and what should be the outcome of such concurrent updates. Additionally, they also specify which database constraints should be maintained and the degree of concurrency allowed while enforcing them. The database system is then responsible for efficiently enforcing the defined data model, minimizing the coordination used. This approach gives full control to programmers, as they explicitly define when and how SQL consistency can be relaxed. In any case, our approach enforces database constraints, which is often sufficient for guaranteeing application correctness.

An important part of our work is the definition of sensible semantics when handling concurrent updates. For the outcome of concurrent updates to the same data items, we have built on previous works [17, 25], allowing programmers to select the appropriate merge policy. For supporting database constraints, including primary key, check and foreign key constraints, we propose alternative semantics for dealing with concurrent updates. While some semantics adopt an eventual consistency approach that poses no restriction to concurrent updates by applying pre-defined merge policies, other semantics restrict some concurrent updates. Nonetheless, in the latter case, a high degree of concurrency is still possible.

Implementing our approach, Antidote SQL, in a geo-replicated setting, is challenging, as data can be partitioned across multiple nodes in each data center. First, enforcing referential integrity might involve relations between data stored in different nodes, which could require complex coordination among nodes for maintaining the database constraints. We have devised a set of algorithms that avoid the need for coordination among multiple nodes, thus leading to a simple and efficient solution. Second, while adopting semantics that restrict concurrency, it is important not to be over-restrictive. Our proposed semantics and supporting algorithms achieve this goal.

In summary, this paper makes the following contributions:

  • A database schema that lets programmers control when and how SQL consistency can be relaxed.

  • The definition of sensible semantics for enforcing SQL constraints under weak consistency.

  • A set of algorithms for enforcing the defined concurrency semantics.

2 System overview

Antidote SQL is designed to run in cloud infrastructures composed of multiple data centers, each with multiple nodes. Each data center fully replicates the database. Inside each data center, data is sharded, with each shard being replicated in a small number of nodes.

Antidote SQL provides a SQL-like interface, AQL, to applications. Applications define the database schema using the AQL data definition language (DDL). AQL DDL extends SQL DDL by allowing programmers to specify the concurrency semantics for the database. This concurrency semantics includes specifying what concurrency is allowed when accessing the database and what should be the outcome of concurrent updates.

Applications access the database by issuing transactions that include a sequence of standard SQL statements, including the select statement for querying the database and insert, update and delete statements for updating the database.

AQL transactions run under parallel snapshot isolation (PSI) semantics [27] extended with integrity constraints. PSI is an extension of snapshot isolation (SI) for geo-replicated settings. Like SI, PSI precludes write-write conflicts between concurrent transactions, unless they are writes to mergeable data types. However, unlike SI, PSI allows different sites to order transactions differently, as long as the order preserves causal ordering: if a transaction T2 reads from T1, then T1 must be ordered before T2 at every data center. Under PSI, all operations of a transaction running at a given site read the most recent version committed at that site as of the time the transaction began.

We extend PSI to enforce integrity constraints. Under this model, at every site, all snapshots preserve the integrity constraints defined in the database schema, including primary key, check and foreign key constraints. As discussed later, integrity constraints can be enforced using both optimistic and pessimistic approaches, with the former being a highly available solution that solves conflicts according to the user defined policy (see Section 3.2).

3 Concurrency Semantics

Antidote SQL allows programmers to control the allowed concurrency among transactions through the database schema. When concurrency is allowed, an important aspect is the concurrency semantics, which defines the outcome in the presence of concurrent updates. This section discusses the supported concurrency semantics.

3.1 Database Model

Antidote SQL supports a relational data model, where data is stored in tables with a given schema. We now present the options for controlling concurrency associated with each table.

Semantics for update-delete: When creating a table, programmers can specify whether it will be possible to concurrently update and delete a table row. AQL provides three possible semantics (Figure 1): update-wins, delete-wins and no concurrency (if no modifier is specified). In the update-wins semantics, when concurrent transactions execute a delete and an update operation over the same row, the effects of the delete over that row are ignored. In the delete-wins semantics, the effect of the delete will prevail and the row is deleted. In the no concurrency semantics, concurrent transactions cannot execute a delete and an update operation over the same row.

The first two semantics lead to a lost update, as one of the operations will be ignored. However, the final state of the database depends only on the type of operations concurrently executed, and not on an arbitrary order among updates established at runtime, as it is the case for example in last-writer-wins solutions [20].

CREATE [UPDATE_WINS|DELETE_WINS] TABLE table_name(
  column1 datatype [constraint],
  column2 datatype [constraint],
  ...
  column_n datatype [constraint]
)
Figure 1: AQL create table statement.

Semantics for update-update: Programmers can specify which updates can be made concurrently to the same row when defining the table schema. To this end, AQL provides the following modifiers for the columns (Figure 2): last-writer-wins, multi-value and additive.

generic_modifier ::= LWW | MULTI_VALUE
numeric_modifier ::= generic_modifier | ADDITIVE
Figure 2: Modifiers for AQL data types.

In the last-writer-wins semantics, when concurrent updates modify the same row, the value of the last update (as ordered by the wall clock) prevails. In the multi-value semantics, when concurrent updates modify the same row, the database stores both values. This option should be used carefully, as it affects the result returned by select operations, with multiple values being returned (instead of a single one). Finally, the additive semantics, intended for numeric data types, allows the final state to merge all updates to the numeric value. Thus, given two concurrent update operations that add n1 and n2 to a column, the final database state will have the initial value of the column incremented by n1 + n2.
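The three column-level merge policies can be illustrated with a minimal Python sketch. This is not AQL or AntidoteDB code: the `Write` record, the function names and the tie-breaking rule for equal timestamps are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    """A concurrent update to one column (hypothetical record type)."""
    value: object
    timestamp: float  # wall-clock time of the update, assumed to be available


def merge_lww(a: Write, b: Write) -> object:
    """Last-writer-wins: keep the value with the larger timestamp.

    Ties go to `a` here; a real system needs a deterministic tie-break
    that every replica applies in the same way.
    """
    return a.value if a.timestamp >= b.timestamp else b.value


def merge_multi_value(a: Write, b: Write) -> set:
    """Multi-value: keep both concurrently written values."""
    return {a.value, b.value}


def merge_additive(initial: int, delta_a: int, delta_b: int) -> int:
    """Additive: apply both concurrent deltas to the initial value."""
    return initial + delta_a + delta_b
```

For instance, `merge_additive(10, 3, -2)` yields 11 regardless of the order in which the two deltas arrive, which is why no coordination is needed for this policy.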

If no modifier is used for a given column, the system will not allow concurrent updates that modify this column in the same row. Updates that modify this column in different rows are allowed.

The semantics of update-update also control whether it is possible to concurrently insert a row with the same primary key. If all columns (besides the primary key) have a concurrency modifier, concurrent inserts are allowed, with the final state being determined by using the defined semantics for each column.

3.2 Integrity Constraints

In the previous section, we presented the options for controlling concurrency among multiple clients by restricting concurrent updates to the same data items or adopting appropriate merge policies. We now present the semantics for controlling concurrent accesses that may lead to a database constraint violation (Figure 3 presents the syntax for specifying constraints).

constraint ::=
    PRIMARY KEY |
    CHECK (condition) |
    FOREIGN KEY [UPDATE_WINS|DELETE_WINS]
      REFERENCES table(column) [ON DELETE CASCADE]
Figure 3: Integrity constraints supported by AQL.

Primary key constraint: The primary key constraint is used to guarantee that the value of the primary key column is unique for each row in the table. We support two alternative approaches to guarantee this constraint.

First, if some column of the table (other than the primary key) includes the no concurrency semantics, no concurrent inserts will be allowed.

Second, if all columns (besides the primary key) include a concurrency semantics, AQL will allow multiple insert operations to be executed concurrently. The final value of each column is determined according to its concurrency semantics.

Both approaches guarantee that a single row with a given primary key exists, with the former restricting concurrency. One practical aspect that is important for primary keys is how applications concurrently generate different primary keys. To this end, AQL provides two functions, one returning a unique identifier and the other a sequential unique identifier (encoded as a number).

Check constraint: The check constraint specifies that the value of a column must respect a given condition. For example, the check constraint can be used to guarantee that the stock of some product is not negative.

AQL allows programmers to specify check constraints for any column. For numeric additive columns, AQL allows the value of the column to be updated concurrently, when it is possible to guarantee that the updates will not make the condition false. As detailed later, to support this constraint, our prototype relies on escrow techniques [23].

A transaction running at a site aborts if an update that modifies the value of a column that has a check constraint may lead to an invariant violation. The database will return to the application information that allows the programmer to know if the transaction might commit if retried.

Foreign key constraint: The foreign key constraint relates entries from different tables, by making the values of a column in one table uniquely identify rows in some other table.

Foreign key constraints are particularly challenging in our system model, as a constraint violation can result from concurrent updates in different tables. Consider the example of Figure 4. In this example, the database contains two tables, Artists and Albums, where Albums has a foreign key in column Artist that references the Artists table. In the example, starting in a database state where only artist Sam exists, two transactions concurrently delete the artist and add an album for the artist. When combining the effects of the two transactions, we would reach a state with an album referring to a deleted artist, leading to a violation of the foreign key constraint.

Figure 4: Example of foreign key constraint violation.

AQL supports the following concurrency semantics for handling updates that affect a foreign key constraint: update-wins semantics, delete-wins semantics and no concurrency.

In the update-wins semantics, when concurrently deleting a row r and inserting a row that references r, the delete has no effect in the final database state – Figure 5(a) shows the effect of update-wins in the previous example. Conversely, in the delete-wins semantics, it is the insert operation that has no effect in the final database state – Figure 5(b) shows the effect of delete-wins in the previous example.

In the no concurrency semantics, the system will not allow the concurrent deletion of a row r and the insertion of some row that references r. We note that in this case, it is still possible to have multiple concurrent inserts that reference the same row.

(a) Update-wins semantics.
(b) Delete-wins semantics.
Figure 5: Semantics for solving foreign key constraint violations.

We now discuss the case when a foreign key is defined with the ON DELETE CASCADE behavior. Consider the example of Figure 6, which starts with a database state including artist Sam with an album A0. A transaction adds album A1 for Sam, while a concurrent transaction deletes artist Sam. The cascading effect leads to the deletion of album A0 (the only album known at the site where the delete was executed). Combining the effects of both transactions leads to a foreign key constraint violation, with album A1 referring to the deleted artist Sam.

Figure 6: Example of foreign key with cascading constraint violation.

With cascading, the delete-wins semantics has exactly the same behavior as before, causing the concurrent insert to have no effect – as shown in Figure 7(b), the final database state does not include album A1.

For the update-wins semantics, different alternatives could be considered. First, the delete operation could have no effect – in this case, the final database state would include both albums A0 and A1. Second, for the delete operation, only the effects that would lead to a foreign key violation would be ignored – in this case, the final database state would include only album A1. We chose the latter option, as it is the one where the fewest effects are ignored – Figure 7(a) exemplifies this case. The general rule adopted in AQL is the following: when the effects of an operation are ignored due to a concurrent operation, we try to minimize the effects ignored.

(a) Update-wins semantics.
(b) Delete-wins semantics.
Figure 7: Semantics for solving foreign key constraint violations.

3.3 Discussion

AQL allows programmers to have full control of when and how to relax SQL semantics by specifying the degree and outcome of concurrency allowed in the database schema. For data that is critical to application correctness, the programmer can select to forbid concurrent accesses, thus keeping strict SQL consistency, or to allow concurrent accesses given that database constraints are maintained using appropriate semantics.

For example, consider a database for an on-line shop. For guaranteeing that some product is not oversold, the programmer can use a check constraint that achieves this goal while allowing concurrent updates to commit while there is plenty of stock available.

As in any other database, one can expect that a large number of foreign key constraints exist. Consider the following foreign keys for a shopping cart: the shopping cart refers to the client that owns it; a shopping cart entry refers to a shopping cart and to a product. The programmer can use the AQL no concurrency semantics to prevent concurrent updates that would break the foreign key constraints and ultimately the application correctness. Alternatively, she could use the delete-wins semantics to guarantee that, when a shopping cart is deleted, all information associated with the shopping cart is also deleted despite any concurrent updates. This would typically lead to the expected result for the application.

4 Algorithms and Prototype

The concurrency semantics presented in the previous section allow programmers to control the degree of concurrency allowed in their applications and to reason about the behavior of their applications when concurrency is allowed. In this section, we briefly present the algorithms we have developed for efficiently implementing the proposed concurrency semantics in a geo-replicated setting, where data is partitioned across multiple nodes in each data center.

Antidote SQL is the SQL interface for AntidoteDB (http://antidotedb.org), a geo-replicated transactional key-value store with CRDT objects [25] and highly-available transactions via PSI (see Section 2). For mapping the relational data to AntidoteDB’s interface and supporting SQL operations efficiently, our prototype uses techniques that have been employed in other SQL interfaces for key-value stores. Each row of a table is mapped to a key/value pair, where the key is built from the table name and primary key, and the value stores the contents of the row. For supporting queries efficiently, our prototype maintains a primary key index and secondary indexes (if the programmer creates such indexes). We now focus on how to support the AQL concurrency semantics efficiently. Due to space limitations, we omit here the aspects related to the interaction between index maintenance and concurrent updates to indexed values, and to garbage collection, which is performed asynchronously.
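The row-to-key/value mapping described above can be sketched as follows. The key format (table name and primary key joined by `/`) and the function names are assumptions for illustration; the text only states that the key is built from the table name and the primary key.

```python
def row_key(table: str, primary_key: object) -> str:
    """Build a store key from the table name and primary key (assumed format)."""
    return f"{table}/{primary_key}"


def row_to_kv(table: str, pk_column: str, row: dict) -> tuple:
    """Map one relational row to a (key, value) pair.

    The value keeps the full contents of the row, including the primary key.
    """
    return row_key(table, row[pk_column]), dict(row)
```

Under these assumptions, the album A0 of artist Sam in table Albums would be stored under the key `Albums/A0`.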

Multi-level locks: For supporting the no concurrency semantics, our prototype resorts to a distributed implementation of a multi-level lock (MLL) [5], with two modes: shared and exclusive. Each lock is controlled in two levels. First, the lock can be owned in exclusive mode by a single data center or in shared mode by any set of data centers. Second, an exclusive lock owned by a data center can be acquired by a single transaction running in that data center. A shared lock can be acquired by multiple transactions.
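The two-level ownership rule can be modelled with a toy, single-process Python class. It is only a sketch of the rules stated above, not the distributed protocol of [5]: the class and method names are hypothetical, and for simplicity a data center gives up its ownership as soon as its last transaction releases the lock, which a real implementation need not do.

```python
class MultiLevelLock:
    """Toy model of a two-level shared/exclusive lock (illustrative only)."""

    def __init__(self):
        self.dc_mode = None     # "shared", "exclusive" or None (unowned)
        self.dc_owners = set()  # data centers that currently own the lock
        self.tx_holders = {}    # data center -> set of transaction ids

    def acquire(self, dc: str, tx: str, mode: str) -> bool:
        if mode not in ("shared", "exclusive"):
            raise ValueError(f"unknown mode: {mode}")
        # Level 1: ownership by data centers.
        if self.dc_mode is None:
            self.dc_mode = mode
            self.dc_owners = {dc}
        elif self.dc_mode != mode:
            return False  # shared and exclusive ownership cannot coexist
        elif mode == "exclusive" and dc not in self.dc_owners:
            return False  # exclusive mode belongs to a single data center
        else:
            self.dc_owners.add(dc)
        # Level 2: acquisition by transactions inside the data center.
        holders = self.tx_holders.setdefault(dc, set())
        if mode == "exclusive" and holders and tx not in holders:
            return False  # only one transaction may hold an exclusive lock
        holders.add(tx)
        return True

    def release(self, dc: str, tx: str) -> None:
        holders = self.tx_holders.get(dc, set())
        holders.discard(tx)
        if not holders:
            self.tx_holders.pop(dc, None)
            self.dc_owners.discard(dc)
        if not self.dc_owners:
            self.dc_mode = None
```

In this model, shared acquisitions from different data centers succeed concurrently, while an exclusive request is refused until every shared holder has released the lock.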

4.1 Database Model

Update-delete semantics: The no concurrency semantics is implemented by requiring a transaction to acquire: (i) in shared mode, the locks for the primary keys of the rows modified by an update operation; and (ii) in exclusive mode, the locks for the primary keys of the rows deleted by a delete operation.

For supporting the update-wins and delete-wins semantics, we use a hidden column (visibility column) in each row to control whether the row has been deleted or not. When a delete operation is executed, the column is assigned the value D. When the row is updated (or inserted), the column is assigned the value I. This column is implemented using a multi-value register CRDT, which stores all values assigned concurrently to the register. Thus, when an update operation executes concurrently with a delete operation for the same row, the final value of the visibility column will include both D and I. For a table with the update-wins semantics, a row is considered deleted if and only if the only value of the visibility column is D. For a table with the delete-wins semantics, a row is considered deleted if and only if one of the values of the visibility column is D.

Figure 8: Example of the evolution of the visibility column for concurrent update and delete operations.
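The deletion rule for the visibility column can be written down directly. In this sketch the multi-value register is modelled as a plain Python set of the concurrently assigned values; the function name and the semantics labels are assumptions, not AQL identifiers.

```python
def is_deleted(visibility: set, table_semantics: str) -> bool:
    """Decide whether a row is deleted from its visibility values.

    `visibility` holds the values ("D" or "I") concurrently assigned to the
    hidden visibility column, as a multi-value register would expose them.
    """
    if table_semantics == "update_wins":
        # Deleted only if no concurrent update or insert survived.
        return visibility == {"D"}
    if table_semantics == "delete_wins":
        # Deleted as soon as any concurrent delete is present.
        return "D" in visibility
    raise ValueError(f"unknown semantics: {table_semantics}")
```

With a concurrent update and delete the column holds `{"D", "I"}`, so the row survives under update-wins and is deleted under delete-wins.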

Update-update semantics: For supporting the merge policies associated with each column, we build on the CRDTs supported by AntidoteDB. Thus, the last-writer-wins semantics is implemented by storing the value of the column in a last-writer-wins register CRDT. The multi-value semantics is implemented by storing the value of the column in a multi-value register CRDT. The additive semantics is implemented by storing the value of the column in a counter CRDT.

For supporting the no concurrency semantics for a column, we associate a lock with the primary key and column name. An update operation that modifies the column must acquire the lock in exclusive mode before proceeding.

4.2 Integrity Constraints

Primary key constraint: Primary key constraints can be enforced with two different approaches, as explained in Section 3.2. In the case that some column uses the no concurrency semantics, we use MLLs to enforce mutual exclusion. For generating sequential unique identifiers, we also use an MLL. For (non-sequential) unique identifiers, the identifiers are generated with a prefix per site to avoid identifier collisions.
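Generating non-sequential unique identifiers with a per-site prefix requires no coordination, as the following sketch suggests. The class name, the identifier format and the use of a local counter are assumptions; the text only states that a per-site prefix is used.

```python
import itertools


class SiteIdGenerator:
    """Generate identifiers that are unique across sites (illustrative)."""

    def __init__(self, site_id: str):
        self.site_id = site_id
        self._counter = itertools.count(1)  # local, never shared across sites

    def next_id(self) -> str:
        # Distinct site prefixes keep identifiers from different sites apart.
        return f"{self.site_id}-{next(self._counter)}"
```

Two generators created with different site identifiers can hand out identifiers concurrently without ever colliding, provided the site identifiers themselves are unique.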

In the second case, where every column (except the primary key) uses some concurrency semantics, no mechanism for preventing concurrent inserts is necessary, as each column in the row can be merged with the specified concurrency semantics.

Check constraints: To support check constraints, for all columns other than numeric additive columns, it suffices to check that the column value conforms to the specified condition when a row is inserted or the column is updated.

For columns with the additive semantics, Antidote SQL relies on the bounded counter CRDT [6] available in AntidoteDB. The bounded counter CRDT implements the Escrow model [23]: permissions are granted to each holder of the counter (a replica) to execute operations without coordination as long as the local delta on the value of the counter does not exceed some threshold (and the sum of all thresholds still meets the defined condition). This ensures that after propagating the deltas executed in each replica, the value of the counter always meets the defined constraint. If some replica needs to exceed its current threshold, it can negotiate with another replica to change its threshold.
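The escrow idea can be sketched for the invariant "value >= 0" (for example, a stock column). This toy class keeps the rights of all replicas in one object and is not AntidoteDB's bounded counter API; the class and method names are hypothetical.

```python
class BoundedCounter:
    """Toy escrow counter enforcing value >= 0 (illustrative only)."""

    def __init__(self, rights: dict):
        # replica -> number of decrements it may still perform locally
        self.rights = dict(rights)

    def value(self) -> int:
        # The counter value equals the total of the remaining rights.
        return sum(self.rights.values())

    def decrement(self, replica: str, n: int) -> bool:
        """Succeeds without coordination only if the replica holds enough rights."""
        if n < 0:
            raise ValueError("n must be non-negative")
        if self.rights.get(replica, 0) < n:
            return False
        self.rights[replica] -= n
        return True

    def increment(self, replica: str, n: int) -> None:
        """Incrementing creates new rights at the local replica."""
        if n < 0:
            raise ValueError("n must be non-negative")
        self.rights[replica] = self.rights.get(replica, 0) + n

    def transfer(self, src: str, dst: str, n: int) -> bool:
        """Models the negotiation in which one replica cedes rights to another."""
        if n < 0:
            raise ValueError("n must be non-negative")
        if self.rights.get(src, 0) < n:
            return False
        self.rights[src] -= n
        self.rights[dst] = self.rights.get(dst, 0) + n
        return True
```

Because a replica can never spend more rights than it holds, the sum of all local decrements cannot drive the counter below zero, whatever the order in which updates propagate.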

Foreign key constraint: When designing the algorithms for enforcing the different semantics supported for the foreign key constraint, an important aspect to consider is that, as shown in Figure 4, a conflict may occur due to updates performed in different tables. Thus, it is not possible to detect a conflict simply by checking the occurrence of concurrent updates to the same data item.

For the no concurrency semantics, we use MLLs to control concurrent accesses that could break the foreign key constraint, requiring a transaction to acquire: (i) an exclusive lock for deleting a row in the parent table; and (ii) a shared lock on the parent table for inserting a row in the child table. Thus, in our running example, a delete in table Artists will require an exclusive lock for the primary keys of the deleted rows; an insert (or update) in the table Albums requires a shared lock for the primary key of the referenced row. We note that this approach allows insertions to execute concurrently – in many applications, this will be the general case, thus enabling transactions to proceed concurrently in multiple data centers.

Figure 9: Evolution of visibility flags for update-wins foreign key constraint (example of Figure 7(a)).

For implementing the update-wins semantics, we resort to the visibility flags associated with each row and extend the effects of insert operations. Figure 9 exemplifies the algorithm implemented, using the example previously presented in Figure 7(a). A delete of a row that has no child will succeed, with the row marked as deleted by setting its visibility flag to D. If the row is referenced by other rows, the delete will only succeed if the foreign key constraint was declared with ON DELETE CASCADE. In this case, both the parent and child rows are marked as deleted. This can be seen in Figure 9, with the deletion of artist Sam (and cascading delete of album A0).

When inserting a row that references another row, we mark the parent row as touched, by setting its visibility flag to T – in our example, the insertion of Album A1 sets the visibility flag of artist Sam to T. By making the visibility flag T stronger than D, we can make sure that in this case the parent row will not be deleted. In our example, when merging the concurrent updates, the visibility flags associated with artist Sam include both T and D. For update-wins foreign key semantics, where T is stronger than D, this means that the row is visible (a row is visible unless its visibility flag is only D). For the albums, album A0 remains deleted and album A1 is visible, as defined in our update-wins semantics.
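The running example can be replayed with a small Python simulation. Each transaction is reduced to the set of visibility flags it writes, and merging unions the flags written concurrently to the same row, as a multi-value register would. The key names and helper functions are assumptions made for illustration.

```python
def visible(flags: set) -> bool:
    """Update-wins: a row is visible unless its only flag is D."""
    return flags != {"D"}


def merge(base: dict, writes_a: dict, writes_b: dict) -> dict:
    """Merge two concurrent write sets over a common base state.

    Flags written concurrently to the same row are unioned; rows written
    by neither transaction keep their base flags.
    """
    merged = dict(base)
    for key in writes_a.keys() | writes_b.keys():
        merged[key] = writes_a.get(key, set()) | writes_b.get(key, set())
    return merged


# Initial state: artist Sam with album A0.
base = {"Artists/Sam": {"I"}, "Albums/A0": {"I"}}
# Transaction 1: delete Sam with cascade (marks parent and child as D).
delete_sam = {"Artists/Sam": {"D"}, "Albums/A0": {"D"}}
# Transaction 2: insert album A1 and touch the parent row (flag T).
insert_a1 = {"Albums/A1": {"I"}, "Artists/Sam": {"T"}}

final = merge(base, delete_sam, insert_a1)
```

After the merge, Sam carries both D and T and is therefore visible, A0 remains deleted and A1 is visible, matching the outcome described for update-wins.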

Figure 10: Evolution of metadata for delete-wins foreign key constraint (example of Figure 7(b)).

Implementing the delete-wins semantics is more complex. While in the update-wins semantics it was possible to enforce the undo of the effects of the delete easily by forcing a conflict between the delete and the touch in the parent row, this is not possible in the delete-wins semantics. In this case, on a concurrent insertion of a child and the deletion of the parent row, the inserted child is not known at the time of the deletion.

To achieve the intended semantics, besides using the visibility flags, we extend read operations to check if the parent row has been deleted or not (often, the value was already read in the transaction and no additional read needs to be executed). Figure 10 shows our running example. In this case, when reading from table Albums, if the row is visible, it is necessary to check if the parent Artist is also visible.

We note that in this case, besides the visibility flags, we also maintain a version identifier for the parent rows. This is necessary to guarantee that if the element in the parent row is reinserted (after being deleted), a deleted child row is not visible again – in our example, album A1 is only visible if the parent version 1 (pvr) of artist Sam is visible.
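The read-time check can be sketched as below. The data layout is an assumption made for illustration: each parent is modelled as a map from version number to visibility flags, and each child row records the parent version it was inserted against.

```python
def row_visible(flags: set) -> bool:
    """Delete-wins: a row is visible only if no D flag is present."""
    return "D" not in flags


def child_visible(child: dict, parents: dict) -> bool:
    """A child is visible only if the parent version it references is visible.

    `child` has keys "flags", "parent" and "parent_version" (hypothetical
    layout); `parents` maps parent key -> {version: flags}.
    """
    if not row_visible(child["flags"]):
        return False
    versions = parents.get(child["parent"], {})
    flags = versions.get(child["parent_version"])
    return flags is not None and row_visible(flags)


# Sam (version 1) was deleted concurrently with the insertion of album A1.
parents = {"Sam": {1: {"D"}}}
a1 = {"flags": {"I"}, "parent": "Sam", "parent_version": 1}
hidden_after_delete = child_visible(a1, parents)

# Sam is later reinserted as version 2; A1 still refers to version 1.
parents["Sam"][2] = {"I"}
a2 = {"flags": {"I"}, "parent": "Sam", "parent_version": 2}
```

Under these assumptions, A1 stays hidden after Sam is reinserted, because it refers to the deleted version, while an album inserted against version 2 is visible.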

5 Related work

Geo-replication has become a key feature in cloud storage systems, with data being replicated in multiple data centers spread around the world. The goal of geo-replication is to provide high availability and low latency, by allowing clients to access any nearby replica. To achieve these properties, a number of systems [11, 20] adopt a weak consistency model, where an update can execute in any replica, being propagated asynchronously to other replicas.

Writing correct applications under weak consistency can be complex. To address this problem, several geo-replicated storage systems [10, 28, 2] adopt a strong consistency approach. While several optimization techniques have been proposed for improving throughput [10] and latency [22], executing operations involves inter-data-center coordination, with impact on latency and availability.

Our work is closer to systems [18, 26] that provide support for both weak and strong consistency. For helping programmers decide which operation should execute under each consistency model, several tools have been proposed [18, 17, 5, 13, 24]. These tools, typically based on static analyses, impose an additional complexity to application development that is often non-trivial. In our approach, the programmer specifies the degree of concurrency allowed and which database constraints should be maintained – the system enforces the specified concurrency while trying to minimize coordination. Some systems, such as Oracle multi-master replication, allow programmers to specify how to handle conflicting updates. Our approach is more complete, by addressing a wider range of database constraints, which are key for enforcing application correctness.

Many authors have proposed relaxing application consistency and tolerating temporary inconsistencies in order to provide good performance at planetary scale [12, 8, 14]. We follow the same principle; however, we only allow programmers to specify concurrency semantics when operations can be merged without affecting the integrity of the database. To implement the proposed concurrency semantics, we use CRDT [25] data types. These data types allow merging concurrent operations without loss of updates, which is key to implementing some of the conflict resolutions that we propose.

AntidoteDB [4] is the backing store for AQL. AntidoteDB provides a key-object interface with support for CRDTs and escrow data types that we used to implement the SQL semantics. AntidoteDB ensures Parallel Snapshot Isolation. A number of systems provide equivalent semantics [27]. AQL extends parallel snapshot isolation [27] with integrity invariants, in a similar way as snapshot isolation has been extended with integrity invariants [19]. Our approach for enforcing referential integrity can be seen as a runtime version of our previous work, IPA [7], where, following a static analysis process, application operations were modified in a way that guarantees that invariants are preserved when executed under weak consistency. In this work, we apply a similar idea at runtime to SQL code. Our approach can also be seen as an extension of the approach to enforce serializability under snapshot isolation proposed by Cahill et al. [9], by executing additional updates to force concurrency detection, and using conflict resolution policies to achieve the intended behavior.

6 Conclusion

Programmers enjoy SQL’s expressive data description and data access capabilities and its consistency model. Antidote SQL allows programmers to specify when and how to relax SQL consistency, while keeping the declarative data model and enforcing database constraints, including primary key, check and foreign key constraints. Antidote SQL emphasizes the need for a well-structured database schema that includes database constraints, with AQL providing customizable concurrency semantics. With Antidote SQL, we expect relaxing consistency to become less complex than with other highly available, geo-replicated key-value stores, resulting in safer programs.

Antidote SQL is open source [1] and, besides the mechanisms described in this paper, includes an indexing mechanism for primary and secondary keys. The values of the index are kept consistent with the data in the database, even in the presence of concurrent updates, which are resolved using the conflict-resolution policies defined for each table and its columns. The preliminary evaluation of the system [21] shows that the overhead of the mechanism that enforces foreign keys using an optimistic approach is negligible for insert and update operations, but not for delete operations. For the delete-wins policy, there is also overhead related to the execution of select operations.

Acknowledgments

We thank the anonymous reviewers for their comments, which helped improve the paper. This work was partially supported by EU H2020 LightKone project (732505), and FCT/MCTES grants SFRH/BD/87540/2012, UID/CEC/04516/2013, UID/CEC/50021/2013, Lisboa-01-0145-FEDER-032662/PTDC/CCI-INF/32662/2017, and PTDC/CCI-INF/32081/2017.

References

  • [1] AQL source code. https://github.com/AntidoteDB/antidote_aql.
  • [2] Microsoft CosmosDB. https://docs.microsoft.com/en-us/azure/cosmos-db/. Accessed Aug-2018.
  • [3] VoltDB. https://docs.voltdb.com/, 2018. Accessed Aug-2018.
  • [4] D. D. Akkoorath, A. Z. Tomsic, M. Bravo, Z. Li, T. Crain, A. Bieniusa, N. Preguiça, and M. Shapiro. Cure: Strong semantics meets high availability and low latency. In Proc. of the 36th IEEE International Conference on Distributed Computing Systems (ICDCS 2016), Nara, Japan, June 2016.
  • [5] V. Balegas, S. Duarte, C. Ferreira, R. Rodrigues, N. Preguiça, M. Najafzadeh, and M. Shapiro. Putting consistency back into eventual consistency. In Proceedings of the Tenth European Conference on Computer Systems, EuroSys ’15, pages 6:1–6:16, New York, NY, USA, 2015. ACM.
  • [6] V. Balegas, D. Serra, S. Duarte, C. Ferreira, M. Shapiro, R. Rodrigues, and N. Preguiça. Extending eventually consistent cloud databases for enforcing numeric invariants. In Reliable Distributed Systems (SRDS), 2015 IEEE 34th Symposium on, pages 31–36, Sept 2015.
  • [7] V. Balegas, S. Duarte, C. Ferreira, R. Rodrigues, and N. Preguiça. IPA: Invariant-preserving applications for weakly consistent replicated databases. PVLDB, 12(4):404–418, Dec. 2018.
  • [8] E. Brewer. CAP twelve years later: How the “rules” have changed. Computer, 45(2):23–29, Feb. 2012.
  • [9] M. J. Cahill, U. Röhm, and A. D. Fekete. Serializable isolation for snapshot databases. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD ’08, pages 729–738, New York, NY, USA, 2008. ACM.
  • [10] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google’s Globally-distributed Database. In Proc. 10th USENIX Conf. on Operating Systems Design and Implementation, OSDI’12, pages 251–264, Berkeley, CA, USA, 2012. USENIX Association.
  • [11] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon’s Highly Available Key-value Store. In Proc. 21st ACM SIGOPS Symp. on Operating Systems Principles, SOSP ’07, pages 205–220, New York, NY, USA, 2007. ACM.
  • [12] H. Garcia-Molina and K. Salem. Sagas. In Proceedings of the 1987 ACM SIGMOD International Conference on Management of Data, SIGMOD ’87, pages 249–259, New York, NY, USA, 1987. ACM.
  • [13] A. Gotsman, H. Yang, C. Ferreira, M. Najafzadeh, and M. Shapiro. ’Cause I’m Strong Enough: Reasoning About Consistency Choices in Distributed Systems. SIGPLAN Not., 51(1):371–384, Jan. 2016.
  • [14] P. Helland and D. Campbell. Building on quicksand. In CIDR 2009, Fourth Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 4-7, 2009, Online Proceedings, 2009.
  • [15] A. Lakshman and P. Malik. Cassandra: A Decentralized Structured Storage System. SIGOPS Oper. Syst. Rev., 44(2):35–40, Apr. 2010.
  • [16] L. Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133–169, May 1998.
  • [17] C. Li, J. Leitão, A. Clement, N. Preguiça, R. Rodrigues, and V. Vafeiadis. Automating the choice of consistency levels in replicated systems. In 2014 USENIX Annual Technical Conference (USENIX ATC 14), pages 281–292, Philadelphia, PA, June 2014. USENIX Association.
  • [18] C. Li, D. Porto, A. Clement, J. Gehrke, N. Preguiça, and R. Rodrigues. Making Geo-replicated Systems Fast As Possible, Consistent when Necessary. In Proc. 10th USENIX Conf. on Operating Systems Design and Implementation, OSDI’12, pages 265–278, Berkeley, CA, USA, 2012. USENIX Association.
  • [19] Y. Lin, B. Kemme, R. Jiménez-Peris, M. Patiño Martínez, and J. E. Armendáriz-Iñigo. Snapshot Isolation and Integrity Constraints in Replicated Databases. ACM Trans. Database Syst., 34(2):11:1–11:49, July 2009.
  • [20] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Don’t Settle for Eventual: Scalable Causal Consistency for Wide-area Storage with COPS. In Proc. 23rd ACM Symp. on Operating Systems Principles, SOSP ’11, pages 401–416, New York, NY, USA, 2011. ACM.
  • [21] P. Lopes. Antidote SQL: SQL for Weakly Consistent Databases. Master’s thesis, FCT, Universidade NOVA de Lisboa, Nov. 2018.
  • [22] F. Nawab, D. Agrawal, and A. El Abbadi. DPaxos: Managing data closer to users for low-latency and mobile applications. In Proceedings of the 2018 International Conference on Management of Data, SIGMOD ’18, pages 1221–1236, New York, NY, USA, 2018. ACM.
  • [23] P. E. O’Neil. The escrow transactional method. ACM Trans. Database Syst., 11(4):405–430, Dec. 1986.
  • [24] S. Roy, L. Kot, G. Bender, B. Ding, H. Hojjat, C. Koch, N. Foster, and J. Gehrke. The Homeostasis Protocol: Avoiding transaction coordination through program analysis. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31 - June 4, 2015, pages 1311–1326, 2015.
  • [25] M. Shapiro, N. Preguiça, C. Baquero, and M. Zawirski. Conflict-free Replicated Data Types. In Proc. 13th Int. Conf. on Stabilization, Safety, and Security of Distributed Systems, SSS’11, pages 386–400, Berlin, Heidelberg, 2011. Springer-Verlag.
  • [26] S. Sivasubramanian. Amazon DynamoDB: A seamlessly scalable non-relational database service. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, SIGMOD ’12, pages 729–730, New York, NY, USA, 2012. ACM.
  • [27] Y. Sovran, R. Power, M. K. Aguilera, and J. Li. Transactional Storage for Geo-replicated Systems. In Proc. 23rd ACM Symp. on Operating Systems Principles, SOSP ’11, pages 385–400, New York, NY, USA, 2011. ACM.
  • [28] A. Verbitski, A. Gupta, D. Saha, M. Brahmadesam, K. Gupta, R. Mittal, S. Krishnamurthy, S. Maurice, T. Kharatishvili, and X. Bao. Amazon Aurora: Design considerations for high throughput cloud-native relational databases. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD ’17, pages 1041–1052, New York, NY, USA, 2017. ACM.