CheckSync: Using Runtime-Integrated Checkpoints to Achieve High Availability
CheckSync provides applications with high availability via runtime-integrated checkpointing. This allows CheckSync to take checkpoints of a process running in a memory-managed language (Go, for now), which can be resumed on another machine after a failure. CheckSync uses the runtime to checkpoint only the process's live memory, doing so without requiring significant changes to applications. For the applications it supports, CheckSync maintains the ease of use provided by virtual machines without requiring that an entire virtual machine image be snapshotted. Because CheckSync captures only the memory used by an application, it produces checkpoints that are an order of magnitude smaller than virtual machine snapshots when the application's memory footprint is small relative to the state of the rest of the operating system. Additionally, when running go-cache, a popular in-memory key/value store, CheckSync reduces throughput by only 12%, compared with the [...] throughput loss when using go-cache's snapshot functionality, the 45% when using CRIU, and the 68% [...]