FixIO Documentation

import "io/atomic_file";

AtomicFile class

AtomicFile provides atomic updates of files using transactions. The class is designed as a layer on top of the regular File and is intended for building custom file formats. The only requirement is that 32 bytes are reserved somewhere in the file format header.

Concurrent access is supported: either multiple readers use the file at once (read-only mode), or a single writer does. This is achieved using file locks and works between different processes and (if the OS implementation is not faulty) over network file systems too.

The maximum file size depends on the page size: it is the page size multiplied by 2³². With the default page size of 4KB the limit is 16TB (4KB * 2³²).
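The relationship between the page size and the size limit can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (not FixScript). The helper name max_file_size is hypothetical and is not part of FixIO. It assumes the limit is exactly the page size times 2³², and that the page size must be a power of two, as described for the open function.

```python
def max_file_size(page_size: int = 4096) -> int:
    """Return the file size limit in bytes for a given page size.

    Assumption: the limit is page_size * 2**32, as stated in the text.
    This helper is illustrative only and is not a FixIO function.
    """
    # A positive power of two has exactly one bit set.
    if page_size <= 0 or page_size & (page_size - 1) != 0:
        raise ValueError("page size must be a power of two")
    return page_size * 2**32
```

With the default 4096-byte page this gives 2⁴⁴ bytes (16TB). A 512-byte page would lower the limit to 2⁴¹ bytes (2TB), which is worth keeping in mind when choosing a smaller page size.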

Using the transaction statement is recommended when working with transactions.

Resiliency against failures

The implementation assumes that the file system (along with the underlying layers and the hardware) does not put garbage data into the file after synchronization, and that synchronization to the disk is guaranteed (the data is actually written to the disk before success is returned). The implementation therefore guards mainly against partial writes (at byte granularity, in any order and in any portions). The format uses a CRC-32 checksum to detect corruption in the rollback.
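To illustrate how a checksum exposes a partial write, here is a minimal sketch in Python (not FixScript). The record layout (a 4-byte CRC-32 followed by the page content) and the names make_record, is_intact and torn_write are assumptions made for this example; the actual on-disk rollback format is not described here. The simulation also only models a write that stopped partway through, which is simpler than the arbitrary order the implementation guards against.

```python
import zlib


def make_record(page: bytes) -> bytes:
    """Prefix a page with the CRC-32 of its content (hypothetical layout)."""
    return zlib.crc32(page).to_bytes(4, "little") + page


def is_intact(record: bytes) -> bool:
    """Return True when the stored checksum matches the stored content."""
    stored = int.from_bytes(record[:4], "little")
    return zlib.crc32(record[4:]) == stored


def torn_write(old: bytes, new: bytes, written: int) -> bytes:
    """Simulate a crash after only the first `written` bytes of `new` landed.

    The remaining bytes keep whatever was in the file before (`old`).
    """
    return new[:written] + old[written:]
```

A record that was only partly written keeps some of the previous bytes, so its content no longer matches the checksum and a recovery routine can treat it as invalid.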

Rollback

The rollback is appended at the end of the file (with a hole between the original file size and the rollback to allow growth) and contains the original content of pages. If the file grows into the rollback or beyond, the rollback is atomically moved to the new end of the file, with a larger hole between the new file size and the new rollback to allow more growth without moving the rollback again.

When there is a crash during the transaction, the rollback is applied and the modified data is overwritten with its original content. After the rollback is applied, the file is shrunk to remove the rollback. A crash while the rollback is being applied is not a problem: the process simply repeats with the same result.

The rollback may contain duplicate pages as a result of using nested transactions. However, only the first occurrence of each page is restored on recovery. The original content of each page is first written and then committed into the rollback. When multiple pages are written at once, all pages except the first are pre-committed, because the recovery stops at any uncommitted or partial page (e.g. when the file was shrunk to remove the rollback but before the rollback was fully deactivated).
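The recovery rules described above (restore only the first occurrence of each page, stop at the first uncommitted or partial entry) can be modelled in a few lines. This is a minimal in-memory sketch in Python (not FixScript). The RollbackEntry type, the recover function and the dictionary of pages are hypothetical names for illustration and do not reflect the real on-disk format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RollbackEntry:
    page_index: int   # which page this entry preserves
    original: bytes   # content of the page before it was modified
    committed: bool   # False models an uncommitted or partial entry


def recover(pages: Dict[int, bytes], rollback: List[RollbackEntry]) -> None:
    """Restore original page contents from the rollback, in place."""
    restored = set()
    for entry in rollback:
        if not entry.committed:
            # Recovery stops at the first uncommitted or partial entry.
            break
        if entry.page_index in restored:
            # A later duplicate holds an intermediate state; skip it.
            continue
        pages[entry.page_index] = entry.original
        restored.add(entry.page_index)
```

Because the function only overwrites pages with fixed content from the rollback, running it again after an interruption produces the same result, which mirrors the repeatability described above.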

Constants

ATOMIC_FILE_HEADER_SIZE = 32: This is the size of the required header and is always 32 bytes.

Initialization

static function open(file: File, header_offset: Integer): AtomicFile
static function open(file: File, header_offset: Integer, page_size: Integer): AtomicFile: Creates a new instance of the atomic file on top of a given physical file. The header offset must always be the same for the same file. The page size must be a power of two (the default value is 4096). The file is brought into a clean state (the rollback is applied if present, and the header is written if the file is empty). The page size can be changed to a different value (for example a smaller page when a lot of small changes are made, or a bigger page for big continuous overwrites). Beware of the maximum file size when using smaller page sizes.
function close(): Closes the file and rolls back any uncommitted transactions.
function register_cache_handler(flush_func, discard_func, data): Registers a cache handler. The flush function is called whenever unwritten data needs to be written (before entering a nested transaction and before the outermost commit). The discard function is called whenever both the cached and the unwritten data (if it wasn't flushed) need to be discarded (after the outermost transaction and before rollback for write transactions). Both functions take a single common parameter (data). The handlers are called in the reverse order of their registration, which allows the cached data to be updated from higher to lower abstraction layers.
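The calling order of cache handlers can be sketched as follows. This is a small Python model (not FixScript) of the ordering rule only. The class CacheHandlers and its flush_all and discard_all methods are hypothetical names; when the real implementation invokes the handlers is described above and is not modelled here.

```python
class CacheHandlers:
    """Model of a handler registry that calls handlers in reverse order."""

    def __init__(self):
        # Each item is (flush_func, discard_func, data), in registration order.
        self._handlers = []

    def register_cache_handler(self, flush_func, discard_func, data):
        self._handlers.append((flush_func, discard_func, data))

    def flush_all(self):
        # The most recently registered (higher layer) handler runs first.
        for flush_func, _discard_func, data in reversed(self._handlers):
            flush_func(data)

    def discard_all(self):
        for _flush_func, discard_func, data in reversed(self._handlers):
            discard_func(data)
```

In this model a low-level layer would register first and a layer built on top of it would register later, so the higher layer flushes into the lower layer's cache before the lower layer writes to the file.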

Transaction functions

function begin()
function begin(write: Boolean)
function begin(write: Boolean, timeout: Integer): Boolean: Begins a new transaction. You can specify whether it is a read or a write transaction (the default is write). Transactions can be nested, but a read transaction cannot be upgraded to a write transaction. If you need this in order to preprocess some data while still allowing concurrent read access, do the preprocessing in a read transaction, commit it, and begin a new write transaction. Then verify that the file was not changed in an incompatible way (if it was, reprocess the data) and write the results.
function commit(): Commits the transaction, making the changes permanent. It must be called even for read transactions.
function rollback(): Rolls back the transaction, discarding any changes made in it.
function in_transaction(): Boolean: Returns true when there is an active transaction.
function in_write_transaction(): Boolean: Returns true when there is an active write transaction.
function set_on_rollback(container, key, value): Restores the original value in a container (array or hash) at a given key (or index) on rollback. For an array, the key can be -1 to set the length of the array. For hash tables, the entry is removed if the value is null. This function is provided so that changes to in-memory structures related to the transaction can be reversed.
function call_on_rollback(func, data, param): In case the set_on_rollback function is insufficient, you can register a call of a two-parameter function on rollback (data is the first argument and param is the second).
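The semantics of set_on_rollback and call_on_rollback can be modelled as an undo log. The sketch below is written in Python (not FixScript), with lists standing in for arrays, dictionaries for hashes and None for null. The class name RollbackLog is hypothetical. Two details are assumptions that the text above does not specify: the registered actions are applied newest first, and an array that has to grow back to its original length is padded with zeros.

```python
class RollbackLog:
    """In-memory model of actions registered to run on rollback."""

    def __init__(self):
        self._actions = []

    def set_on_rollback(self, container, key, value):
        self._actions.append(("set", container, key, value))

    def call_on_rollback(self, func, data, param):
        self._actions.append(("call", func, data, param))

    def commit(self):
        # The changes are kept, so the registered actions are dropped.
        self._actions.clear()

    def rollback(self):
        # Assumption: actions are undone newest first.
        while self._actions:
            kind, a, b, c = self._actions.pop()
            if kind == "call":
                a(b, c)              # func(data, param)
            elif isinstance(a, list):
                if b == -1:
                    del a[c:]        # key -1 restores the array length
                    a.extend([0] * (c - len(a)))  # assumed zero padding
                else:
                    a[b] = c
            elif c is None:
                a.pop(b, None)       # null removes the hash entry
            else:
                a[b] = c
```

A typical use in this model is to record the old value just before each in-memory change, for example registering the current array length with key -1 before appending an element.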

Data access functions

function read(offset: Long, buf: Byte[])
function read(offset: Long, buf: Byte[], off: Integer, len: Integer): Reads a portion of the file at a given offset. Reading past the file length is an error.
function write(offset: Long, buf: Byte[])
function write(offset: Long, buf: Byte[], off: Integer, len: Integer): Writes a buffer (or its portion) to the file at a given offset. Writing past the file length will enlarge the file (a zero-length write will trigger this too).
function get_length(): Long: Returns the (logical) length of the file.
function set_length(new_size: Long): Sets a new length of the file. The file cannot be made smaller than the header.
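The length rules of the data access functions can be summarized with a small in-memory model. This is a sketch in Python (not FixScript) and the class LogicalFile is a hypothetical name. It assumes the 32-byte header sits at offset 0 and that any gap created by growing the file is filled with zeros; neither detail is stated above.

```python
class LogicalFile:
    """In-memory model of the read/write/length rules (not the real class)."""

    def __init__(self, header_size=32):
        self._header_size = header_size
        self._data = bytearray(header_size)  # assumed: header at offset 0

    def get_length(self):
        return len(self._data)

    def set_length(self, new_size):
        if new_size < self._header_size:
            raise ValueError("file cannot be smaller than the header")
        del self._data[new_size:]
        self._data.extend(bytes(new_size - len(self._data)))

    def read(self, offset, buf, off=0, length=None):
        if length is None:
            length = len(buf) - off
        if offset < 0 or offset + length > len(self._data):
            raise ValueError("read past the file length")
        buf[off:off + length] = self._data[offset:offset + length]

    def write(self, offset, buf, off=0, length=None):
        if length is None:
            length = len(buf) - off
        end = offset + length
        if end > len(self._data):
            # Writing past the end enlarges the file, even when length is 0.
            self._data.extend(bytes(end - len(self._data)))
        self._data[offset:end] = buf[off:off + length]
```

In this model a zero-length write at an offset beyond the current length acts as a way to grow the file, matching the note on write above.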