
Atomic storage API #121

Merged · 25 commits · Nov 14, 2023
Conversation


@kevin-harrison kevin-harrison commented Oct 29, 2023

Please make sure these boxes are checked, before submitting a new PR.

  • You ran the local CI checker with ./check.sh with no errors
  • You reference which issue is being closed in the PR text (if applicable)
  • You updated the OmniPaxos book (if applicable)

Issues

Fix #97, Fix #49

Breaking Changes

  • The Storage trait has been reworked to facilitate atomic writes and allow for arbitrary crash-recovery. The indices passed to these functions now refer to indices into the entry log as if no trimming or snapshotting had happened.
  • PersistentStorage and its config structs have been reworked to use RocksDB and implement the new Storage trait. The dependence on commitlog is therefore gone, addressing issue #49 (Implement better trim functionality).
  • All indices are now of type usize instead of u64. This includes the indices used in the Storage trait, Message structs, and Omnipaxos public functions.
  • Both Storage::get_promise implementations in the omnipaxos_storage module now return None instead of the default Ballot when there is no promise ballot saved in storage.
  • The batch_accept feature flag has been removed and is now the default behavior.
  • The Promise, AcceptSync, and AcceptDecide message struct fields have been changed.
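
To illustrate the guarantee the reworked trait is meant to provide, here is a minimal, self-contained sketch (all names here are illustrative, not the actual OmniPaxos trait): every state change passed to a single write call is applied together or not at all, which is what makes arbitrary crash-recovery safe.

```rust
// Hypothetical sketch of an "apply all ops or none" storage write.
// A persistent backend (e.g. RocksDB) would use a write batch; in memory
// we stage the changes and only commit them if every op validates.
#[derive(Clone, Debug)]
enum StorageOp {
    AppendEntry(String),
    SetDecidedIdx(usize),
    SetAcceptedRound(u32),
}

#[derive(Default)]
struct MemStorage {
    log: Vec<String>,
    decided_idx: usize,
    accepted_round: u32,
}

impl MemStorage {
    fn write_atomically(&mut self, ops: Vec<StorageOp>) -> Result<(), String> {
        // Stage all changes first...
        let mut log = self.log.clone();
        let (mut decided_idx, mut accepted_round) = (self.decided_idx, self.accepted_round);
        for op in ops {
            match op {
                StorageOp::AppendEntry(e) => log.push(e),
                StorageOp::SetDecidedIdx(i) if i <= log.len() => decided_idx = i,
                StorageOp::SetDecidedIdx(_) => return Err("decided_idx beyond log".into()),
                StorageOp::SetAcceptedRound(r) => accepted_round = r,
            }
        }
        // ...then commit them as one unit.
        self.log = log;
        self.decided_idx = decided_idx;
        self.accepted_round = accepted_round;
        Ok(())
    }
}
```

If any op fails, none of the staged changes reach storage, so a crash (or an error) can never leave behind a half-applied batch.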

Other Changes

  • MemoryStorage now implements the new Storage trait but its implementation of write_atomically does not guarantee atomicity (acceptable since memory can't persist state through crash-recovery).
  • Batched entries are now correctly flushed when reconfiguring and when deciding entries.
  • The syncing behavior in the prepare phase is now consistent between leaders and followers.


@haraldng haraldng left a comment


Just some initial comments. I'll do a more detailed review after these comments are fixed.

For next time, please don't do refactorings before the actual feature is approved. E.g., here the atomic storage changes wouldn't have been bloated with the (trivial) replacement of u64 with usize.

@haraldng
Copy link
Owner

haraldng commented Nov 2, 2023

After our discussion yesterday, I looked at the current code to see how we can keep as much as possible of the current batching while introducing transactions for atomic storage. I think what we need is a struct InternalStorageTxn for InternalStorage. This enables a clean transaction-object API in the Sequence Paxos code. When we handle the InternalStorageTxn in InternalStorage, we will know which entries to actually flush, and based on that we will create the real transaction for storage. So something like this:

// transaction object for InternalStorage
struct InternalStorageTxn {
    state_ops: Vec<StorageOp>,    // StorageOp is an enum for modifying accepted_round, decided_idx, etc.
    append_op: AppendOp,          // AppendOp is an enum for append, append_on_prefix, etc.
    snapshot: Option<(usize, S)>, // snapshot index and snapshot
}

// handle_acceptsync() in SequencePaxos
let state_ops = vec![StorageOp::AcceptedRound(4), StorageOp::DecidedIdx(10)];
let append_op = AppendOp::AppendOnPrefix(accsync.sync_idx, accsync.suffix);
let internal_txn = InternalStorageTxn { state_ops, append_op, snapshot: None };
self.internal_storage.do_txn(internal_txn);

// do_txn(it: InternalStorageTxn) in InternalStorage
let append_res = match it.append_op {
    AppendOp::Append(entries, true) => self.state_cache.append_entries(entries),
    AppendOp::Append(entries, false) => self.state_cache.append_entries_without_batching(entries),
    // ... AppendOp::AppendOnPrefix etc.
};
let mut txn = StorageTxn::new(); // transaction object for Storage
txn.insert(it.state_ops);
if let Some(entries_to_be_flushed) = append_res {
    txn.insert(StorageOp::Append(entries_to_be_flushed));
}
if let Some((idx, snapshot)) = it.snapshot {
    txn.insert(StorageOp::Snapshot(idx, snapshot));
}
self.state_cache.real_log_len = self.storage.flush(txn)?;
Ok(self.get_accepted_idx())

@kevin-harrison kevin-harrison marked this pull request as ready for review November 7, 2023 13:07
@kevin-harrison kevin-harrison changed the title Atomic storage API (WIP) Atomic storage API Nov 7, 2023

@haraldng haraldng left a comment


Good job! The design looks really good and seems to have made things a lot cleaner. The comments are mostly related to the code structure. Please fix them all or comment on places you have other ideas.

Comment on lines 19 to 20
// Flush any pending writes
self.internal_storage.flush_batch().expect(WRITE_ERROR_MSG);
@haraldng (Owner):
There is nothing wrong with this, but it could be optimized.

If we in general only send Accepted after flushing, then this is actually not strictly necessary (any chosen entry must be flushed on a majority, and if we haven't flushed it, we will get it in the AcceptSync).

To prevent flushing entries that will get overwritten anyway, we should perhaps do some checks here first. The most basic check is: if the leader is more updated than us, we don't flush. There are probably more sophisticated checks we could do; for now I think we should do this basic check and leave other optimizations for the future.
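
The basic check suggested above could look something like this (a sketch with hypothetical names; in the real code the leader's accepted index would come from its sync message):

```rust
// Hedged sketch of the "don't flush if the leader is more updated" check.
// If the leader's log is already at least as long as ours, our pending batch
// will be overwritten by the AcceptSync anyway, so flushing it buys nothing.
fn should_flush_pending(my_accepted_idx: usize, leader_accepted_idx: usize) -> bool {
    my_accepted_idx > leader_accepted_idx
}
```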

@haraldng (Owner):

On a second thought, let's leave this as it is and we can create an issue and fix it in the future. So we don't add more complexity to this PR which already has many changes.


@haraldng haraldng left a comment


Mostly just add more comments/docs. Can now go ahead and refactor the storage file.


@haraldng haraldng left a comment


Overwrite instead of delete where possible in PersistentStorage.

@haraldng haraldng merged commit 45888af into haraldng:master Nov 14, 2023
6 checks passed
@haraldng haraldng mentioned this pull request Nov 14, 2023
@soruh

soruh commented Dec 24, 2023

All indices are now of type usize instead of u64. This includes the indices used in the Storage trait, Message structs, and Omnipaxos public functions.

I couldn't find the reason for this anywhere. It seems unfortunate to effectively constrain omnipaxos to 64-bit systems.
Using 32-bit ids and assuming 150,000 ops/s (taken from the EuroSys '23 paper quoted in the README) would give approximately 8 hours until the 32-bit id space is exhausted...
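
The arithmetic behind that estimate can be checked directly (a back-of-the-envelope sketch, not OmniPaxos code):

```rust
// 2^index_bits possible log indices, consumed at ops_per_sec appends.
fn hours_until_exhausted(index_bits: u32, ops_per_sec: u64) -> f64 {
    (2f64).powi(index_bits as i32) / ops_per_sec as f64 / 3600.0
}
// hours_until_exhausted(32, 150_000) comes out to roughly 7.95 hours,
// matching the estimate above; with 64-bit indices the same rate would
// last millions of years.
```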

Am I missing something obvious / what is the reason for this change?

@haraldng

The change is not performance-related but rather for API reasons: we wanted reading from the omnipaxos log to be similar to the Rust Vec API.

Since it is now usize, the user should be able to use it with any compatible system.

@soruh
Copy link

soruh commented Dec 27, 2023

That makes sense ergonomics-wise. My concern would be that a system with a 32-bit architecture and one with a 64-bit architecture could not be part of the same cluster. And, for that matter, any 32-bit system could exhaust the 32-bit log index in a reasonable time.

Development

Successfully merging this pull request may close these issues.

  • Proposal: support truly atomic storage implementations
  • Implement better trim functionality
3 participants