Skip to content
/ SOE Public

NoSQL client-server with transactions, iterators, vector operations, asynchronous future promise APIs and many more

Notifications You must be signed in to change notification settings

janlt/SOE

Repository files navigation

SOE

C++ API Overview

Soe is a distributed and persistent key-value store. Soe is also a NoSQL database providing a mechanism for storage and retrieval of data, which is modeled in terms other than relational schema.

Soe provides infrastructure to consolidate and unify various heterogeneous KV stores, RDBMs, memory-only databases, etc., under a distributed umbrella with a unified set of APIs.

Building and installing

The prerequisites for Fedora 28+ before building SOE are the following.

 boost* 1.66+
 json-c 0.13+
 json-glib 1.4+
 jsoncpp 1.8+
 rocksdb 5.7+

Then build it

 make
 make install

ultimately the install step can be skipped and the environment can be set up instead:

 . envinit.sh

SOE server can be installed as a service using soemetadbsrv/files/usr/lib/systemd/system/soemetadbsrv.service as a template.

Steps to run some tests:

 Start up soemetadbsrv from the build folder:
 
 soemetadbsrv/bin/soemetadbsrv -u <YOUR_USER_NAME>
 or
 sudo soemetadbsrv/bin/soemetadbsrv -u <ANOTHER_USER_NAME>
   
 Create a store (database) using soe_test utility.

 From the build folder run:
 
 soeapi/c_test/bin/soe_test -x <YOUR_USER_NAME> -o <CLUSTER> -z <SPACE> -c <STORE> -A -m 2
   
 then insert a bunch of records in it:
 soeapi/c_test/bin/soe_test -x <YOUR_USER_NAME> -o <CLUSTER> -z <SPACE> -c <STORE> -C -n 100 -N 1000 -k "MY_KEY_" -m 2
   
 finally query it:
 soeapi/c_test/bin/soe_test -x <YOUR_USER_NAME> -o <CLUSTER> -z <SPACE> -c <STORE> -E -k "" -e "" -m 2
 
 The integrity_check utility can be used to exercise some asynchronous APIs:
 
 soeapi/soeintegrity_test/bin/soe_integrity_test -x <YOUR_USER_NAME> -o GG -z GG -c YY -X 8 -n 100 -m 2
 
 The above command inserted 100 records with predefined key in one vector async call. The results can be 
 verified using soe_test:
 
 soeapi/c_test/bin/soe_test -x <YOUR_USER_NAME> -o GG -z GG -c YY -E -k "" -e "" -m 2</p>

Asynchronous APIs

Asynchronous APIs rely on C++ Future/Promise class libraries comprising a class hierarchy. Soe futures can be created and destroyed only using session API. Sessions provide context for future's invocation and handling. Future's constructors and destructors are protected thus preventing users from creating futures outside of sessions.

There are different future classes, depending on the requested operation. Single i/o requests, i.e. GetAsync/PutAsync/DeleteAsync/MergeAsync have corresponding future types as follows: GetAsync -> GetFuture PutAsync -> PutFuture DeletetAsync -> DeleteFuture MergeAsync -> MergeFuture

For vectored requests, i.e. GetSetAsync/PutSetAsync/DeleteSetAsync/MergeSetAsync the return future types are the following: GetSetAsync -> GetSetFuture PutSetAsync -> PutSetFuture DeletetSetAsync -> DeleteSetFuture MergeSetAsync -> MergeSetFuture

Once a future is created via session's async API, it can be used later on to obtain the status and the result, i.e. key(s) and value(s). Future's API provides Get() method to do that. Get() method will synchronize future with the return status and result, so potentially it's a blocking call. Get() may block the caller if a future has not yet received its result. On vectored requests, Get() will synchronize its future with the results of all the elements of the vector. For example, if the input vector for PutSetAsync() contains 100 elements, the PutSetFuture will be synchronized with the statuses of all individual Put requests, i.e. Get() will return when all of the elements of the input vector have been written and their statuses communicated back to the future object. Like in synchronous vectored requests, where an element of the input vector may contain a duplicate or non-existing key, a vectored future will become available upon encountering the first eDuplicateKey or eNotFoundKey, provided the boolean flag fail_on_first is set to true.

Class API Overview

Class name

Description

Session

Gives access to {"Cluster", "Space", "Store", "Provider"} space. Users can open, create, destroy and manipulate the contents of KV stores through Session API.

Group

Allows grouping of multiple operations to the same store in one batch, i.e. a Group object holds a sequence of edits to be made to a store. Invocation of Write method will write the contents to a store.

Group s primarily used to speed up bulk updates.

Transaction

Transactions are used to write multiple items in a transactional fashion. That ensures not only atomicity and isolation but also resolves potential conflicts that may occur when two or more transactions try to update the same key(s).

Duple

Duple is a wrapper for {Pointer, Size} tuple. Duple's storage pointed to by "Pointer" is owned by the caller, so the caller is responsible for managing it.

DisjointSubset

Subsets are used to create lightweight groups within a store. DisjointSubset is a group of items within a store with unique keys.

SubsetIterator

Subsets can be iterated upon using SubsetIterator. SubsetIterators are created with a traversal direction and support most common iterator methods.

SessionIterator

SessionIterator allows iterating over the contents of the entire store. Typically, when creating a SessionIterator user specifies start and end key.

Futurable

Base abstract class for all of the SOE Future class hierarchies.

SimpleFuture

Base class for one operation Future classes.

PutFuture

Class used to put one key-value pair asynchronously in a store.

GetFuture

Class used to get one key-value pair asynchronously from a store.

MergeFuture

Class used to merge one key-value pair asynchronously in a store..

DeleteFuture

Class used to delete one key-value pair asynchronously from a store.

SetFuture

Base class for vector Future classes, i.e. when a vector<> of key-value pairs is wriiten/read/merged or deleted.

PutSetFuture

Class used to put a vector of key-value pairs asynchronously in a store.

GetSetFuture

Class used to get a vector of key-value pairs asynchronously from a store.

MergeSetFuture

Class used to merge a vector of key-value pairs asynchronously in a store..

DeleteSetFuture

Class used to delete a vector of key-value pairs asynchronously from a store.

BackupEngine

BackupEngine is used to create a backup engine, i.e. an object that can be used to create or destroy a store backup.

SnapshotEngine

SnaphotEngine is used to create a snapshot engine, i.e. an object that can be used to create or destroy a store snapshot.

Test application

Example application "soe_test" is located in: soe/soeapi/c_test

To build test application simply run "make release" in test directory

Test application options:

Usage bin/c_test_soe [options]
  -o, --cluster-name             Soe cluster name
  -z, --space-name               Soe space name
  -c, --store-name               Soe store name
  -l, --transactional            Soe transactional store (default 0 - non-transactional)
  -n, --num-ops                  Number of ops (default = 1)
  -k, --key                      Key (default key_1)
  -e, --end-key                  End key (default key_1)
  -a, --hex-key                  Key specified in hex format, e.g. 0F45E8CD08C5C5C5
  -t, --iterator-dir             Iterator dir (0 - forward(default), 1 - reverse)
  -u, --default-first-iter-arg   Use iterator's First() args defaults
  -v, --value                    Value (default value_1)
  -d  --snap-backup-subset-name  Snapshot, backup or subset name (default snapshot_1, backup_1, subset_1)
  -f  --snap-back-id             Snapshot or backup id (default 0)
  -i, --sync-type                Sync type (0 - default, 1 - fdatasync, 2 - async WAL)
  -m, --provider                 Store provider (0 - RCSDBEM, 1 - KVS, 2 - METADBCLI, 3 - POSTGRES, 4 - DEFAULT provider)
  -j, --debug                    Debug (default 0)
  -g, --write-group              Do writes as group write (batching) instead of individual writes (1 - Put, 2 - Merge)
  -w, --write-transaction        Do writes as transaction (1,2,4,5 - commit, 3 - rollback, 1,3,4,5 - Put, 2 - Merge, 4 - Get, 5 - Delete)
  -s, --delete-from-group        Do delete from group in the middle of batching
  -A, --create-store             Create cluster/space/store
  -B, --create-subset            Create subset
  -C, --write-store              Write store num_ops (default 1) KV pairs
  -D, --write-subset             Write subset num_ops (default 1) KV pairs
  -O, --merge-store              Merge in store num_ops (default 1) KV pairs
  -P, --merge-subset             Merge in subset num_ops (default 1) KV pairs
  -E, --read-store               Read all KV items from store
  -Q  --repair-store             Repair store
  -S  --write-store-async        Write store async num_ops (default 1) KV pairs
  -W  --read-store-async         Read KV items from store async num_ops (defualt 1)
  -Y  --merge-store-async        Merge in store num_ops (default 1)
  -Z  --delete-store-async       Delete KV items async (defualt 1)
  -R  --traverse-name-space      Traverse name space (print all clusters/spaces/stores/subsets that are there)
  -F, --read-subset              Read all KV items from subset
  -G, --read-kv-item-store       Read one KV item from store
  -H, --read-kv-item-subset      Read one KV item from subset
  -I, --delete-kv-item-store     Delete one KV item from store
  -J, --delete-kv-item-subset    Delete one KV item from subset
  -K, --destroy-store            Destroy store
  -L, --destroy-subset           Destroy subset
  -M, --hex-dump                 Do hex dump of read records (1 - hex print, 2 - pipe to stdout)
  -X, --regression               Regression test
                                 1 - create/destroy stores multi-threaded (default)
                                 2 - create/destroy stores/subsets multi-threaded
                                 3 - create/destroy and write/read different name stores/subsets multi-threaded
                                 4 - create/destroy and write/read same name single store/subset multi-threaded
                                 5 - create/destroy and write/read same name single store multi-threaded
                                 6 - performance test
                                 7 - single session multiple threads performance test
                                 8 - single session loop(open/create/write/read/close)
                                 9 - create/write/destroy different subset same store multi-threaded
                                10 - create open/write/read/close same name
  -N, --data-size                Set data size to an arbitrary value (will override -v)
  -T, --sleep-usecs              Sleep usecs between tests
  -y, --regression-loop-cnt      Regression loop count
  -r, --regression-thread-cnt    Regression thread count
  -p, --no-print                 No printing key/value (default print)
  -x, --user-name                Run unser user-name
  -h, --help                     Print this help
Selected option description:

-m (0 - RCSDBEM, 1 - KVS, 2 - METADBCLI, 3 - POSTGRES, 4 - DEFAULT provider)

Specifies store provider for a session. If left out it'll default to value configured in soeprovider.cpp
RCSDBEM - Embedded RocksDB (store access is done directly from the caller's process)
KVS - Optional light-weight embedded key-value store
METADBCLI - Server RocksDB (store access is done by the soemetadbsrv 

-N 
Value size in bytes. 
On the output, only up to first 128 chars of values will be printed. It doesn't mean that the values have been
truncated. -M enables full hex fump. -p turns off printing.

-X [1..6]
These are regression tests used as a quick way to verify the basic functionality.
Example invocations:

1. Create a bunch of records in a subset as follows {"l8", "s8", "c8', "sub"}, where
"l8" - cluster name
"s8" - space name
"c8" - container name
"sub" - subset name

bin/c_test_soe -o l8 -z s8 -c c9 -A bin/c_test_soe -o l8 -z s8 -c c9 -d sub -B
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FIRS 0001" -v "DATA_FIRS_0001"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FIRS 0002" -v "DATA_FIRS_0002"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FIRS 0011" -v "DATA_FIRS_0011"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FIRS 0012" -v "DATA_FIRS_0012"

bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "SECO 0001" -v "DATA_SECO_0001"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "SECO 0002" -v "DATA_SECO_0002"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "SECO 0011" -v "DATA_SECO_0011"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "SECO 0012" -v "DATA_SECO_0012"

bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "THIR 0001" -v "DATA_THIR_0001"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "THIR 0002" -v "DATA_THIR_0002"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "THIR 0011" -v "DATA_THIR_0011"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "THIR 0012" -v "DATA_THIR_0012"

bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FOUR 0001" -v "DATA_FOUR_0001"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FOUR 0002" -v "DATA_FOUR_0002"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FOUR 0011" -v "DATA_FOUR_0011"
bin/c_test_soe -o l8 -z s8 -c c9 -d sub -D -k "FOUR 0012" -v "DATA_FOUR_0012"

2. Get all the key-value pairs by using an unbounded range

bin/c_test_soe -o l8 -z s8 -c c9 -d sub -F -k "" -e ""
key(9,FIRS 0001) value(14,DATA_FIRS_0001)
key(9,FIRS 0002) value(14,DATA_FIRS_0002)
key(9,FIRS 0011) value(14,DATA_FIRS_0011)
key(9,FIRS 0012) value(14,DATA_FIRS_0012)
key(9,FOUR 0001) value(14,DATA_FOUR_0001)
key(9,FOUR 0002) value(14,DATA_FOUR_0002)
key(9,FOUR 0011) value(14,DATA_FOUR_0011)
key(9,FOUR 0012) value(14,DATA_FOUR_0012)
key(9,SECO 0001) value(14,DATA_SECO_0001)
key(9,SECO 0002) value(14,DATA_SECO_0002)
key(9,SECO 0011) value(14,DATA_SECO_0011)
key(9,SECO 0012) value(14,DATA_SECO_0012)
key(9,THIR 0001) value(14,DATA_THIR_0001)
key(9,THIR 0002) value(14,DATA_THIR_0002)
key(9,THIR 0011) value(14,DATA_THIR_0011)
key(9,THIR 0012) value(14,DATA_THIR_0012)

3. The results are in reverse order when using reverse iterator (-t 1)

bin/c_test_soe -o l8 -z s8 -c c9 -d sub -F -k "" -e "" -t 1
key(9,THIR 0012) value(14,DATA_THIR_0012)
key(9,THIR 0011) value(14,DATA_THIR_0011)
key(9,THIR 0002) value(14,DATA_THIR_0002)
key(9,THIR 0001) value(14,DATA_THIR_0001)
key(9,SECO 0012) value(14,DATA_SECO_0012)
key(9,SECO 0011) value(14,DATA_SECO_0011)
key(9,SECO 0002) value(14,DATA_SECO_0002)
key(9,SECO 0001) value(14,DATA_SECO_0001)
key(9,FOUR 0012) value(14,DATA_FOUR_0012)
key(9,FOUR 0011) value(14,DATA_FOUR_0011)
key(9,FOUR 0002) value(14,DATA_FOUR_0002)
key(9,FOUR 0001) value(14,DATA_FOUR_0001)
key(9,FIRS 0012) value(14,DATA_FIRS_0012)
key(9,FIRS 0011) value(14,DATA_FIRS_0011)
key(9,FIRS 0002) value(14,DATA_FIRS_0002)
key(9,FIRS 0001) value(14,DATA_FIRS_0001)

4. Query for specific range

bin/c_test_soe -o l8 -z s8 -c c9 -d sub -F -k "SECO" -e ""
key(9,SECO 0001) value(14,DATA_SECO_0001)
key(9,SECO 0002) value(14,DATA_SECO_0002)
key(9,SECO 0011) value(14,DATA_SECO_0011)
key(9,SECO 0012) value(14,DATA_SECO_0012)
key(9,THIR 0001) value(14,DATA_THIR_0001)
key(9,THIR 0002) value(14,DATA_THIR_0002)
key(9,THIR 0011) value(14,DATA_THIR_0011)
key(9,THIR 0012) value(14,DATA_THIR_0012)

bin/c_test_soe -o l8 -z s8 -c c9 -d sub -F -k "SECO" -e "THIR 0000"
key(9,SECO 0001) value(14,DATA_SECO_0001)
key(9,SECO 0002) value(14,DATA_SECO_0002)
key(9,SECO 0011) value(14,DATA_SECO_0011)
key(9,SECO 0012) value(14,DATA_SECO_0012)



C++ Test application to test data integrity and futures

Example application "integrity_test" is located in: soe/soeapi/integrity_test

To build test application simply run "make release" in integrity_test directory

integrity_test application options:

Usage integrity_test [options]
  -o, --cluster-name             Soe cluster name
  -z, --space-name               Soe space name
  -c, --store-name               Soe store name
  -l, --transactional            Soe transactional store (default 0 - non-transactional)
  -n, --num-ops                  Number of ops (default = 1)
  -W, --num-loops                Number rof loop count in each regression (default =1)
  -k, --key                      Key (default key_1)
  -e, --end-key                  End key (default key_1)
  -t, --iterator-dir             Iterator dir (0 - forward(default), 1 - reverse)
  -u, --default-first-iter-arg   Use iterator's First() args defaults
  -v, --value                    Value (default value_1)
  -d  --snap-backup-subset-name  Snapshot, backup or subset name (default snapshot_1, backup_1, subset_1)
  -f  --snap-back-id             Snapshot or backup id (default 0)
  -i, --sync-type                Sync type (0 - default, 1 - fdatasync, 2 - async WAL)
  -m, --provider                 Store provider (0 - RCSDBEM, 1 - KVS, 2 - METADBCLI, 3 - POSTGRES, 4 - DEFAULT provider)
  -j, --debug                    Debug (default 0)
  -M, --hex-dump                 Do hex dump of read records (1 - hex print, 2 - pipe to stdout)
  -X, --regression               IntegrityRegression test
                                 1 - IntegrityRegression1 open/create/delete session/subset/sess_it/sub_it (default)
                                 2 - IntegrityRegression2 write/read/verify session/subset records
                                 3 - IntegrityRegression3 not yet defined
                                 4 - AsyncPut() test
                                 5 - AsyncGet() test
                                 6 - AsyncDelete() test
                                 7 - AsyncMerge() test
                                 8 - AsyncPutSet() test
                                 9 - AsyncGetSet() test
                                 10 - AsyncDeleteSet() test
                                 11 - AsyncMergeSet() test
                                 12 - AsyncConcurrentPutSet() test
                                 13 - AsyncConcurrentGetSet() test
                                 14 - AsyncConcurrentDeleteSet() test
                                 15 - AsyncConcurrentMergeSet() test
                                 16 - AsyncSyncMixedSetOpenClose() test
                                 17 - AsyncSyncMixedSet() test
                                 18 - AsyncPrematureCloseSet() test
                                 19 - mixed AsyncPut/AsyncGet() multi-threaded loop test
  -N, --data-size                Set data size to an arbitrary value (will override -v)
  -T, --sleep-usecs              Sleep usecs between tests
  -b, --random-keys              Generate random instead of consecutive keys
  -y, --regression-loop-cnt      Regression loop count
  -r, --regression-thread-cnt    Regression thread count
  -s, --wait_for_key             Wait for key press before exiting out
  -p, --no-print                 No printing key/value (default print)
  -x, --user-name                Run unser user-name
  -h, --help                     Print this help

Soecapi C API Overview

soecapi.hpp
soecapi.cpp
soesessioncapi.cpp
soesessiongroupcapi.cpp
soesessioniteratorcapi.cpp
soesessiontransactioncapi.cpp
soesubsetiteratorcapi.cpp
soesubsetscapi.cpp

Type name

Description

SessionHND

Session handle - soecapi.hpp

Gives access to {"Cluster", "Space", "Store", "Provider"} space. Users can open, create, destroy and manipulate the contents of KV stores through Session API.

GroupHND

Group handle - soecapi.hpp

Allows grouping of multiple operations to the same store in one batch, i.e. a Group object holds a sequence of edits to be made to a store. Invocation of Write method will write the contents to a store.

Group s primarily used to speed up bulk updates.

TransactionHND

Transcation handle - soecapi.hpp

Transactions are used to write multiple items in a transactional fashion. That ensures not only atomicity and isolation but also resolves potential conflicts that may occur when two or more transactions try to update the same key(s).

CDuple

CDupleVector

CDuplePair

CDuplePairVector

CDuple, CDupleVector, CDuplePair, CDuplePairVector are C style counterparts for C++ std containers. Definitions reside in soecapi.hpp

Duple is a wrapper for {Pointer, Size} tuple. Duple's storage pointed to by "Pointer" is owned by the caller, so the caller is responsible for managing it.

DisjointSubsetHND

DisjointSubset handle - soecapi.hpp

Subsets are used to create lightweight groups within a store. DisjointSubset is a group of items within a store with unique keys.

SubsetIteratorHND

SubsetIterator handle - soecapi.hpp

Subsets can be iterated upon using SubsetIterator. SubsetIterators are created with a traversal direction and support most common iterator methods.

SessionIteratorHND

SessionIterator handle - soecapi.hpp

SessionIterator allows iterating over the contents of the entire store. Typically, when creating a SessionIterator user specifies start and end key.

FutureHND

Future handle for single key-value pair operations - soefuturescapi.hpp

SetFutureHND

Future handle for vector of key-value pairs operations - soefuturescapi.hpp

FutureHND

Future handle for single key-value pair operations - soefuturescapi.hpp

BackupEngineHND

Backup engine handle - soecapi.hpp

BackupEngine is used to create a backup engine, i.e. an object that can be used to create or destroy a store backup.

SnapshotEngineHND

Snapshot engine handle - soecapi.hpp

SnaphotEngine is used to create a snapshot engine, i.e. an object that can be used to create or destroy a store snapshot.

Example C++ application

Example application "c_test_soe" is located in: soe/soeapi/c_test

To build test application simply run "make release" in c_test directory

Test application options:

Usage bin/c_test_soe [options]
  -o, --cluster-name             Soe cluster name
  -z, --space-name               Soe space name
  -c, --store-name               Soe store name
  -l, --transactional            Soe transactional store (default 0 - non-transactional)
  -n, --num-ops                  Number of ops (default = 1)
  -k, --key                      Key (default key_1)
  -e, --end-key                  End key (default key_1)
  -t, --iterator-dir             Iterator dir (0 - forward(default), 1 - reverse)
  -u, --default-first-iter-arg   Use iterator's First() args defaults
  -v, --value                    Value (default value_1)
  -d  --snap-backup-subset-name  Snapshot, backup or subset name (default snapshot_1, backup_1, subset_1)
  -f  --snap-back-id             Snapshot or backup id (default 0)
  -i, --sync-type                Sync type (0 - default, 1 - fdatasync, 2 - async WAL)
  -m, --provider                 Store provider (0 - KVS, 1 - RCSDBEM(default), 2 - METADBCLI, 3 - POSTGRES)
  -j, --debug                    Debug (default 0)
  -A, --create-store             Create cluster/space/store
  -B, --create-subset            Create subset
  -C, --write-store              Write store num_ops (default 1) KV pairs
  -D, --write-subset             Write subset num_ops (default 1) KV pairs
  -E, --read-store               Read all KV items from store
  -F, --read-subset              Read all KV items from subset
  -G, --read-kv-item-store       Read one KV item from store
  -H, --read-kv-item-subset      Read one KV item from subset
  -I, --delete-kv-item-store     Delete one KV item from store
  -J, --delete-kv-item-subset    Delete one KV item from subset
  -K, --destroy-store            Destroy store
  -L, --destroy-subset           Destroy subset
  -M, --hex-dump                 Do hex dump of read records (1 - hex print, 2 - pipe to stdout)
  -X, --regression               Regression test
  -y, --regression-loop-cnt      Regression loop count
  -r, --regression-thread-cnt    Regression thread count
  -p, --wait-for-key             Wait for key pressed (until compaction done)
  -h, --help                     Print this help

About

NoSQL client-server with transactions, iterators, vector operations, asynchronous future promise APIs and many more

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages