
Improvements to pruning #1095

Closed
KonradStaniec opened this issue May 19, 2022 · 4 comments

@KonradStaniec
Contributor

Each node in the portal network should have a configured maximal storage quota. When the quota is reached, the node should clean up the data which is furthest from the local node id and decrease its radius to slow down the ingestion of data.

Current Approach:
The current approach requires the node to read a pre-configured number of elements from the database, keeping them sorted in some data structure in order of distance from the local node id (see the sketch after the list below).

  • Reasonably fast - reading 100k elements from a 1M element db (1.4 GB) and having them sorted takes around 1s; the majority of the cleanup time is spent in the "VACUUM;" statement, which actually deletes elements from the db and repackages it into minimal size
  • Can use different distance functions for sorting elements
  • Easy to find content close to other nodes
  • Cannot precisely determine how much data will be deleted. The number of elements to keep sorted must be decided up front so as not to overwhelm node memory
  • Not easy to adapt for use as a range query
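
For illustration, a minimal Python sketch of that flow, assuming a hypothetical storage(content_id BLOB, content BLOB) table; the real code is Nim, so all names and details here are assumptions rather than the actual implementation:

```python
import sqlite3

def xor_distance(a: bytes, b: bytes) -> int:
    # XOR distance between two 32-byte ids, interpreted as a big-endian integer.
    return int.from_bytes(bytes(x ^ y for x, y in zip(a, b)), "big")

def prune(conn: sqlite3.Connection, local_id: bytes, bytes_to_free: int,
          max_rows: int = 100_000) -> None:
    # Read a bounded number of (id, size) pairs; max_rows caps memory usage,
    # which is also why the amount of data that will be deleted cannot be
    # determined precisely up front.
    rows = conn.execute(
        "SELECT content_id, length(content) FROM storage LIMIT ?", (max_rows,)
    ).fetchall()
    # Keep them sorted in application memory, furthest from the local node first.
    rows.sort(key=lambda r: xor_distance(r[0], local_id), reverse=True)

    freed = 0
    for content_id, size in rows:
        if freed >= bytes_to_free:
            break
        conn.execute("DELETE FROM storage WHERE content_id = ?", (content_id,))
        freed += size
    conn.commit()
    # The majority of the cleanup time is spent here, repackaging the file
    # into its minimal size.
    conn.execute("VACUUM")
```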

Alternative 1:

Keep an additional column shortId which stores the 8 most significant bytes of each id as an sqlite Integer. This makes it possible to use sqlite's built-in bitwise operations to construct the XOR distance - ((?1 | storage.short_content_id) - (?1 & storage.short_content_id)) - see the sketch after the list below.

  • Makes range queries possible, e.g. SELECT id WHERE xor(short_id, id) > xxxx AND xor(short_id, id) < xxxx, where xor is ((?1 | storage.short_content_id) - (?1 & storage.short_content_id))
  • Can precisely delete as much data as necessary without reading the whole result set into memory
  • Easy to find content close to other nodes
  • Overhead of at least 8 bytes per key (probably more, as an additional column also has overhead)
  • ((?1 | storage.short_content_id) - (?1 & storage.short_content_id)) is not really readable for someone who will maintain it
  • Less precision when sorting, as only the 8 leading bytes are taken into account
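
A rough sketch of this alternative, reusing the storage table and the short_content_id name from the expression above (the rest of the schema is assumed), with Python's sqlite3 module standing in for the Nim code:

```python
import sqlite3

# SQLite has no XOR operator, but a XOR b == (a | b) - (a & b), which is the
# expression used in the queries below.
XOR_EXPR = "((?1 | short_content_id) - (?1 & short_content_id))"

def short_id(content_id: bytes) -> int:
    # The 8 most significant bytes of the 32-byte id as a SQLite INTEGER;
    # signed=True keeps the value within SQLite's signed 64-bit range.
    return int.from_bytes(content_id[:8], "big", signed=True)

def setup(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS storage (
               content_id       BLOB PRIMARY KEY,
               short_content_id INTEGER NOT NULL,
               content          BLOB NOT NULL)""")

def put(conn: sqlite3.Connection, content_id: bytes, content: bytes) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO storage VALUES (?, ?, ?)",
        (content_id, short_id(content_id), content))

def ids_in_distance_range(conn: sqlite3.Connection, local_id: bytes,
                          lo: int, hi: int) -> list:
    # Range query over the XOR distance to the local node; the distance is
    # computed on the 8-byte prefix only, so ordering is approximate compared
    # to the full 32-byte id.
    return conn.execute(
        f"SELECT content_id FROM storage "
        f"WHERE {XOR_EXPR} > ?2 AND {XOR_EXPR} < ?3",
        (short_id(local_id), lo, hi)).fetchall()
```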

Alternative 2:

Extend the sqlite api to support adding custom functions, then add a custom distance function - xorDistance - to compare 32-byte blobs (see the sketch after the list below).

  • Reasonably fast - performance is similar to the current version, without the need to read the entire result set into memory
  • Makes range queries possible, e.g. SELECT id WHERE xorDistance(id, someId) > xxxx AND xorDistance(id, someId) < xxxx
  • Can precisely delete as much data as necessary without reading the whole result set into memory
  • Easy to find content close to other nodes
  • Needs the low level sqlite c api, which is not well documented
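
A sketch of the idea through Python's sqlite3 binding, which already exposes custom function registration; in Nim the same registration would go through sqlite3_create_function in the low level C api. Table and column names are the same assumptions as above:

```python
import sqlite3

def xor_distance(a: bytes, b: bytes) -> bytes:
    # XOR of two equal-length 32-byte blobs; returning a blob means SQLite's
    # normal memcmp-style BLOB comparison orders results by distance.
    return bytes(x ^ y for x, y in zip(a, b))

def open_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Register the custom distance function under the name used in the queries.
    conn.create_function("xorDistance", 2, xor_distance, deterministic=True)
    return conn

def ids_in_distance_range(conn: sqlite3.Connection, local_id: bytes,
                          lo: bytes, hi: bytes) -> list:
    # Range query over the full 32-byte XOR distance, evaluated inside SQLite,
    # so nothing has to be read into application memory and sorted there.
    return conn.execute(
        "SELECT content_id FROM storage "
        "WHERE xorDistance(content_id, ?1) > ?2 "
        "AND xorDistance(content_id, ?1) < ?3",
        (local_id, lo, hi)).fetchall()
```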

Alternative 3:

Use xor(local_node_id, content_id) as the primary key in our table (see the sketch after the list below).

  • Probably the fastest option, as the table is kept in sorted order, so range queries for the content furthest from our node should be super fast
  • Every lookup for a content_id will need to be xored with the local node_id
  • Not really easy to find content close to other nodes
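
A sketch of this variant under the same assumptions; WITHOUT ROWID keeps the table itself ordered by the primary key:

```python
import sqlite3

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def setup(conn: sqlite3.Connection) -> None:
    # The key is xor(local_node_id, content_id), so rows sit in the table in
    # order of distance from the local node.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS storage (
               distance_key BLOB PRIMARY KEY,
               content      BLOB NOT NULL) WITHOUT ROWID""")

def get(conn: sqlite3.Connection, local_id: bytes, content_id: bytes):
    # Every lookup must first XOR the requested content_id with the local
    # node id to recover the key it was stored under.
    key = xor_bytes(local_id, content_id)
    row = conn.execute(
        "SELECT content FROM storage WHERE distance_key = ?", (key,)).fetchone()
    return row[0] if row else None

def furthest_keys(conn: sqlite3.Connection, limit: int) -> list:
    # Content furthest from the local node is simply the tail of the key space.
    return conn.execute(
        "SELECT distance_key FROM storage ORDER BY distance_key DESC LIMIT ?",
        (limit,)).fetchall()
```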

Taking into account that:

  • pruning is not a frequent operation
  • most of the pruning overhead is generated by the VACUUM; statement

Of all the presented alternatives I prefer option 2, as it makes all of our queries possible and delegates most of the responsibilities to sqlite. Performance-wise it is not the fastest, but that should not be a problem.
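
To make that concrete, a pruning pass on top of option 2 could look roughly like the sketch below (same hypothetical storage table and the xorDistance function from the Alternative 2 sketch); only the boundary distance is held in memory, not the result set:

```python
import sqlite3

def prune_furthest(conn: sqlite3.Connection, local_id: bytes,
                   bytes_to_free: int) -> None:
    # Stream sizes ordered by distance from the local node, furthest first,
    # until enough bytes are covered, remembering the boundary distance.
    boundary = None
    freed = 0
    cur = conn.execute(
        "SELECT xorDistance(content_id, ?1), length(content) FROM storage "
        "ORDER BY xorDistance(content_id, ?1) DESC", (local_id,))
    for distance, size in cur:
        boundary = distance
        freed += size
        if freed >= bytes_to_free:
            break
    cur.close()

    if boundary is not None:
        # Everything at or beyond the boundary distance is deleted in one go.
        conn.execute(
            "DELETE FROM storage WHERE xorDistance(content_id, ?1) >= ?2",
            (local_id, boundary))
        conn.commit()
```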

@kdeme
Contributor

kdeme commented May 20, 2022

Thanks for the write-up! I've got some questions:

The current approach requires the node to read a pre-configured number of elements from the database

With read, do you mean the actual returning of the content, not just accessing it? As it does still need to iterate over all the elements of the database, right?

  • Cannot precisely determine how much data will be deleted. The number of elements to keep sorted must be decided up front so as not to overwhelm node memory

I still don't fully understand why we can't delete items until a certain amount of deleted data is reached. You could decide how many items to keep sorted based on a total size value to reach?

That being said, I do think that alternatives 1 and 2 seem more useful, especially because of the more "natural" range queries.

Alternative 1:

Keep an additional column shortId which stores the 8 most significant bytes of each id as an sqlite Integer.

Not fully sure, but I think the original idea of using the shortId is to be able to use it as an index, so that one does not have to access all rows. Is that what you are doing / planning to do here? It seems not, from the query.

Do you have any numbers/benchmarks on the queries for the alternatives (not sure which ones you tried already)?
And about how much time is spent on the cleanup from the "VACUUM;" statement?

It would be nice to have a small benchmark tool with which we can keep testing these operations.

@KonradStaniec
Contributor Author

With read, do you mean the actual returning of the content, not just accessing it? As it does still need to iterate over all the elements of the database, right?

Yep, it still needs to iterate over the whole db. And it actually returns only the tuple (contentId, len(content), distanceFrom), which is enough metadata to make a decision about deletion.

I still don't fully understand why we can't delete items until a certain amount of deleted data is reached. You could decide how many items to keep sorted based on a total size value to reach?

Yep, you could do that, but then in case there are a lot of small elements in the database, it would need to read a lot of elements into memory, which could end with some kind of out of memory error, so you still need some kind of fixed value to constrain the maximum number of elements which would end up in memory.

Not fully sure, but I think the original idea of using the shortId is to be able to use it as an index, so that one does not have to access all rows. Is that what you are doing / planning to do here? It seems not, from the query.

Currently shortId is not used as an index. It could be, as it would speed up range queries (although it would probably increase the space overhead a bit).

As discussed, alternatives 2 and 3 are a trade-off between space and time.

And about how much time is spent on the cleanup from the "VACUUM;" statement?

After deleting 50MB from a 1GB database, "VACUUM;" takes around 8-12s on my machine, which is quite a lot tbh.

@kdeme
Contributor

kdeme commented May 20, 2022

After deleting 50MB from a 1GB database, "VACUUM;" takes around 8-12s on my machine, which is quite a lot tbh.

I was curious as to why this takes so long, and not knowing how VACUUM works I had a look at https://www.sqlite.org/lang_vacuum.html

The VACUUM command works by copying the contents of the database into a temporary database file and then overwriting the original with the contents of the temporary file. When overwriting the original, a rollback journal or write-ahead log WAL file is used just as it would be for any other database transaction. This means that when VACUUMing a database, as much as twice the size of the original database file is required in free disk space.

This is an issue: we can't let users provide a max storage size and then double it when we need to prune. So I believe we will have to move away from the vacuum way.

Reading about how deletion and vacuum work, I think it is fine that the database sits at max-storage-size with free pages, as it will get refilled to nearly that point anyhow. The shrinking is kind of a useless operation (except when max-storage-size is altered, which is a different use case).
It does mean the current size calculation needs to be altered, as the free pages need to be deducted, which might be tricky.
I did see there is a PRAGMA schema.freelist_count; (https://sqlite.org/pragma.html). I guess there will be partially filled pages too, so perhaps it will not be very accurate. But perhaps it is sufficient. Something to look into?
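
For reference, that deduction could be sketched with the pragmas sqlite already provides (partially filled pages are still counted in full, so this is an approximation, as noted above):

```python
import sqlite3

def used_size_bytes(conn: sqlite3.Connection) -> int:
    # Space occupied by data = (total pages - pages on the freelist) * page size.
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    freelist = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return (page_count - freelist) * page_size
```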

The other path would be auto-vacuum, as you have mentioned before. From that page I read that the side effects there are potentially more fragmentation and also that it doesn't compact partially filled pages.

@KonradStaniec
Contributor Author

Improved by:
#1113
#1103
