
READY: Latest TrustChain block via DHT #349

Merged 3 commits into Tribler:master on Feb 13, 2019

Conversation

@DanGraur (Contributor) commented Oct 28, 2018

Fixes #369

Added a new endpoint for retrieving the latest TrustChain block via the DHT mechanism. Added a new callback in the TrustChainCommunity class which is called whenever a block is added to the DB, regardless of the block's type. It is now possible to retrieve the latest TrustChain block of a peer via HTTP requests. Created a new test set for the newly added block retrieval mechanism. In addition, made a few changes to the REST API test suite in terms of the communication modules, such as moving classes from one module to another and creating new modules which host the logic for HTTP requests and communication for specific endpoints.

@DanGraur DanGraur changed the title Latest TC block via DHT READY: Latest TC block via DHT Oct 28, 2018
@DanGraur DanGraur force-pushed the tc_via_dht branch 2 times, most recently from 2e60cd2 to 917e854 on October 28, 2018 15:06
@DanGraur DanGraur changed the title READY: Latest TC block via DHT READY: Latest TrustChain block via DHT Oct 28, 2018
@DanGraur (Contributor, Author) commented Oct 28, 2018

I have a couple of questions regarding an aspect of the implementation. In order to have the latest TC block published to the DHT as soon as it is added to the database, I added a callback named new_block_cb, which does just this. It is called in two places: here and here. I did notice that there is a notify_listeners method, which calls the listeners attached to a particular type of block. My issue with this is that it is called only for a particular type of block, and not for all types of blocks, which is why I created the new callback instead. I have two questions:

  • Should I remove it, and make use of the notify_listeners method only?
  • If not, then should I maybe move it inside the notify_listeners method, or should I leave it as is (i.e. explicit calls)? I initially decided not to embed it in notify_listeners, since I believed that new_block_cb implements behavior which is very specific to this use case, and might not be as general as notify_listeners.

@devos50 (Contributor) commented Oct 28, 2018

@DanGraur You are right that notify_listeners is only triggered for specific block types. I was thinking about adding a wildcard option to notify_listeners so it is triggered for every block.

I would try to avoid adding another type of callback if possible and use notify_listeners instead. However, I'm not sure what the best way would be to achieve this. Maybe we could make it a separate method in the BlockListener abstract class (like received_block)? Then name it something like added_database_block?

@qstokkink (Collaborator) commented:

I would also vote for making use of notify_listeners.

@DanGraur (Contributor, Author) commented:

Ok, then I'll make the required changes to use notify_listeners.

@DanGraur (Contributor, Author) commented:

Ok, I removed the explicit callback and instead added a wildcard to the listeners_map, as suggested by @devos50. The wildcard is stored in TrustChainCommunity's UNIVERSAL_BLOCK_LISTENER field. Note that in the current implementation the peer publishes the latest block in the TrustChainDB (retrieved via that class's get_latest method) that has its own public key as the block public key. That is, the block passed as a parameter to the callback in the listeners_map is not used, and is not necessarily the one published. Should the block passed to the callback be the one published to the DHT instead?
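
For concreteness, here is a minimal sketch of the wildcard-listener wiring described above, assuming py-ipv8's BlockListener interface (should_sign / received_block) and TrustChainCommunity.add_listener; the community variable and the publish_latest_block_to_dht callback are illustrative names, not this PR's exact code:

```python
from ipv8.attestation.trustchain.listener import BlockListener

class DHTBlockPublisher(BlockListener):
    """Listener registered under the wildcard: fires for every block type."""

    def __init__(self, publish_cb):
        self.publish_cb = publish_cb  # e.g. the DHT publication routine

    def should_sign(self, block):
        return False  # this listener only observes; it never counter-signs

    def received_block(self, block):
        # As noted above, the current implementation ignores `block` and
        # publishes whatever TrustChainDB.get_latest returns for our own key.
        self.publish_cb(block)

# Hypothetical wiring, using the wildcard key this PR adds:
community.add_listener(DHTBlockPublisher(publish_latest_block_to_dht),
                       [community.UNIVERSAL_BLOCK_LISTENER])
```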

@DanGraur (Contributor, Author) commented:

In this latest commit, I enhanced the mechanism for publishing the latest TC block via the DHT, so that a block which is already in the DHT will not be published again under a different (embedded) version. This increases the cost of adding a block to the DHT (due to the extra overhead of checking for a duplicate block), but decreases the load on the DHT.

[Resolved review thread on test_classes_list.txt]
@DanGraur (Contributor, Author) commented:

I've added the blank line at the end of test_classes_list.txt.

[Resolved review thread on ipv8/REST/dht_endpoint.py]
@qstokkink (Collaborator) left a review:

Looking good, make sure to still rebase and fix my two comments.
Also, make sure python create_test_coverage_report.py still works.

@qstokkink qstokkink changed the title READY: Latest TrustChain block via DHT WIP: Latest TrustChain block via DHT Nov 2, 2018
@DanGraur DanGraur force-pushed the tc_via_dht branch 2 times, most recently from f7d26d2 to 2c3684b on November 2, 2018 23:51
@DanGraur (Contributor, Author) commented Nov 2, 2018

Added the extra blank line at the end of test_classes_list.txt. Swapped "I" for "H" in DHTBlockEndpoint's reconstruct_all_blocks method when unpacking part of the block. Rebased the branch, and made sure that python create_test_coverage_report.py still works (it does, for both Python 2 and Python 3). Also squashed the commits.

@qstokkink qstokkink changed the title WIP: Latest TrustChain block via DHT READY: Latest TrustChain block via DHT Nov 3, 2018
@qstokkink (Collaborator) previously approved these changes Nov 3, 2018, leaving a review:

Looks good 👍

@xoriole (Contributor) commented Nov 5, 2018

Should there be an additional check so that a peer can publish its latest TC block only under its own public key, and not under somebody else's key? Similarly, should the blocks received under a DHT key be validated?

Other than that, seems good.

@DanGraur (Contributor, Author) commented Nov 9, 2018

Sorry for the late response. @xoriole I think adding the validation should be easy enough, but for ensuring that a block can only be published by a node under its own PK, I'm not particularly sure what strategy to employ. It should also be mentioned that currently the TrustChain block is published under the peer's mid, rather than its PK. I initially implemented the publishing operation using the PK, but decided to change it to the mid. Should I change this back? Also, should this check be implemented directly in the DHT mechanism (more specifically, in the on_store_request method of the DHTCommunity class), such that peers receiving the blocks ignore requests whose mid does not match the source address' mid?

@egbertbouman (Member) commented Nov 9, 2018

@DanGraur Some comments regarding the DHT. The DHT uses 20-byte keys, so I'd definitely stick to using mids.

Currently, any peer can store information under whatever key they want. I'm not sure about changing this since other features rely on the DHT as well.

The DHT does offer the option to sign values before storing them in the DHT. When this feature is used, and the signed value is stored under the mid of the signing peer, it gets a higher priority when returning the value (i.e., it will always be at the top of the results list). However, with this feature only one signed value can be stored under a particular key at any given time. Since you're splitting the block over multiple DHT entries, this feature can't be used. Is there a reason to prefer storing the entire block instead of, for instance, just the SHA-1?
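
As a reference point, a minimal sketch of the signed-value feature described here, assuming the DHTCommunity API of that era (store_value with a sign flag, returning a Twisted Deferred, called from inside a running community); the logging callback and the exact shape of the callback result are assumptions:

```python
# Store a single signed value under our own 20-byte mid; per the comment
# above, a signed value stored under the signer's mid is ranked first in
# lookup results.
deferred = self.dht.store_value(self.my_peer.mid, block_data, sign=True)
# The callback result depends on the DHT implementation of that era
# (typically the nodes the value was stored on).
deferred.addCallback(lambda result: self.logger.info("Value stored: %s", result))
```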

@DanGraur (Contributor, Author) commented Nov 9, 2018

I think it is still possible to make use of the PK, since the code I've written produces a SHA-1 hash of the raw key (previously the mid concatenated with a special prefix, so that it doesn't collide with some other key that uses just the mid, either now or in the future). Since the digest size is 160 bits, I think it should be fine even if the public key is used as well.
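
A short sketch of the key derivation being described; the b'_BLOCK' suffix is the one this PR later settles on (mentioned further down in the thread):

```python
import hashlib

def derive_dht_key(raw_key):
    # raw_key is the mid or the serialized public key; the suffix keeps the
    # result from colliding with other DHT keys derived from the same bytes.
    # SHA-1 yields exactly the 20 bytes the DHT expects.
    return hashlib.sha1(raw_key + b'_BLOCK').digest()
```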

I've implemented the code to publish the entire block (in chunks), since that was my initial understanding of the task, i.e. the requesting peer wants to get the block itself and reconstruct it via the DHT. I think a final goal of this task (though not of this particular PR) is to make it possible to get the entire TrustChain via the DHT, although this, in my opinion, will impose quite a lot of strain on the DHT, due to the possibly large size of the TrustChain. I think, technically, my implementation can to a certain extent do that (or at least get the last few published blocks, which haven't been discarded by the DHT due to staleness). That being said, some more code needs to be written to get the last few blocks, and not just the last one.

As far as the mechanism for signing multiple published blocks goes, I've started writing (and I think I'm nearly done with) some code which signs the block's chunks using the publishing node's private key. The retrieving node will then use the public key of the publishing node to ensure the chunks are faithful. This is why I'm starting to think the public key might be a good idea for both publishing and getting the block from the DHT, since its use is twofold: computing the key, and validating the block's chunks. This validation mechanism also solves the issue of ensuring the block is indeed published by the true node, since the true node is the only one with that particular SK. What might be an issue is that this won't stop a malicious node from publishing fake TrustChain blocks under some other peer's public key. These blocks won't be valid, but they will put some (spatial) strain on the DHT. This is, however, a general problem, I think, since it is currently possible to publish data under any key, without many restrictions. Perhaps a future task would be to allow a node to publish data to the DHT under a specific key, which would then be 'restricted' just to it, i.e. other nodes can get data under that key, but not publish under it. This solution could be general, though, and not just for this use case.
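
A sketch of the chunk signing/verification idea under discussion, assuming py-ipv8's default_eccrypto helper (create_signature / is_valid_signature); binding the chunk index into the signed message is illustrative framing, not necessarily this PR's exact format:

```python
import struct

from ipv8.keyvault.crypto import default_eccrypto

def sign_chunk(private_key, index, chunk):
    # Sign the chunk together with its position so chunks cannot be
    # reordered or substituted across positions.
    return default_eccrypto.create_signature(
        private_key, struct.pack(">H", index) + chunk)

def verify_chunk(public_key, index, chunk, signature):
    # The retrieving node checks each chunk against the publisher's PK.
    return default_eccrypto.is_valid_signature(
        public_key, struct.pack(">H", index) + chunk, signature)
```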

@egbertbouman (Member) commented:

I didn't realize that by PK you meant the hash of the public key. You're right, you can use any 20-byte key you'd like.

Publishing the latest block on the DHT is likely to already introduce significant strain on the DHT, since the latest block will change frequently. Having said that, the mere idea of having the complete chain on the DHT terrifies me.

I agree, signing the block-pieces yourself is the best way to do this at the moment. Your idea of restricting write access to certain keys has a similar effect to that of the signed-value feature already in place (i.e., you restrict write access to the first result).

@qstokkink (Collaborator) commented:

@egbertbouman as you are the author of the DHT community, does this PR need any changes before merging?

@DanGraur (Contributor, Author) commented:

I still have to push a new commit which adds validation to the chunks of the block being published to the DHT. Also, I wanted to mention this earlier: I added a method in the test for this PR which expands the last_queries deque to around 20 requests. I did this because I noticed that, when publishing two blocks, a chunk of the second block was being dropped due to an excessive number of requests from the publishing peer in the recent past. To counteract this, I increased the deque's maxlen (only in that given test) so that requests won't be dropped due to its otherwise restrictive value.

I still have to fix another test, since the changes I made by adding verification and using the PK instead of the mid broke some of the existing tests. I'll push this commit later tonight, hopefully.

@egbertbouman (Member) commented:

Everything looks good! One thing that maybe we could add to the DHT in the future is the ability to store multiple values at once. This would avoid superfluous find/store requests, and probably also remove the need to change the maxlen of last_queries while testing.

@DanGraur (Contributor, Author) commented Nov 14, 2018

Sorry for the long delay. I pushed a new commit which does the following:

  • Adds signatures to block chunks published to the DHT.
  • Checks the signatures of the read chunks when reconstructing a block. If a block chunk fails to be verified, the PK's verify() method will throw an exception. There currently is no logic which catches and handles that exception. Should I add this?
  • The PK will be used for publishing a block to the DHT, as opposed to the mid. The PK is suffixed with b'_BLOCK', and then hashed to obtain the key. The PK is also used for verification.
  • Block chunks are now published as a new payload type called DHTBlockPayload (here is its implementation). I feel this is better practice, since future changes can mostly be made in one place. However, this has a downside: constructing such a payload and then serializing it, as opposed to the simple concatenation used before, incurs greater overhead. In one of the unit tests (here), I even had to add a self.sleep() call after a block was published to the DHT, since not all chunks would be properly published and disseminated before the request for the latest block came in, which would sometimes yield a partial block (missing a few bytes at the end, i.e. its last chunk). A sketch of what such a payload can look like follows below.
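
For reference, a hedged sketch of such a payload using py-ipv8's Payload machinery (format_list / to_pack_list / from_unpack_list); the field set and formats here are assumptions, not necessarily the PR's exact DHTBlockPayload definition:

```python
from ipv8.messaging.payload import Payload

class DHTBlockPayload(Payload):
    """One signed chunk of a serialized TrustChain block."""

    format_list = ['64s', 'H', 'H', 'H', 'raw']

    def __init__(self, signature, version, block_position, block_count, payload):
        super(DHTBlockPayload, self).__init__()
        self.signature = signature            # signature over this chunk
        self.version = version                # embedded version of the block
        self.block_position = block_position  # index of this chunk
        self.block_count = block_count        # total number of chunks
        self.payload = payload                # the raw chunk bytes

    def to_pack_list(self):
        return [('64s', self.signature),
                ('H', self.version),
                ('H', self.block_position),
                ('H', self.block_count),
                ('raw', self.payload)]

    @classmethod
    def from_unpack_list(cls, signature, version, block_position, block_count, payload):
        return cls(signature, version, block_position, block_count, payload)
```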

@DanGraur DanGraur changed the title WIP: Latest TrustChain block via DHT READY: Latest TrustChain block via DHT Nov 16, 2018
@qstokkink (Collaborator) commented:

@devos50 could you review this? (Specifically the TrustChain changes)

@devos50 previously approved these changes Jan 28, 2019
@qstokkink previously approved these changes Jan 28, 2019
@qstokkink (Collaborator) left a review:

Be sure to fix the merge conflict still.

@egbertbouman (Member) commented:

@DanGraur I just noticed that you're not doing DHT requests, but you're just looking at the local storage:

```python
block_chunks = self.dht.storage.get(hash_key)
```

Assuming that this is unintentional, you should be using find_values. See SpecificDHTValueEndpoint for an example of how to use that function.
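
A minimal sketch of the suggested fix, using the Twisted-style asynchronous API py-ipv8 had at the time (find_values performs a network lookup, unlike dht.storage.get, which only reads the local store); the function name is illustrative:

```python
from twisted.internet.defer import inlineCallbacks, returnValue

@inlineCallbacks
def fetch_block_chunks(dht, hash_key):
    # Query the DHT network for all values stored under this key; this is a
    # Deferred-returning call, so the endpoint handles the result
    # asynchronously rather than reading dht.storage directly.
    values = yield dht.find_values(hash_key)
    returnValue(values)  # entries to be reassembled into the block
```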

@DanGraur DanGraur changed the title READY: Latest TrustChain block via DHT WIP: Latest TrustChain block via DHT Feb 8, 2019
DanGraur added 2 commits February 10, 2019 18:54
…he DHT mechanism. Added a wildcard listener in the listener_map (of the TrustChainCommunity class) whose associated listeners are called regardless of block type. This wildcard is used to publish blocks newly added to the TrustChainDB to the DHT. The callback for this mechanism also defines behavior which disallows the same block from being published twice to the DHT. It is now possible to retrieve the latest TrustChain block of some peer via an HTTP request. Created a new test set for the newly added block retrieval mechanism. In addition to this, made a few changes to the REST API test suite in terms of the communication modules, such as moving classes from one module to another and creating new modules which host logic for HTTP requests and communication for specific endpoints.
…he node's secret key. In addition, this signature will be used on the receiving end for verifying the faithfulness of the chunks and, implicitly, of the block. Chunks will also be published under a new type of payload class, called DHTBlockPayload. The public key will be used (together with a special suffix) for publishing a block to the DHT (it should be mentioned that the raw key will be hashed before publishing).
@DanGraur DanGraur dismissed stale reviews from qstokkink and devos50 via 2b998c6 February 10, 2019 18:10
@DanGraur DanGraur force-pushed the tc_via_dht branch 2 times, most recently from 2b998c6 to 9ca7075 on February 10, 2019 18:22
@DanGraur (Contributor, Author) commented Feb 10, 2019

@egbertbouman thank you very much for noticing this! It was indeed a mistake. I fixed it now, and replaced it with find_values calls, which are then handled asynchronously. I've also dealt with the merge conflicts. I'll clean up the commit history after the changes in this last commit are approved.

@DanGraur DanGraur changed the title WIP: Latest TrustChain block via DHT READY: Latest TrustChain block via DHT Feb 11, 2019
…BlockPayload. Removed the dht.storage.get calls and replaced them with dht.find_values calls, which are handled asynchronously. Finally, made a few changes and additions to the test_dht_endpoint module.
@qstokkink (Collaborator) commented:

Do you have updated Gumby graphs for this new find_value fix?

@DanGraur (Contributor, Author) commented:

As far as the Gumby experiments which I wrote and forwarded as PRs (here and here) go, both were already using the find_values call where relevant, so the graphs shouldn't have changed due to this fix. If you're referring to a dedicated test case for this PR, then I cannot say how the graphs would look yet, since I haven't written a test for this. I can try to write one if you'd like.

@qstokkink (Collaborator) commented:

@DanGraur that's fine then, if only this PR had the find_values bug.

@DanGraur (Contributor, Author) commented:

Ok, I'll leave it as it is then.

@qstokkink (Collaborator) left a review:

@devos50 since this affects the TrustChain, I'll wait for your formal approval before merging this.

@devos50 (Contributor) commented Feb 13, 2019

retest this please

@qstokkink qstokkink merged commit 71de9e9 into Tribler:master Feb 13, 2019