READY: Experiment for DHT dissemination time measurements. #389

Closed · wants to merge 3 commits

Conversation

@DanGraur commented Dec 16, 2018

Added a new test scenario which captures the time it takes for a data element to be distributed across the peers in a DHT community, together with a new local test configuration written specifically for this test case. Also added a new R script which parses the data and generates a scatterplot of the per-node dissemination times. The DHTModule class was also modified so it can parse the generated per-node logs and accommodate annotations.

@tribler-ci

Can one of the admins verify this patch?

@synctext (Member)

OK to test

@qstokkink (Contributor)

Ok to test

@DanGraur (Author) commented Dec 17, 2018

The dissemination time here is measured by a LoopingCall which samples the local DHT storage every 1e-4 seconds. Once an entry for the given key is found, the time is stored to local storage and the LoopingCall is cancelled. If no entry is found, the method keeps being called until the experiment time runs out, in which case the node logs that it was unable to find an entry for the given key. A sample of the generated graph can be seen here (tested locally with 1 machine and 20 processes):

A dashed vertical line means that the corresponding peer (indicated on the horizontal axis) was unable to find an entry for the key, i.e. the entry was not disseminated to it during the time allotted to the experiment. The vertical axis shows the time it took for the entry to be disseminated to a peer, measured in milliseconds (the exact time is shown as a label).
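
For reference, the polling approach reads roughly like the minimal sketch below (Twisted `LoopingCall`; the `storage.get(key)` accessor and the helper names are assumptions, not the exact PR code):

```python
# Minimal sketch of the polling approach described above, not the exact PR
# code. Assumes Twisted's LoopingCall and a `storage.get(key)` accessor on
# the DHT community; DisseminationProbe and write_log are made-up names.
import time

from twisted.internet.task import LoopingCall


class DisseminationProbe(object):

    def __init__(self, dht_community, key):
        self.dht_community = dht_community
        self.key = key
        self.start_time = time.time()
        self.poll_lc = LoopingCall(self.check_local_storage)

    def start(self):
        # Poll the local key/value store every 1e-4 seconds.
        self.poll_lc.start(1e-4)

    def check_local_storage(self):
        # An empty result means the entry has not reached this node yet.
        if self.dht_community.storage.get(self.key):
            self.write_log(time.time() - self.start_time)
            self.poll_lc.stop()

    def write_log(self, elapsed_seconds):
        # Stand-in for the per-node log written by the actual DHTModule.
        print("Entry found after %.6f s" % elapsed_seconds)
```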

@qstokkink (Contributor)

So it's either really fast or doesn't work at all?

@DanGraur (Author) commented Dec 20, 2018

Sorry for the late response. My guess is that the DHT is perhaps not designed to distribute the entry to every node in the community, but rather to ensure sufficient dissemination, such that peers that don't have the entry can query nearby nodes for it. In fact, there appears to be a hardcoded threshold in the codebase here. That limit is 8, and as we can see in the graph above, there are indeed 8 hits and 11 misses.

Or perhaps there's some confusion about how the test case is implemented? Each node has a LoopingCall which calls the get method of the Storage class (i.e. we look for the entry locally). When a non-empty result is finally returned, it means this node has been given the entry by the source node. If the node cannot find the entry in its local Storage during the experiment, it means the source node never sent it the entry, but that does not mean it cannot call the find method of the DHTCommunity. It can, and it will find the entry (under normal operating conditions).
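
To make the distinction explicit, here is an illustrative snippet (not PR code; the `storage.get()` and `find_values()` names follow my reading of the IPv8 DHT at the time and should be treated as assumptions):

```python
# Illustrative only: a local Storage lookup versus an active DHT lookup.

def has_entry_locally(dht_community, key):
    # Succeeds only if the source node pushed the entry to this peer.
    return bool(dht_community.storage.get(key))


def lookup_entry(dht_community, key):
    # Active lookup that queries other peers; under normal conditions this
    # finds the entry even when it was never pushed to this peer. Assumed to
    # return a Twisted Deferred in the IPv8 version used here.
    return dht_community.find_values(key)
```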

@devos50 (Contributor) commented Dec 28, 2018

@DanGraur interesting experiment. Curious to see what the dissemination times are in a network of 2000 nodes running IPv8 on the DAS5 :)

Instead of relying on a LoopingCall that continuously runs and polls the key/value store, I would implement a callback that is invoked when something is inserted into the store (you can use your own Tribler/IPv8 branch with custom code in a Gumby experiment). This is straightforward if you are running the experiment on your local computer.
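
As a rough sketch of what I mean, intended for a custom branch rather than master (it assumes the DHT storage exposes a `put()` that is hit on every insert; the exact class path and signature may differ per IPv8 version, so treat the names as assumptions):

```python
# Sketch of an insert callback via a subclassed storage; names/paths assumed.
import time

from ipv8.dht.storage import Storage  # import path assumed, adjust to your branch


class InstrumentedStorage(Storage):

    def __init__(self, on_insert, *args, **kwargs):
        super(InstrumentedStorage, self).__init__(*args, **kwargs)
        self.on_insert = on_insert  # callback: (key, insert_time) -> None

    def put(self, key, value, *args, **kwargs):
        result = super(InstrumentedStorage, self).put(key, value, *args, **kwargs)
        # Fire the experiment callback the moment the entry lands locally.
        self.on_insert(key, time.time())
        return result
```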

I'm not surprised by the low dissemination times, since you are running the experiment locally, without "real" network traffic and latency. While I do not know the Kademlia protocol by heart, values are stored on the nodes closest to a specific key and not necessarily on all nodes in the network. Which nodes store a specific value depends on the identities of the nodes in the network. I would assume that you get different results when you run the experiment again (since the peers are then assigned different identities)?

@devos50 changed the title from "Experiment for DHT dissemination time measurements." to "WIP: Experiment for DHT dissemination time measurements." on Dec 28, 2018
@synctext (Member)

Some quick feedback: a quick insert time is only of moderate interest, but it is essential that it works.

Can we test scalability? Key/value pairs should not be stored at all nodes, except in small tests. Fast lookup, scalability, and some resilience to churn are the DHT's strong points; do we have tests for those?
The Tribler network will soon hit 20k concurrent nodes, and discovery of hidden swarms depends on the DHT.

@devos50 (Contributor) commented Dec 28, 2018

@synctext IIRC, we don't have any unit tests for churn (yet). Fault tolerance would be an interesting experiment, but I would suggest focusing on scalability first (working towards an experiment with a few thousand nodes). Plotting the CPU usage/bandwidth requirements is trivial to do in Gumby 👍

@synctext (Member)

Fully agree; it would be solid progress if we had a few-thousand-node experiment.

@DanGraur (Author) commented Dec 29, 2018

@devos50 OK, I'll try to get to this as soon as I can. I also have another idea for a test case: measuring the hop count of a DHT lookup. I think this could be interesting; let me know if you agree. I'm a bit busy these days, but I will try to implement this as soon as I can. Also, knowing that I can write custom code that wouldn't normally be accepted in the master branch makes this much easier; that was the reason I didn't write a callback in the first place, even though it was what I wanted to do initially.

@synctext (Member)

Nice. Yes, a lookup experiment is a great idea, with hop count as a key performance indicator alongside latency.

@DanGraur (Author) commented Jan 2, 2019

@devos50 once again sorry for the late response. Indeed, when running the same experiment multiple times I do get different results. For instance, this is the result I get when running the same experiment again.

@DanGraur (Author) commented Jan 5, 2019

I've implemented the callback functionality here in the DHTModule, and the call itself in one of my fork's branches.
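
For context, the module-side callback has roughly the following shape (illustrative only; the real code is in the DHTModule change and the fork branch linked above):

```python
# Converts the arrival time of the target key into a per-node dissemination
# time in milliseconds; class and attribute names are illustrative.
import time


class DisseminationRecorder(object):

    def __init__(self, target_key, log_path):
        self.target_key = target_key
        self.log_path = log_path
        self.publish_time = time.time()  # set when the source publishes the key

    def on_insert(self, key, insert_time):
        if key != self.target_key:
            return
        elapsed_ms = (insert_time - self.publish_time) * 1000.0
        with open(self.log_path, 'a') as handle:
            handle.write("%.3f\n" % elapsed_ms)
```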

I've also executed the experiment multiple times, and the time measurements generally seem to be lower than with the LoopingCall version (there was probably some noticeable overhead due to the LoopingCalls). This change should also make the measurements more precise: previously the measurement method was only called every 1e-4 seconds and was more computationally intensive.

I've attached a few diagrams below to show the new results.



@qstokkink (Contributor)

We want to see how this does on the DAS5 as well. @devos50 could you give @DanGraur a job to play around with?

@devos50 (Contributor) commented Jan 8, 2019

@qstokkink @DanGraur this job runs a basic DHT validation experiment on the DAS5: https://jenkins-ci.tribler.org/job/validation_experiments/job/validation_experiment_dht/. I suggest looking at this one.

DanGraur added 2 commits on April 7, 2019
@DanGraur force-pushed the dht_dissemination_experiment branch from 5fdfd3e to 976baaa on April 6, 2019
…also changed the scenario of the local version such that it now uses a for loop.
@DanGraur force-pushed the dht_dissemination_experiment branch from 473c148 to 60bc59d on April 8, 2019
@DanGraur (Author) commented Apr 8, 2019

I've finally created a Jenkins project to run this experiment at large scale (100 nodes). Here's the link to it: https://jenkins-ci.tribler.org/job/dissemination_experiment_dht/. I think this PR is ready now.

@DanGraur changed the title from "WIP: Experiment for DHT dissemination time measurements." to "READY: Experiment for DHT dissemination time measurements." on Apr 8, 2019
@devos50 (Contributor) commented Sep 20, 2019

@DanGraur what is the status of this PR?

@devos50 (Contributor) commented Jul 15, 2020

I'm not sure this will be merged in the short term. I will re-open it if we work on additional DHT experiments 👍

@devos50 closed this on Jul 15, 2020