READY: Experiment for DHT dissemination time measurements. #389
Conversation
Can one of the admins verify this patch?
OK to test
Ok to test
The dissemination time here is measured by a looping call in each node. The dashed vertical lines mean that the corresponding peer (indicated on the horizontal axis) was unable to find an entry for the key, i.e. the entry was not disseminated to it during the time allotted to the experiment. The vertical axis shows the time it took for the entry to be disseminated to a peer, measured in milliseconds (the exact time is shown as a label).
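(A minimal sketch of such a looping-call measurement, assuming the Twisted reactor Gumby used at the time and a `find_values(key)` call on the DHT community that returns a Deferred; all names are illustrative, not the exact experiment code.)

```python
# Minimal sketch, not the actual experiment code: each node polls the DHT
# with a Twisted LoopingCall until it can resolve the key, then records
# the elapsed time in milliseconds.
import time
from twisted.internet.task import LoopingCall


class DisseminationProbe(object):
    def __init__(self, dht_community, key):
        self.dht = dht_community        # assumed IPv8-style DHT community object
        self.key = key
        self.start_time = time.time()   # moment the entry was stored
        self.lc = LoopingCall(self.check)

    def start(self, interval=0.5):
        self.lc.start(interval)

    def check(self):
        # find_values() is assumed to return a Deferred of the found values;
        # a lookup failure just means "not disseminated yet", so ignore it.
        self.dht.find_values(self.key).addCallbacks(
            self.on_result, lambda _failure: None)

    def on_result(self, values):
        if values:
            elapsed_ms = (time.time() - self.start_time) * 1000.0
            print("entry disseminated after %.2f ms" % elapsed_ms)
            if self.lc.running:
                self.lc.stop()
```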
So it's either really fast or doesn't work at all?
Sorry for the late response. My guess here is that the DHT is perhaps not designed to distribute the entry to every node in the community, but rather to ensure sufficient dissemination, such that peers that don't have the entry can query nearby nodes for it. In fact, there might be a hardcoded threshold in the codebase that limits how many nodes store the entry. Or perhaps there's some confusion about how the test case is implemented? I have a looping call (in each node) which queries the DHT for the key.
@DanGraur interesting experiment. Curious to see what the dissemination times are in a network of 2000 nodes running IPv8 on the DAS5 :) Instead of relying on a looping call, you could consider a callback that fires when the entry arrives. I'm not surprised by the low dissemination times since you run the experiment locally, without "real" network traffic and latency. While I do not know the Kademlia protocol by heart, values are stored on the nodes closest to a specific key and not necessarily on all nodes in the network. Which nodes store a specific value depends on the identities that each node in the network has. I would assume that you get different results when you run the experiment again (since the peers are then assigned different identities)?
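(To make the two points above concrete, here is a small illustration of Kademlia-style placement: a value lands on the k node IDs closest to the key by XOR distance. The replication factor `K` below stands in for the kind of hardcoded threshold suspected earlier; the actual constant and ID scheme in the codebase may differ.)

```python
# Illustration only: values are stored on the K nodes whose IDs are
# closest (by XOR distance) to the key, not on every node.
import hashlib
import os

K = 8  # assumed replication factor, not the codebase's actual constant


def xor_distance(a, b):
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


node_ids = [os.urandom(20) for _ in range(100)]   # random stand-in identities
key = hashlib.sha1(b"some-value").digest()

closest = sorted(node_ids, key=lambda nid: xor_distance(nid, key))[:K]
print("value would be stored on %d of %d nodes" % (len(closest), len(node_ids)))
```

This is also why re-running the experiment gives different results: fresh identities change which nodes are closest to the key.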
Some quick feedback: quick insert time is only of moderate interest, but it is essential that it works. Can we test scalability? Key/value pairs should not be stored at all nodes, except in small tests. Fast lookup, scalability, and some resilience to churn are the DHT's strong points. Do we have tests for those?
@synctext IIRC, we don't have any unit test for churn (yet). Fault tolerance would be an interesting experiment, but I would suggest focusing on scalability first (working towards an experiment with a few thousand nodes)? Plotting the CPU usage/bandwidth requirements is trivial to do in Gumby 👍
Agree fully; it will be solid progress if we get a few-thousand-node experiment.
@devos50 ok, I'll try to get on this as soon as I can. I also have another idea for a test case: the hop count for a DHT lookup. I think this could be interesting; let me know if you think this is a good idea as well. I'm a bit busy these days, but I will try to implement this as soon as I can. (Knowing that I can also write some custom code, which wouldn't normally be accepted in the master branch, makes this much easier; this was the reason why I didn't write a callback in the first place, even though that is what I wanted to do initially.)
Nice, yes, a lookup experiment is a great idea, with hop count as the key performance indicator together with latency.
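(A hedged sketch of how hop count could be instrumented in an iterative lookup; `query` is a placeholder for the real DHT RPC, and none of this reflects the actual IPv8 routing code.)

```python
# Sketch of hop-count bookkeeping for an iterative lookup: each query to
# a node counts as one hop. A real Kademlia lookup would keep the frontier
# sorted by XOR distance to the key; that detail is omitted here.
def iterative_lookup(start_nodes, key, query, max_hops=20):
    """Return (value, hops), with value None if the lookup fails."""
    frontier = list(start_nodes)
    seen = set()
    hops = 0
    while frontier and hops < max_hops:
        hops += 1
        node = frontier.pop(0)
        seen.add(node)
        value, closer_nodes = query(node, key)  # placeholder RPC
        if value is not None:
            return value, hops
        frontier.extend(n for n in closer_nodes if n not in seen)
    return None, hops
```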
@devos50 once again sorry for the late response. Indeed, when running the same experiment multiple times, I do get different results. For instance, this is the result I get when running the same experiment again.
I've implemented the callback functionality here in the DHTModule, and the call itself in one of my fork's branches. I've also executed the experiment multiple times, and the time measurements generally seem to be lower than with the looping-call approach. Below, I've attached a few diagrams to show the new results.
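(A rough sketch of what such a callback-based measurement could look like; the actual hook lives in the author's fork, so every name below is hypothetical.)

```python
# Hypothetical callback-style measurement: instead of polling, the DHT
# notifies the experiment module the moment the entry is stored locally,
# so no polling interval is added to the measured time.
import time


class DHTDisseminationCallbacks(object):
    def __init__(self, log_path="dissemination_time.log"):
        self.log_path = log_path
        self.store_time = None

    def mark_store_time(self):
        # In Gumby, scenario steps run at the same scenario time on every
        # node, so each node can record the store moment locally.
        self.store_time = time.time()

    def on_entry_received(self, key):
        # Hypothetical hook invoked by the DHT when the entry lands on
        # this node; logs "key elapsed_ms" for later plotting.
        elapsed_ms = (time.time() - self.store_time) * 1000.0
        with open(self.log_path, "a") as log:
            log.write("%s %.2f\n" % (key.hex(), elapsed_ms))
```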
@qstokkink @DanGraur this job runs a basic DHT validation experiment on the DAS5: https://jenkins-ci.tribler.org/job/validation_experiments/job/validation_experiment_dht/. I suggest looking at this one.
Force-pushed from 5fdfd3e to 976baaa: …a data element to be distributed across the peers in a DHT community. A new local test configuration written specifically for this test case, as well as a new scenario. Also added a new R script which parses the data and generates a scatterplot (of the per-node dissemination times). The DHTModule class was also modified in order to be able to parse the generated per-node logs, and to accommodate annotations.
Force-pushed from 473c148 to 60bc59d: …also changed the scenario of the local version such that it now uses a for loop.
I've finally created a Jenkins project to run this experiment at a larger scale (100 nodes). Here's the link to it: https://jenkins-ci.tribler.org/job/dissemination_experiment_dht/. I think this PR is ready now.
@DanGraur what is the status of this PR?
I'm not sure if this will be merged in the short term. Will re-open if we work on additional DHT experiments 👍
Added a new test scenario which should capture the time it takes for a data element to be distributed across the peers in a DHT community. A new local test configuration was written specifically for this test case, as well as a new scenario. Also added a new R script which parses the data and generates a scatterplot (of the per-node dissemination times). The DHTModule class was also modified so that it can parse the generated per-node logs and accommodate annotations.
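(The repository's plotting script is written in R; below is merely a Python sketch of the same parse-and-scatter idea, with an assumed "node_id elapsed_ms" log format in which "NA" marks a node that never received the entry. The real log format and file names may differ.)

```python
# Python sketch of the R script's job: parse per-node dissemination logs
# and produce a scatterplot, with dashed vertical lines for nodes where
# the lookup never succeeded.
import matplotlib.pyplot as plt

nodes, times, missing = [], [], []
with open("dissemination_times.log") as f:
    for line in f:
        node_id, value = line.split()
        if value == "NA":            # entry never disseminated to this node
            missing.append(int(node_id))
        else:
            nodes.append(int(node_id))
            times.append(float(value))

plt.scatter(nodes, times)
for n in missing:
    plt.axvline(n, linestyle="--")   # dashed line: lookup failed
plt.xlabel("peer")
plt.ylabel("dissemination time (ms)")
plt.savefig("dissemination_scatter.png")
```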