msc placeholder: 5G overlay infrastructure for decentralised learning ??? #7258
Comments
@synctext registration is for the thesis, not for the literature survey, right? |
indeed, it's nice if you register your thesis as early as possible. |
Feel free to add a bit more content on reproducing state-of-the-art literature. The scientific problem of universal connectivity is not explained clearly. The storyline goes too fast: page 2 already has "port-restricted cone NAT". Take half a page for a tutorial on the concept of an incoming connection. Need structure! Section 5: Reproducing results from literature. EDIT: brainstorm about master thesis focus. Idea for title: "5G overlay infrastructure for edge-based decentralised learning". Context to sell your perfect_overlay effort. Only need a few weeks to build a minimal viable product of decentralised machine learning. Simply take this gossip-based ML algorithm and its running code. Goal: 100 actual nodes {mixed real ARM Android and x86 Kotlin}! |
|
MARE: "5G overlay infrastructure for decentralised learning" Update:
|
Goal: mechanism for one phone to help another phone to puncture their carrier-grade NAT.
|
Nearly done with the Lit Survey: [38] citations to forums and scientific papers. Great result to include. Research Assistant job: send 50 UDP packets and count how many arrive. Repeat for all SIM-card combinations. Test the performance of EVA; note that you can then quickly run out of your 100-ish MByte SIM data quota. Read Rahim's work on the binary transport protocol called EVA. See some example code: https://github.com/KoningR/eurotoken/blob/5c84348ba16dd9ce4b97e53ff52a5cefe9ee97c1/src/main/kotlin/evatest/EvaApplication.kt |
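The "send 50 UDP packets, count how many arrive" measurement can be sketched as below. This is a minimal illustration only: it runs sender and receiver on the loopback interface, and the function name `count_udp_arrivals`, the 4-byte sequence-number payload and the timeout are assumptions, not the code used in the thread. A real run would put the receiver behind one SIM card and the sender behind another.

```python
import socket

def count_udp_arrivals(n_packets=50, timeout_s=0.5):
    """Send n_packets numbered datagrams over loopback and count distinct arrivals."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    receiver.settimeout(timeout_s)
    target = receiver.getsockname()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Each datagram carries its sequence number so duplicates are not double counted.
        for seq in range(n_packets):
            sender.sendto(seq.to_bytes(4, "big"), target)
        seen = set()
        while len(seen) < n_packets:
            try:
                payload, _ = receiver.recvfrom(64)
            except socket.timeout:
                break                          # nothing more arrived within the timeout
            seen.add(int.from_bytes(payload, "big"))
        return len(seen)
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    arrived = count_udp_arrivals()
    print(f"{arrived}/50 packets arrived")
```

Repeating this per SIM-card pair and recording the ratio gives one cell of the connectivity matrix described above.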
Lyca uses symmetric NAT. The rest (Lebara, T-Mobile and Vodafone) could cross-communicate, while they all failed with Lyca (even Lyca-to-Lyca communication failed). Theoretically, Lyca-to-Lyca communication may be achieved with the birthday paradox. We need to determine the address and port predictability in order to understand how long it would take to penetrate the NAT and how long it would take for Lyca to block the requests. Willingness to travel (and I have accommodation maybe?)
Reason for traveling: live physical testing of 4G and 5G communications and procurement of SIM cards. Research assistantship ending 30/09/23! |
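To get a feel for the birthday-paradox estimate mentioned above, here is a small sketch. It assumes a simplified model that is not taken from the thread: one peer holds `open_ports` uniformly random external ports on its NAT, the other peer probes `probes` distinct ports chosen uniformly at random, and the port space has 65536 values. Real carrier NATs are unlikely to be uniform, so treat the numbers as a rough baseline.

```python
def hit_probability(open_ports, probes, port_space=65536):
    """Probability that at least one probe lands on an open port.

    Probes are distinct (drawn without replacement), so the chance that
    every probe misses is C(N - open, probes) / C(N, probes).
    """
    miss = 1.0
    for i in range(probes):
        miss *= max(port_space - open_ports - i, 0) / (port_space - i)
    return 1.0 - miss

if __name__ == "__main__":
    # Same number of open ports and probes on each side.
    for k in (64, 128, 256, 512):
        print(k, round(hit_probability(k, k), 3))
```

Under this model the success probability grows roughly as 1 - exp(-k²/N) when both sides use k ports, which is why a few hundred ports per side is the usual starting point.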
|
Final Literature Survey with the suggested improvements |
Comments on this latest survey:
|
Literature_Survey (1).pdf |
@synctext the birthday attack between a phone running on Vodafone 5G and an emulator running on eduroam WiFi worked and they managed to connect. It still needs optimization because it is heavy, but at least we know it works! More details in my Slack message. What's left to do:
|
Solid progress! Survey completed, now ready for Arxiv submission. Improve activity grid principle of status of each of the 25 connected IPv8 peers. Related IPFS work: https://github.com/plprobelab/network-measurements/blob/master/results/rfm15-nat-hole-punching.md |
THESIS TITLE (draft): First 5G deployment of Distributed Artificial Intelligence. Measure: UDP bandwidth, bottlenecks, timeouts on the Android client and NATs, connection reset time and port association time, all conditions that make successful communication possible, and a complete understanding of all factors that cause a communication failure. Determine if there is an upper bound to the number of concurrent IPs that a device can talk to (e.g. 63 works and adding a 64th may break the least recently used). Reliable data transfer: compare UDP and the EVA protocol in terms of effective throughput, packet loss, and congestion. Measure the exact NAT behaviour! Measure NAT hole opening time! I have 10 or 12 operational SIM cards. I have two phones, hence I can use 2 SIM cards at a time. |
update "This is brute forcing the public IP"{+port}, nice and sharp description somebody from Canada gave your work. |
SURVEY to be announced by Arxiv tomorrow
TODO:
Goal by Christmas:
|
Lit Survey is published: https://arxiv.org/abs/2311.04658 Edited to fix the broken reference link |
I HAVE CODE FOR: Measuring:
|
|
The Github repo of the research View data gathering progress in this google sheet |
The first results are that the success rate of the birthday attack is low and very dependent on the provider, as can be seen [here](https://docs.google.com/spreadsheets/d/1hmGZ38y3Cngt8hsbJbR7SoZpRnAUu7uKivV9ODkhKSs/edit?usp=sharing). I propose to gather data on the mapping of the NAT. A server listens for incoming packets from a phone and logs the return address:port, while the phone does the same (logging the address:port that it sent the packet from). The results can then be compared so we can reverse engineer the mapping function of each NAT. This can be used to reduce the collision space (now 65535^2). According to RFC 4787, NAT port mapping can behave differently on different port ranges, hence identifying the "convenient" ranges for each carrier will allow us to reduce the collision space and increase the connectivity rate! |
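The comparison step proposed above (phone-side log versus server-side log) could look roughly like this. It is a sketch under assumptions: the sample format `(local_port, external_port)`, the function name `summarize_mapping` and the example numbers are all hypothetical and not taken from the gathered data.

```python
def summarize_mapping(samples):
    """Summarize NAT behaviour from (local_port, external_port) pairs in send order.

    local_port is what the phone logged; external_port is what the server saw.
    """
    if not samples:
        raise ValueError("need at least one sample")
    externals = [ext for _, ext in samples]
    preserved = sum(1 for local, ext in samples if local == ext)
    # Differences between consecutive external ports hint at sequential allocation.
    deltas = [b - a for a, b in zip(externals, externals[1:])]
    return {
        "port_preserving_fraction": preserved / len(samples),
        "external_port_range": (min(externals), max(externals)),
        "consecutive_deltas": deltas,
    }

if __name__ == "__main__":
    # Made-up example: a NAT that ignores the local port and allocates upwards.
    fake_log = [(50000, 35330), (50001, 35331), (50002, 35333)]
    print(summarize_mapping(fake_log))
```

A port-preserving fraction near 1 would suggest the NAT keeps the local port, while small constant deltas would suggest sequential allocation that a biased attack could exploit.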
Idea of a "biased birthday attack" if you know the port-range, behaviour of used 4G/5G provider, or even the mapping function itself (trivial |
Update on data gathering: the Android app that will spam the server is ready. The server was very hard to build because of the 65k simultaneous processes. I managed to run 30k ports successfully yesterday; going beyond 30k throws an exception because the machine, which has 16GB RAM, runs out of memory. I emailed Sandip yesterday to ask for a 64GB server and am still waiting for a reply. My idea is to adapt the birthday attack based on the gathered data, hoping to improve it. That repo will then become a generic public birthday-attack library for Android connectivity, published as a Gradle dependency. There is no plan to use IPv8 as a dependency.
|
@Apple1D Indeed, strong authentication and identity management stuff is ignored by computer science for too long. Also no industry support, as it's not a golden money maker. Governments have decades of failures and many losses trying to craft this societal infrastructure. See our scientific analysis: https://arxiv.org/pdf/2401.05239.pdf |
Please keep track of your planning. In Feb 2023 we need to do your master thesis progress moment
|
Updates on Lebara research: A single run looks like this: traffic from any port maps to some specific "buckets" or ranges of ports, as shown. The problem is that these buckets are not consistent across runs; they change based on port timeouts and the number of requests, and even that relation is not consistent (after analysis). What we know for sure:
Breakdown follows. Ranges of ports that were never mapped: [(0, 1023), (19200, 19711), (40959, 41471), (49408, 49919), (60918, 65535)]. Frequently observed ranges and their percentage frequency: [35328, 35583] -> 0.4762. I chose a 35% probability for all other ports and created a function.
This function takes 8.606910705566406e-05 seconds (about 86 microseconds) to run, so it is fast enough not to interfere with the app's speed (and it should be faster once ported to Kotlin). I will continue to analyze and try to find more relations for now. Unfortunately, the "seed port" choice seems pretty random. If anyone knows a data scientist or mathematician who can help, that would be great, because this is getting outside my area of knowledge. I would also like @synctext's insights on any machine learning or statistical approaches; at the moment I only do a random weighted choice on a range. Fallback mechanism proposal: since it is now established that the NAT picks a random "seed port" and then increments linearly, I want to test whether the linear incrementation is affected by the IP of the receiver, i.e. whether each new receiver forces a new random seed port. If not, we can utilize a STUN-like server to log the initial seed port, and then the other peer will have a starting point. |
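The "random weighted choice on a range" described above can be sketched as follows. This is not the function from the comment: it reuses the never-mapped ranges and the one frequently observed range that survive in the text, and the 0.65/0.35 split mirrors the 35% figure for "all other ports" under the assumption that the listed range is the only frequent one. The names `complement` and `make_sampler` are hypothetical.

```python
import random

# Ranges quoted in the comment above (inclusive bounds).
NEVER_MAPPED = [(0, 1023), (19200, 19711), (40959, 41471),
                (49408, 49919), (60918, 65535)]
HOT_RANGE = (35328, 35583)

def complement(excluded, lo=0, hi=65535):
    """Return the inclusive ranges in [lo, hi] not covered by `excluded`."""
    free, cursor = [], lo
    for start, end in sorted(excluded):
        if start > cursor:
            free.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= hi:
        free.append((cursor, hi))
    return free

def make_sampler(hot_weight=0.65, rng=None):
    """Build a port sampler: hot range with hot_weight, the rest spread by size."""
    rng = rng or random.Random()
    other = complement(NEVER_MAPPED + [HOT_RANGE])
    sizes = [end - start + 1 for start, end in other]
    total = sum(sizes)
    ranges = [HOT_RANGE] + other
    # Never-mapped ranges get no weight at all, so they are never guessed.
    weights = [hot_weight] + [(1 - hot_weight) * s / total for s in sizes]

    def sample():
        start, end = rng.choices(ranges, weights=weights, k=1)[0]
        return rng.randint(start, end)

    return sample

if __name__ == "__main__":
    sample = make_sampler(rng=random.Random(1))
    print([sample() for _ in range(5)])
```

Skipping the never-mapped ranges alone shrinks the guess space, and the weights can be refit per carrier as more runs are logged.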
Solid progress! ADDED: 89 characters of base-11?! Mobile networking in rural Ethiopia! by Ben Kuhn. On youtube |
|
|
|
Updates 18/03
Roaming update: There are SIM cards for which virtually nothing changes when you leave home, because your traffic is tunnelled home. Lyca NL, Lebara NL, MTN CY, and Lebara FR were tested and do change IP while roaming. Check whether a roaming SIM behaves the same as the partner network (open research question). Server right now:
|
update idea to use more external IPv4 addresses on your server. That means expanding your testing infrastructure with probing from multiple addresses. Can you start measuring for a while from 1 address and predict what the other address will see as port mapping? {hope this is understandable}.
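One way to phrase the prediction question above in code: measure the external ports seen from one server address, estimate the increment, and guess the next port. The sketch assumes the "random seed port, then linear increments" behaviour reported earlier in the thread; whether a second destination address continues the same sequence is exactly the open question, and the function name `predict_next_port` is hypothetical. Wrap-around at the top of the port range is not handled.

```python
from collections import Counter

def predict_next_port(observed_ports):
    """Guess the next external port from ports observed in arrival order.

    Uses the most common difference between consecutive observations as the
    step, so an occasional jump (e.g. a new seed port) does not skew it.
    """
    if len(observed_ports) < 2:
        raise ValueError("need at least two observations")
    deltas = [b - a for a, b in zip(observed_ports, observed_ports[1:])]
    step, _ = Counter(deltas).most_common(1)[0]
    return observed_ports[-1] + step

if __name__ == "__main__":
    # Made-up observations from a single server address.
    print(predict_next_port([40000, 40001, 40002, 40003]))
```

Comparing this guess against what the second address actually logs would show directly whether the seed is shared across destinations.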
|
Gathered Belgian and Norwegian data this week and fixed the bugs in the server that were causing it to crash. Updated the paper with some changes to the measurements used and the data gathered. I believe I now have enough SIM cards in my possession, so I'll focus on analyzing the results of this sprint. Todo:
First_5G_deployment_of_Distributed_Artificial_Intelligence.pdf Planning to charter a private Piper Aircraft soon to do a sim card run in another EU member state |
|
Updates last sprint:
Updated thesis:
Next Sprint:
|
|
Thesis defense target: 21 June 2024. Survey target: end of July 2023.
Would like to have a fresh master thesis topic, not incremental improvement of other thesis work.
Starting roughly Q1 2023 or summer of 2023, flexible. update: starting lit. survey 2nd May
update 2: literature survey finished: 3 oct 2023.
RTOS expertise. AWS. Dream of contributing to The Linux Kernel. Byte-level stuff OK, even assembly person in the age of Javascript :-) Like to use machine learning, but not invent new ML stuff or central focus of thesis (no unsupervised learning, no online learning). Thus more ML that is: adversarial, byzantine, decentralised, personalised, local-first AI, edge-devices only, low-power hardware accelerated. Prefer to utilise advanced algorithms msc course knowledge.
Possible brainstorm starting idea: start building the fastest machine learning based on hardware acceleration. First step is get the hardware running fast, stepwise modify algorithms and tweak towards machine learning for learn-to-rank, learn-through-consumption, or even learn-about-trust (reputation graph, work graph, MeritRank inspired etc). Promised phones to test.
https://rct.doj.ca.gov/Verification/Web/Download.aspx?saveas=560291.pdf&document_id=09027b8f803a8976 [source]
Pure P2P networking for 5G. Second direction is building the world-first overlay network exclusively for mobile devices. No PC, laptop or server allowed. Related: NAT puncturing infrastructure #2754 plus practical work to get 256 reliable neighbors: msc placeholder: daos, scams, FROST, message drop, something #7074 (comment)
literature survey: read everything about carrier-grade NAT and think of the 5G context. Prior 2019 work: Universal communication using imperfect hardware #4827. It gave us IPv8-Kotlin; your work is to fix the final issues and make it the workhorse for the future Internet (in a survey?). NAT puncturing, birthday paradox. See also the binary transfer protocol, EVA issues. Nothing ambitious 😲
literature survey example from prior students