Use a fixed destination port in UDP mode (paris-traceroute) by default? #117

Open
zorun opened this issue Jun 9, 2016 · 5 comments

@zorun

zorun commented Jun 9, 2016

In UDP mode, the default is to use the destination port to encode the sequence number of each probe. But it causes issues with load-balancers, since they often hash the packet header (src IP, dst IP, protocol, src port, dst port) to choose a path.
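To make the load-balancer problem concrete, here is a toy sketch in Python. It assumes a per-flow balancer that hashes the 5-tuple with CRC32 and picks one of four equal paths; real devices use their own hash functions, and the addresses, ports and `pick_path` helper are invented for illustration. With classic probing (destination port = base + sequence number) the probes typically spread over several paths, while fixed ports keep every probe on a single path.

```python
import zlib

def pick_path(src_ip, dst_ip, proto, src_port, dst_port, n_paths):
    """Toy per-flow load balancer: hash the 5-tuple and pick one of n_paths.

    CRC32 is only a stand-in; real load balancers use vendor-specific hashes.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_paths

# Hypothetical values (documentation address ranges, arbitrary ports).
SRC, DST, UDP = "192.0.2.1", "198.51.100.7", 17
SRC_PORT, BASE_PORT = 33666, 33434
N_PATHS, N_PROBES = 4, 32

# Classic UDP probing: the destination port encodes the sequence number,
# so every probe presents a different 5-tuple to the load balancer.
classic_paths = {
    pick_path(SRC, DST, UDP, SRC_PORT, BASE_PORT + seq, N_PATHS)
    for seq in range(N_PROBES)
}

# Paris-style probing: both ports stay fixed, so the 5-tuple never changes.
paris_paths = {
    pick_path(SRC, DST, UDP, SRC_PORT, BASE_PORT, N_PATHS)
    for _ in range(N_PROBES)
}

print("classic probes used paths:", sorted(classic_paths))
print("paris probes used paths:  ", sorted(paris_paths))
```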

There's something called Paris-traceroute that fixes that: https://www.paris-traceroute.net/about
This feature has already been implemented in mtr, by aaccfc0 and following commits.

Now here is the problem: this Paris-traceroute feature is only used when a specific destination port is used (e.g. -P 80).

I would like to see that become the default in UDP mode. If no destination port is specified (-P option), it can either use a fixed destination port, or a random destination port. The fixed port means that the traceroute is reproducible; the random port means that multiple runs of mtr towards the same destination would discover multiple load-balanced paths. I'm not sure which choice is the most relevant.

@russor what do you think?

@rewolff
Collaborator

rewolff commented Jun 9, 2016

My vote is to pick a new random one on every invocation. Things that ARE random should be discoverable. Suppose one of the "load sharing" routes is "bad". If the load balancer and mtr's default port number happen to send the packets over one of the working routes, mtr will never "see" the problem. By choosing a "random" port number at startup, at least you have a chance of seeing the problem.

("Look boss, I found the problem... Here, see these mtr traces... Hey, now it works...", where of course Murphy's law made sure that the network tech's first trace showed the problem. Anyway, in such a situation the problem is discoverable, whereas with a static "random" port it is not. With a static port, for example, the hash would cause it to work from some source IPs but not others, which points in the wrong direction.)
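As a rough back-of-the-envelope illustration of that "chance of seeing the problem", the sketch below assumes the load balancer spreads flows uniformly over `n_paths` equal paths, exactly one of which is bad, and that each mtr run picks its port independently. None of these assumptions comes from mtr itself; the function name is made up for the example.

```python
def p_bad_path_seen(n_paths: int, n_runs: int) -> float:
    """Chance that at least one of n_runs independent runs hits the one bad path.

    Assumes uniform hashing over n_paths and a fresh random port per run.
    """
    return 1.0 - (1.0 - 1.0 / n_paths) ** n_runs

# With a fixed default port the outcome never changes between runs;
# with a random port per run the odds of hitting the bad path grow with each run.
for runs in (1, 2, 5, 10):
    print(f"{runs:2d} run(s) over 4 paths: {p_bad_path_seen(4, runs):.3f}")
```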

@zorun
Author

zorun commented Jun 9, 2016

Yes, this makes sense. But you could also argue that by making mtr's behaviour unpredictable, it's even more confusing for the user :)

Btw, I forgot that the UDP source port is already selected at random by the OS, so there's not much point in fixing the destination port.

Anyway, whatever the default choice, for "serious" investigation, you would explicitly select the source and destination ports, and increment one or the other to discover all paths.

@rewolff
Collaborator

rewolff commented Jun 9, 2016

But "users", which in this case are "sysops", will get complaints like: "that site is so slow today, go and fix it for me". It is annoying when mtr, the diagnostic tool, is then unable to diagnose the problem.

If the source port is often hashed along, that doesn't mean it always happens. Likely, but maybe not always. (You need just one "load distribution machine" company somewhere that has had a complaint of a case where hashing the source caused problems. )

I much prefer that mtr then reproduces the "sometimes it works, sometimes it doesn't" that the actual users are seeing. Once the sysop determines "Weird, sometimes it works, sometimes it doesn't", it is his task to find the cause of that: fixing the UDP ports (source and dest) will make it "consistent" across invocations of mtr, leading him towards the correct conclusion.

@russor

russor commented Jun 9, 2016

I would prefer not to change the defaults -- traditional and Parisian UDP probing both have merit, and it's not a great user experience for the defaults to change in a drastic way. I think the current behavior, which leads to multiple intermediate hosts per hop being shown, lets people know something is going on so they can try to figure it out. Maybe some hints in the man page would be in order?

I had experimented with displaying additional data per hop when there were multiple hosts: a total line, and then individual received count and ping stats. But being able to fix the ports (with some external logic to find ports of interest) and run multiple mtr instances simultaneously was more useful for pinpointing loss. I would also love a mode where paths were probed and then stats for the end-to-end latency/loss were shown per path, but that's out of my league :)

In some cases, Parisian probing will result in significantly less data as well:

Some people may have firewall rules in place to permit the traditional udp probes with traditional ports, which may not be satisfied by new style probing.

The probe packet may have been mangled in transit, altering the checksum, so that when the ICMP reply comes back, the checksum doesn't match what mtr is expecting. Since submitting my patches, I've seen this when the probing machine is behind a NAT, but it might happen for other reasons as well. I've thought about moving the counter into the payload, although we're not guaranteed to get any of the UDP payload back in the ICMP.
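The checksum issue can be sketched as follows, assuming (as the comment implies) that the probe's sequence number is carried in the UDP checksum and that a two-byte payload is chosen to force the checksum to that value. This is an illustrative Python sketch, not mtr's actual C code; the helper names, addresses and ports are made up, and for simplicity the target is restricted to 1..0xFFFE.

```python
import socket
import struct

def ones_add(a: int, b: int) -> int:
    """16-bit ones' complement addition (end-around carry)."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)

def ones_sum(data: bytes) -> int:
    """Ones' complement sum of the 16-bit big-endian words in data."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total = ones_add(total, word)
    return total

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    """UDP checksum over the IPv4 pseudo-header, UDP header and payload."""
    length = 8 + len(payload)
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return ~ones_sum(pseudo + header + payload) & 0xFFFF

def payload_for_checksum(src_ip, dst_ip, src_port, dst_port, target: int) -> bytes:
    """Pick a 2-byte payload so that the UDP checksum equals target."""
    if not 1 <= target <= 0xFFFE:
        raise ValueError("target must be in 1..0xFFFE for this sketch")
    # Checksum with a zero payload word is the complement of the fixed part's sum.
    base = udp_checksum(src_ip, dst_ip, src_port, dst_port, b"\x00\x00")
    word = ones_add(~target & 0xFFFF, base)
    return struct.pack("!H", word)

# Hypothetical probe: sequence number 1234 encoded in the checksum.
seq = 1234
payload = payload_for_checksum("192.0.2.1", "198.51.100.7", 33666, 33434, seq)
sent = udp_checksum("192.0.2.1", "198.51.100.7", 33666, 33434, payload)

# A NAT that rewrites the source address and recomputes the checksum
# changes the value quoted back in the ICMP error, so it no longer equals seq.
after_nat = udp_checksum("203.0.113.9", "198.51.100.7", 33666, 33434, payload)
print(sent == seq, after_nat == seq)
```

The ports and addresses stay fixed across probes, which is what keeps the flow on one path; only the payload word and the resulting checksum change per probe.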

@jorhett

jorhett commented Dec 29, 2020

I just want to voice my support for not changing the defaults. However, one thing that might help those who want it simpler would be an easy flag that enables paris mode, as only us network geeks will look at those options and just go "oh, I know how to use this":

sudo mtr galaxy.ansible.com --udp --port=33434 --localport=33666

So an option like --fixed-udp-ports or --paris-this-thang 😛 might make it easier for people
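Until such a flag exists, the "external logic" mentioned earlier in the thread could be as small as a script that generates one fixed-port invocation per source port. The sketch below only builds the command lines, using the `--udp`, `--port` and `--localport` options exactly as they appear in the command above; the function name and the default port range are assumptions, and whether these options are available depends on the installed mtr version.

```python
def paris_mtr_commands(host, dst_port=33434, first_local_port=33666, count=4):
    """Build mtr argument lists with fixed UDP ports, one per local port.

    Each command keeps its 5-tuple constant, so each run should follow a
    single load-balanced path; stepping the local port explores other paths.
    """
    return [
        ["mtr", host, "--udp", f"--port={dst_port}",
         f"--localport={first_local_port + i}"]
        for i in range(count)
    ]

# Print the commands rather than running them (mtr may need root privileges).
for argv in paris_mtr_commands("galaxy.ansible.com"):
    print(" ".join(argv))
```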
