Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some sites break because of req.url being absolute #529

Closed
fabiosantoscode opened this issue Dec 2, 2013 · 15 comments
Closed

Some sites break because of req.url being absolute #529

fabiosantoscode opened this issue Dec 2, 2013 · 15 comments

Comments

@fabiosantoscode
Copy link

According to the HTTP standard, a request's URI (the second field in the first line of an HTTP request, after the verb, and which gets into req.url) may be relative or absolute.

A request to a proxy must always have an absolute URI, and indeed when we receive a request, req.url is always absolute. We just forward the request in proxyRequest and copy the URI into the new request.

The receiving end of our new request will in turn get an absolute URI. This behaviour, however standard, is breaking some sites which naively expect the path to be relative to the hostname (starting with a "/"). They are wrong, but still node-http-proxy could be "fixed" so as to not break this assumption of theirs.

By placing "req.url = req.url.replace(/.?:https://.?//, '/')" (an ugly regexp, granted) before calling proxyRequest I was able to conform to the naiveté of Wordpress and other stuff which is out there being used and does not implement the standard.

Thanks for building this nice, flexible proxy.

@fabiosantoscode
Copy link
Author

Here is my changeset to my front-end dev tool using node-http-proxy which fixes this assumption.

fabiosantoscode/magicProxy@d8118e3

@boutell
Copy link

boutell commented Dec 10, 2013

This is also the cause of incompatibility with socket.io.

@boutell
Copy link

boutell commented Dec 10, 2013

Add "all express sites using the express.sessions middleware," also known as "all sites using the connect sessions middleware."

if (0 != req.originalUrl.indexOf(cookie.path || '/')) return next();

That is trying to detect a leading / in the default configuration. It'll never happen with an absolute URL.

There are just too many commonly used frameworks, and therefore sites, that don't work with this behavior.

@boutell
Copy link

boutell commented Dec 10, 2013

@fabiosantoscode Thanks for that simple solution, it should have occurred to me since I'm doing custom proxying with proxy.web too.

Here's a better regexp that should not experience false positives:

req.url = req.url.replace(/^\w+:https://.*?//, '/');

@jcrugzz
Copy link
Contributor

jcrugzz commented Dec 10, 2013

@boutell and @fabiosantoscode if you haven't already, check out the caronte branch as that will be node-http-proxy 1.0 in the near future.

@boutell
Copy link

boutell commented Dec 10, 2013

Jarrett, I'm experiencing this with the caronte branch. It is a serious
enough bug (and easy enough to fix) to be worth holding 1.0 just a minute
longer I think.

On Tue, Dec 10, 2013 at 12:36 AM, Jarrett Cruger
[email protected]:

@boutell https://github.com/boutell and @fabiosantoscodehttps://github.com/fabiosantoscodeif you haven't already, check out the
caronte https://github.com/nodejitsu/node-http-proxy/tree/carontebranch as that will be
node-http-proxy 1.0 in the near future.


Reply to this email directly or view it on GitHubhttps://github.com//issues/529#issuecomment-30201042
.

Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com

@fabiosantoscode
Copy link
Author

Although it is indeed serious, I don't think it is trivial to fix. After
all, this proxy can forward the request to another proxy (and in that case
should send the absolute URL) or to a server (and in that case not send the
absolute URL, to fix their assumptions).

I will try out the caronte branch.

@boutell
Copy link

boutell commented Dec 10, 2013

That is true, there needs to be appropriate behavior based on whether this
is a forward to another proxy or not, and that means proxy.web has to be
told that, and right now there's no provision to tell it so. So it's an API
change.

Trivial or not though, I don't think it makes sense to ship a "1.0" version
of a proxy that demonstrably will fail with Express sessions, socket.io and
lots of non-node technologies in frequently encountered combinations.
Better to hold up and address it.

On Tue, Dec 10, 2013 at 7:28 AM, Fábio Santos [email protected]:

Although it is indeed serious, I don't think it is trivial to fix. After
all, this proxy can forward the request to another proxy (and in that case
should send the absolute URL) or to a server (and in that case not send the
absolute URL, to fix their assumptions).

I will try out the caronte branch.


Reply to this email directly or view it on GitHubhttps://github.com//issues/529#issuecomment-30221465
.

Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com

@jcrugzz
Copy link
Contributor

jcrugzz commented Dec 10, 2013

@boutell that was not my implication, I just wanted to make sure you guys were trying the newest code :). Im sure @yawnt will be on it. If you can post a gist of the smallest reproducible case, this will be extremely helpful in developing a test and figuring out a solution!

@fabiosantoscode
Copy link
Author

Smallest reproducible case:

https://gist.github.com/fabiosantoscode/349a8d29d3bdf109c78c

It's unfortunate that the API will be one option more complicated just because there's too much stuff not implementing this correctly.

We should report this problem when we see it in the real world.

@boutell
Copy link

boutell commented Dec 10, 2013

Real world cases cited here so far:

Wordpress
Express/Connect session middleware (many node-powered sites)
socket.io's built-in asset server

This is a duplicate of #416 which cites pages on dailymotion.

Re: the spec, that bit says in full:

"To allow for transition to absoluteURIs in all requests in future versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI form in requests, even though HTTP/1.1 clients will only generate them in requests to proxies."

Emphasis on that last bit. The servers that don't like the absolute URI are most in the wrong here, but we're not doing all that hot either because we're generating absolute URIs when not talking to (another) proxy.

yawnt added a commit that referenced this issue Dec 18, 2013
@yawnt
Copy link
Contributor

yawnt commented Dec 18, 2013

apologies for the delay, should be fixed in 9e74a63

@fabiosantoscode i used the test case you provided and now it's correctly sending "/" instead of the full path :)

@yawnt yawnt closed this as completed Dec 18, 2013
@boutell
Copy link

boutell commented Dec 18, 2013

With this closed, do we need to open a new issue on support for being downstream from another proxy? That has the opposite issue: an absolute URL is required.

@yawnt
Copy link
Contributor

yawnt commented Dec 18, 2013

i would be happy to accept a pull request regarding that, or a test case which outlines the issue :)

@boutell
Copy link

boutell commented Dec 18, 2013

It doesn't affect me at all personally, but it seemed only honest to point
it out (:

On Wed, Dec 18, 2013 at 10:54 AM, yawnt [email protected] wrote:

i would be happy to accept a pull request regarding that, or a test case
which outlines the issue :)


Reply to this email directly or view it on GitHubhttps://github.com//issues/529#issuecomment-30852747
.

Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants