
[fetch] Omit Content-Length and Content-Encoding headers for compressed responses #2514

Open
Skn0tt opened this issue Dec 8, 2023 · 12 comments
Labels
enhancement New feature or request

Comments

@Skn0tt
Contributor

Skn0tt commented Dec 8, 2023

This would solve...

TL;DR: Removing Content-Length and Content-Encoding headers when decompressing responses makes proxying a lot easier, but is probably a breaking change.

When fetching a compressed resource, one receives a Response object with:

  • a decompressed body
  • a content-length header containing the length of the compressed representation
  • a content-encoding header that says the response is encoded

The headers claim that the body is compressed, but the body has already been decompressed!

When proxying this response back to a client without adapting the headers, the body stream ends up longer than the content-length header promises, and an invalid HTTP message is sent. HTTP clients like cURL throw an error on this.
The same is true for content-encoding: clients will try to decompress the body, but it isn't compressed.

Expand to see a repro 🐱

This server proxies a gzip-compressed SVG logo.

import { createServer } from "node:http";
import { pipeline, Readable } from "node:stream";

createServer(async (req, res) => {
  const resp = await fetch("https://www.gatsbyjs.com/Gatsby-Logo.svg");
  const headers = new Headers(resp.headers);

  // 👇 workaround 👇
  // if (headers.has("content-encoding")) {
  //   headers.delete("content-encoding");
  //   headers.delete("content-length");
  // }

  res.writeHead(resp.status, [...headers.entries()]);

  pipeline(
    Readable.fromWeb(resp.body),
    res,
    (err) => err && console.error(err)
  );
}).listen(3000);

Requesting it with cURL without the workaround results in an "Excess found in a read" error, and the connection is closed before the full SVG is transmitted.

After adding the workaround, it works fine.


As can be seen in the repro above, one workaround is to remove the problematic headers from the response. This reconciles the response, because the uncompressed .body is now accompanied by headers that correctly describe it as uncompressed.

Having this check in business logic is cumbersome; it should live in library code. Ideally, it would live in frameworks, but there's a problem: based on nothing but a Response object, it's impossible to differentiate a response produced by fetch (content-encoding header, but uncompressed body) from a compressed Response created in user code (same content-encoding header, but compressed body). So without further additions to the Response class, it's impossible to perform this reconciliation after the Response is created.

Instead, this should live within fetch.

The implementation should look like...

Add headersList.delete("content-length"); headersList.delete("content-encoding") around here:

undici/lib/fetch/index.js

Lines 2158 to 2178 in b918b7a

for (let i = 0; i < codings.length; ++i) {
  const coding = codings[i]
  // https://www.rfc-editor.org/rfc/rfc9112.html#section-7.2
  if (coding === 'x-gzip' || coding === 'gzip') {
    decoders.push(zlib.createGunzip({
      // Be less strict when decoding compressed responses, since sometimes
      // servers send slightly invalid responses that are still accepted
      // by common browsers.
      // Always using Z_SYNC_FLUSH is what cURL does.
      flush: zlib.constants.Z_SYNC_FLUSH,
      finishFlush: zlib.constants.Z_SYNC_FLUSH
    }))
  } else if (coding === 'deflate') {
    decoders.push(zlib.createInflate())
  } else if (coding === 'br') {
    decoders.push(zlib.createBrotliDecompress())
  } else {
    decoders.length = 0
    break
  }
}
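
A minimal sketch of what that could look like, assuming the deletions are applied right after the codings loop and that headersList and decoders are the variables in scope in the snippet above (not a tested patch):

if (decoders.length > 0) {
  // The body handed to the consumer is decompressed, so these headers would
  // otherwise describe a representation that no longer exists.
  headersList.delete('content-length')
  headersList.delete('content-encoding')
}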

This is probably a breaking change for some use cases, so it should be released behind a flag and rolled out in a major release.

I have also considered...

There's the userland workaround I mentioned above, but it has to live within framework code.

Alternatively, API frameworks could monkey-patch the global fetch, but this is ugly for all sorts of reasons.

The best solution would be to add a Response#compressedBody field that contains the raw body as sent over the wire, without going through decompression. In a proxy use case, this means both decompression and compression can be skipped whenever the upstream service supports a compression mechanism accepted by the user-agent. This would be a much bigger API change, though. One of the Undici maintainers opened whatwg/fetch#1524 about this a year ago.
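
To illustrate the proxy use case, here's a hypothetical sketch; note that compressedBody does not exist in undici or the Fetch spec today, it's only what whatwg/fetch#1524 gestures at:

import { createServer } from "node:http";
import { pipeline, Readable } from "node:stream";

createServer(async (req, res) => {
  const resp = await fetch("https://www.gatsbyjs.com/Gatsby-Logo.svg");

  // With a raw-body accessor, the upstream content-length and content-encoding
  // headers would still describe the bytes being forwarded, so no header
  // surgery and no decompress/recompress round trip would be needed.
  res.writeHead(resp.status, [...resp.headers.entries()]);

  pipeline(
    Readable.fromWeb(resp.compressedBody), // 👈 hypothetical field
    res,
    (err) => err && console.error(err)
  );
}).listen(3000);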

Additional context

In the Fetch Spec, it's explicitly called out that the current approach makes Content-Length unreliable:

This makes the Content-Length header unreliable to the extent that it was reliable to begin with.
~ Fetch Spec, 16.1.1.1.4

I'm unsure about how to interpret this, but it could be seen as a reason to drop the Content-Length header entirely. If it's unreliable, why include it?

Even if the proposed behaviour is not fully in line with the spec, I think the proxying use case is a good enough reason to deviate from it, similar to the other deviations mentioned in https://github.com/nodejs/undici#specification-compliance.

The proposed behaviour is what Deno implements as well.

@Skn0tt Skn0tt added the enhancement New feature or request label Dec 8, 2023
@Skn0tt
Contributor Author

Skn0tt commented Dec 8, 2023

This issue is related to #1462, but slightly different.

It's getting more relevant as frameworks like Next.js and hosting providers like Netlify move towards Request/Response-based API handlers.

In Next.js, for example, the route handler below looks innocent but will break in production because of this issue.

export function GET(req: Request) {
  return fetch("https://www.gatsbyjs.com/Gatsby-Logo.svg")
}
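
A sketch of applying the header workaround by hand inside such a handler (the same deletion as in the repro above, nothing framework-provided; shown without type annotations):

export async function GET(req) {
  const resp = await fetch("https://www.gatsbyjs.com/Gatsby-Logo.svg")
  const headers = new Headers(resp.headers)

  // fetch already decompressed the body, so drop the headers that still
  // describe the compressed representation.
  if (headers.has("content-encoding")) {
    headers.delete("content-encoding")
    headers.delete("content-length")
  }

  return new Response(resp.body, { status: resp.status, headers })
}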

@ronag
Member

ronag commented Dec 8, 2023

What does the spec say? This might be more of a WinterCG issue?

@Skn0tt
Contributor Author

Skn0tt commented Dec 8, 2023

As I outlined under "Additional Context", from my reading of the spec, it doesn't specify the behaviour around this. WinterCG's fork of the spec also doesn't change the respective section.
If you believe this should go through WinterCG first, I'm happy to present the case to them.

@KhafraDev
Member

I understand the issue, but I don't think changing this is worth it considering there's a very easy workaround, which you mentioned. The spec doesn't tell us to remove these headers, which makes sense because proxying isn't a concern for browser fetch. There's no solution that keeps us within the bounds of the spec, and therefore consistent with other platforms.

@Skn0tt
Contributor Author

Skn0tt commented Dec 8, 2023

I wouldn't consider the workaround easy. Compression is not top-of-mind for most developers; it's typically left to frameworks and hosting providers. As outlined above, though, they can't apply the workaround on their end.

Could you elaborate on the browser fetch argument? My understanding is that undici is designed for use in Node.js, where proxying is a concern, not for other platforms.

@KhafraDev
Member

undici is, fetch isn't.

@KhafraDev
Member

If you only want to support Node, I'd recommend undici.request, which doesn't decompress the body for you.
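
For illustration, a minimal sketch of that approach (undici.request passes the body through as received, so content-length and content-encoding stay consistent with it):

import { createServer } from "node:http";
import { pipeline } from "node:stream";
import { request } from "undici";

createServer(async (req, res) => {
  const { statusCode, headers, body } = await request(
    "https://www.gatsbyjs.com/Gatsby-Logo.svg"
  );

  // No decompression happened, so the upstream headers can be forwarded as-is.
  res.writeHead(statusCode, headers);

  pipeline(body, res, (err) => err && console.error(err));
}).listen(3000);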

@Skn0tt
Contributor Author

Skn0tt commented Dec 8, 2023

What I'm reading from your messages is that removing the content-length and content-encoding headers would not be compliant with the fetch spec, so undici can't implement it. Is that reading correct?

If we brought this case to WinterCG and they ended up changing their fork of the fetch spec, would undici follow that?

@KhafraDev
Member

  • Yeah, mostly. The few places we do diverge from the spec have caused a bunch of issues for us already. Extending the spec isn't something we should do.
  • 🤷 If there's a non-invasive proposal for getting the raw body, I wouldn't be against it. I'd have to see it before giving a definitive answer though.

@ronag
Member

ronag commented Dec 8, 2023

We try to follow WinterCG

@Skn0tt
Contributor Author

Skn0tt commented Dec 8, 2023

Sounds good! I'll try to get the thoughts of WinterCG on this.

@KhafraDev
Member

I don't think we should give a definitive answer on that... there hasn't been a case where we explicitly changed undici to conform with wintercg.

While at some point in the not too distant future WinterCG may define formal "compliance guidelines" for specs such as the Minimum Common API, it does not do so currently, which means there's no official definition of "WinterCG-Compliance". And even if there were, it would be entirely opt-in/optional for any particular runtime. This means effectively that displaying the logo on the website could never be interpreted as meaning "Node.js is expected to be in compliance with any WinterCG document or specification". I think this is important to clarify as I want to avoid any confusion.

nodejs/node#50668
