Can headers ordering be lessened? #189

Closed
youennf opened this issue Jan 5, 2016 · 21 comments

Comments

@youennf
Collaborator

youennf commented Jan 5, 2016

Headers currently stores header name/value pairs as an ordered list.
If the ordering requirement is relaxed so that order is only preserved among headers that share the same name, a more efficient structure (an ordered multimap) could be used.

At least outside service workers, there seems to be little value in keeping that strong ordering.

This is a follow-up of discussion from #154
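
To make the two candidate structures concrete, here is a minimal sketch of both. The class names `HeaderList` and `HeaderMultimap` are hypothetical and not taken from the spec or from any engine, and iterating the multimap in sorted name order is an assumption about how "order only within a name" might be exposed.

```typescript
type HeaderPair = [string, string];

// Model A: a single ordered list that preserves global insertion order.
class HeaderList {
  private pairs: HeaderPair[] = [];

  append(name: string, value: string): void {
    this.pairs.push([name.toLowerCase(), value]);
  }

  // Linear scan: first value stored under `name`, or null.
  get(name: string): string | null {
    const key = name.toLowerCase();
    const hit = this.pairs.find(([n]) => n === key);
    return hit ? hit[1] : null;
  }

  entries(): HeaderPair[] {
    return this.pairs.map(([n, v]): HeaderPair => [n, v]);
  }
}

// Model B: a multimap that only keeps order among values of the same name.
class HeaderMultimap {
  private byName = new Map<string, string[]>();

  append(name: string, value: string): void {
    const key = name.toLowerCase();
    const values = this.byName.get(key);
    if (values) values.push(value);
    else this.byName.set(key, [value]);
  }

  get(name: string): string | null {
    return this.byName.get(name.toLowerCase())?.[0] ?? null;
  }

  // Assumption: names are visited in sorted order, so the result
  // does not depend on insertion order across different names.
  entries(): HeaderPair[] {
    const out: HeaderPair[] = [];
    for (const key of Array.from(this.byName.keys()).sort()) {
      for (const v of this.byName.get(key) ?? []) out.push([key, v]);
    }
    return out;
  }
}
```

Appending `B: 1`, `A: 2`, `b: 3` to both gives `b, a, b` from the list and `a, b, b` from the multimap, which is the observable difference under discussion.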

@annevk
Member

annevk commented Jan 5, 2016

Well, I think whatever we do the order needs to be defined since it is observable. Or defined to be random, but that seems suboptimal. Curious to hear what others think with respect to requiring a more efficient order.

@wanderview
Member

@youennf, do you have a use case where the optimization of this data structure has a measurable impact?

@youennf
Collaborator Author

youennf commented Jan 5, 2016

No real use case on my side, although the fetch algorithm often uses headers get(), which is an indication that it should be made fast enough by browsers.

WebKit already has a headers structure that does not preserve insertion order.
Keeping strict ordering would mean implementing a new structure and regularly translating between the two, which is probably a small overhead.

Lexicographical ordering would solve this, although it might hinder some other cases?

@wanderview
Member

The vast majority of Headers objects will have ~10 headers or fewer. For that many items, simple lists are typically faster than hash tables and similar structures: the overhead of the heavier data structure outweighs any benefit from the better algorithmic complexity.

I defer to @annevk about dropping a content-exposed behavior like strict ordering, but it does not seem worth making a change here to me.

@youennf
Collaborator Author

youennf commented Jan 5, 2016

Some additional points:

  1. Having an ordering that can be computed from the headers themselves allows implementations to select whatever containers they like
  2. A web application will receive response headers consistently across all engines
  3. In the case of a list container, sorting may only be needed when iteration is triggered from JS (which might never happen for a given header set) or as a way to ease the getAll implementation
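
Point 3 can be sketched as a list container that defers sorting until script actually iterates. `LazySortedHeaders` is a hypothetical name, and sorting by lower-cased name is an assumption about what "an ordering computed from the headers themselves" would be.

```typescript
type Pair = [string, string];

class LazySortedHeaders {
  private pairs: Pair[] = [];
  private sorted = true; // an empty list is trivially sorted

  append(name: string, value: string): void {
    this.pairs.push([name.toLowerCase(), value]);
    this.sorted = false; // no sorting work on the write path
  }

  // Sorting happens only here, the first time iteration is requested
  // after a modification.
  entries(): Pair[] {
    if (!this.sorted) {
      // Array.prototype.sort is stable in ES2019+, so values that share
      // a name keep their relative insertion order.
      this.pairs.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
      this.sorted = true;
    }
    return this.pairs.map((p): Pair => [p[0], p[1]]);
  }
}
```

A header set that is never iterated from JS never pays for the sort.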

Overall, I agree this is not a big issue.

@annevk
Member

annevk commented Jan 6, 2016

Thinking about it some more, I don't understand how this hash table alternative deals with duplicates that cannot be combined, e.g., Cookie.

@youennf
Collaborator Author

youennf commented Jan 6, 2016

Currently it cannot and this needs to be fixed.
I guess a linked list of strings could be used to store values.
The related WebKit discussion is at https://bugs.webkit.org/show_bug.cgi?id=152384

@annevk
Member

annevk commented Jan 7, 2016

That would make it even more complicated compared to the current list. You would have a hash table of name : values mappings and then an individual value would have some flag to indicate it needs to be serialized as its own header. The current design might not be so bad, all things considered.
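
A rough sketch of the structure described here, to show where the extra complexity comes from. The `standalone` flag and the class name `FlaggedHeaderTable` are hypothetical, invented for illustration only.

```typescript
interface StoredValue {
  value: string;
  // Hypothetical flag: this value must be serialized as its own header line.
  standalone: boolean;
}

class FlaggedHeaderTable {
  private table = new Map<string, StoredValue[]>();

  append(name: string, value: string, standalone = false): void {
    const key = name.toLowerCase();
    const list = this.table.get(key) ?? [];
    list.push({ value, standalone });
    this.table.set(key, list);
  }

  // Combinable values are joined with ", " into one line per name;
  // flagged values each get a line of their own.
  serialize(): string[] {
    const lines: string[] = [];
    this.table.forEach((values, key) => {
      const combinable = values.filter(v => !v.standalone).map(v => v.value);
      if (combinable.length > 0) lines.push(`${key}: ${combinable.join(", ")}`);
      for (const v of values) {
        if (v.standalone) lines.push(`${key}: ${v.value}`);
      }
    });
    return lines;
  }
}
```

Every read and write path has to consult the flag, which is the added complication relative to a plain list of pairs.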

@youennf
Collaborator Author

youennf commented Jan 7, 2016

If the container is an array, I guess the only thing needed would be to sort it when JS iteration is requested.

As mentioned in the WebKit entry, doing a set is different from doing a remove-then-append operation, which seems a bit strange.
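
The difference between the two operations can be illustrated on a plain ordered list. This is a sketch under the assumption that set overwrites the first matching pair in place and drops later duplicates; the helper names `setHeader` and `deleteThenAppend` are hypothetical.

```typescript
type Pair = [string, string];

// set: overwrite the first pair with this name where it sits,
// drop any later pairs with the same name, append if none exists.
function setHeader(list: Pair[], name: string, value: string): Pair[] {
  const key = name.toLowerCase();
  const out: Pair[] = [];
  let replaced = false;
  for (const [n, v] of list) {
    if (n !== key) out.push([n, v]);
    else if (!replaced) { out.push([key, value]); replaced = true; }
  }
  if (!replaced) out.push([key, value]);
  return out;
}

// delete followed by append: the new pair always lands at the end.
function deleteThenAppend(list: Pair[], name: string, value: string): Pair[] {
  const key = name.toLowerCase();
  const out = list.filter(([n]) => n !== key);
  out.push([key, value]);
  return out;
}

const original: Pair[] = [["a", "1"], ["b", "2"], ["c", "3"]];
const viaSet = setHeader(original, "a", "9");                 // a stays first
const viaDeleteAppend = deleteThenAppend(original, "a", "9"); // a moves last
```

With insertion order observable, the two results iterate differently even though they hold the same name/value pairs.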

That said, after some WebKit internal discussions, we seem to be heading towards using a list container.
If so, I am fine with the current order.

@annevk
Member

annevk commented Jan 7, 2016

The Headers object works like this, as do the FormData and URLSearchParams objects (not at all coincidentally): they are somewhat like a map, but can also be used like a multimap. So either you use get/set or you use getAll/append. delete/has work well either way.

While this is a little strange, it provides a simple map-based API for the common case, but if you need to deal with the uncommon case, that's taken care of too. And it is multimap-like, since for some of these the ordering of pairs matters too.
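
The two usage styles can be shown with `URLSearchParams`, one of the three objects named above. This is only an illustration of the map-like and multimap-like patterns; the parameter names are made up.

```typescript
const params = new URLSearchParams();

// Map-style use: one value per name.
params.set("lang", "en");
params.set("lang", "fr");          // replaces the previous value
const single = params.get("lang"); // "fr"

// Multimap-style use: several values under one name, order preserved.
params.append("tag", "fetch");
params.append("tag", "headers");
const allTags = params.getAll("tag"); // ["fetch", "headers"]

// get() on a multi-valued name returns only the first value.
const firstTag = params.get("tag"); // "fetch"

// has/delete behave the same in both styles.
params.delete("tag");
const stillThere = params.has("tag"); // false
const serialized = params.toString(); // "lang=fr"
```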

@annevk
Member

annevk commented Jan 7, 2016

Anyway, closing this given your comments.

annevk closed this as completed Jan 7, 2016

@hober

hober commented Jan 14, 2016

I think we should probably revisit this. WebKit uses CFNetwork for network requests. CFNetwork uses a map for headers and is unlikely to change.

@annevk
Member

annevk commented Jan 15, 2016

@hober how does it address the issues I mentioned? Could you expand on what kind of data structure that map is?

annevk reopened this Jan 15, 2016

@youennf
Collaborator Author

youennf commented Jan 15, 2016

I think this data structure is a straight <name, value> map, with values being combined when adding headers that have the same name.

It supports delete, has, set (and combine, which is not exposed in the fetch API).
It does not support append() or insertion-ordered iteration.
get() and getAll() are a bit redundant with this data structure.
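
As an illustration of that model, here is a minimal sketch of a combining map. `CombiningHeaderMap` is a hypothetical class written for this thread; it is not CFNetwork's or WebKit's actual API, and joining with ", " is an assumption about how values are combined.

```typescript
class CombiningHeaderMap {
  private map = new Map<string, string>();

  // set: replace whatever is stored under this name.
  set(name: string, value: string): void {
    this.map.set(name.toLowerCase(), value);
  }

  // combine: fold the new value into the existing one, if any.
  combine(name: string, value: string): void {
    const key = name.toLowerCase();
    const existing = this.map.get(key);
    this.map.set(key, existing === undefined ? value : `${existing}, ${value}`);
  }

  has(name: string): boolean {
    return this.map.has(name.toLowerCase());
  }

  delete(name: string): void {
    this.map.delete(name.toLowerCase());
  }

  // With one combined string per name, get() and a getAll() would
  // carry the same information.
  get(name: string): string | null {
    const v = this.map.get(name.toLowerCase());
    return v === undefined ? null : v;
  }
}
```

There is no place in this structure to record that two values arrived as separate headers, or in which order different names were added.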

In WebKit, there is a generic headers structure used by different backends.
Currently it has the same model as CFNetwork.
https://bugs.webkit.org/show_bug.cgi?id=152828 discusses the possibility of changing that structure to fit the fetch API model.

Part of the discussion is why the fetch Headers API exposes some information that is not meaningful at the network level; exposing it makes it potentially meaningful at the web application level.

@annevk
Member

annevk commented Jan 15, 2016

Well, HTTP does have a concept of multiple values that apparently CFNetwork ignores and instead always uses combining semantics. And, due to cookies, HTTP has a concept of multiple headers. How does CFNetwork deal with that? There must be some special casing somewhere.

@youennf
Collaborator Author

youennf commented Jan 15, 2016

Is Set-Cookie the only case?
The HTTP spec mentions that: recipients ought to handle "Set-Cookie" as a special case while processing header fields.

That raises the question of whether the fetch API should reflect that notion of multiple values in the API or whether that should remain a special case.
If "set-cookie" is the only case and should be supported by the fetch API, one could translate the recommendation of the HTTP spec as using a dedicated API for it and not the general purpose header API.

@annevk
Member

annevk commented Jan 15, 2016

All headers can have multiple values, in theory.

Name: value, value2, value3

is equivalent to

Name: value, value2
Name: value3

I thought either Cookie or Set-Cookie or both had the additional restriction that their values cannot be combined.
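
The equivalence above, together with the no-combine exception, can be sketched as a small folding function. `combineHeaders` and `NO_COMBINE` are hypothetical names, and listing only `set-cookie` as the exception is an assumption for the sake of the example.

```typescript
type Pair = [string, string];

// Names whose values must stay on separate lines.
// Listing only set-cookie here is an assumption.
const NO_COMBINE = ["set-cookie"];

function combineHeaders(pairs: Pair[]): Pair[] {
  const out: Pair[] = [];
  const firstByName = new Map<string, Pair>();
  for (const [rawName, value] of pairs) {
    const name = rawName.toLowerCase();
    const first = firstByName.get(name);
    if (first !== undefined && NO_COMBINE.indexOf(name) === -1) {
      // Fold into the earlier line: "value, value2" + "value3".
      first[1] = `${first[1]}, ${value}`;
      continue;
    }
    const pair: Pair = [name, value];
    if (first === undefined) firstByName.set(name, pair);
    out.push(pair);
  }
  return out;
}

const combined = combineHeaders([
  ["Name", "value, value2"],
  ["Name", "value3"],
  ["Set-Cookie", "a=1"],
  ["Set-Cookie", "b=2"],
]);
// combined: [["name", "value, value2, value3"],
//            ["set-cookie", "a=1"], ["set-cookie", "b=2"]]
```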

@youennf
Collaborator Author

youennf commented Jan 15, 2016

Let's take this header set as an example:
Name: value1, value2
Name: value3

If the header set is part of a response, how will this data be presented to a web application by the fetch API?
Will some web engines present a response header structure with two entries to web applications while others present a structure with one entry? Or is one correct and not the other?
If both are ok, one web engine may return "value1, value2" when calling get("Name") while the other will return "value1, value2, value3". getAll is similar in spirit here ([[value1, value2], value3] vs. [[value1, value2, value3]]).

If the header set is part of a request, the corresponding request at the protocol level can use one or two headers; it will be semantically equivalent. What is the benefit of enabling a web application to query whether there is one combined header or two headers with the same name?

@annevk
Member

annevk commented Jan 27, 2016

In https://lists.w3.org/Archives/Public/www-archive/2016Jan/thread.html#msg6 several of us decided to change this API. This has the consequence that the API can never work with cookies, but that is deemed acceptable.

@wanderview
Member

@hsivonen
Member

Filed a follow-up to tweak the sort order on the XHR side in a case where Gecko and WebKit are now incompatible with widely-deployed buggy JS and Chrome works due to not implementing the sorting.
