Username ending with MIME type format is not allowed

wcoenen · on Sept 15, 2021

I'm a little confused about the issue description, because "mov" is not a MIME type.

Examples of MIME types: "text/plain", "text/html", "image/png", "application/pdf", "video/quicktime", ...

If I was prevented from using the username "wcoenentext/html", then I wouldn't really be bothered by that. (Although I might question the design decisions that would necessitate such a restriction.)

scblzn · on Sept 15, 2021

Hello,

I’m the author of the issue on GitLab (small world, isn’t it?)

Yes, the message is confusing, and I agree that .mov isn’t a MIME type, but I was merely reporting the error message shown (plus, they added .mov to their list of file types and aliased it to the .mp4 format; please see: https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/in... )
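A minimal sketch of the kind of check that would produce this behaviour, assuming the rule is simply "reject usernames whose last dot-separated part is a reserved extension". The `RESERVED_EXTENSIONS` set and the `username_error` function are hypothetical and are not taken from GitLab's code or configuration.

```python
from typing import Optional

# Hypothetical list for illustration; the real list lives in GitLab's config.
RESERVED_EXTENSIONS = {"mov", "mp4", "png", "json"}


def username_error(username: str) -> Optional[str]:
    """Return an error message if the username ends in a reserved extension."""
    # rpartition gives ("", "", username) when there is no dot at all.
    _, dot, suffix = username.rpartition(".")
    if dot and suffix.lower() in RESERVED_EXTENSIONS:
        return f"Username ending with a reserved file extension (.{suffix}) is not allowed"
    return None
```

Under this sketch "scblzn.mov" is rejected, while "wcoenentext/html" passes, because nothing here looks at MIME types at all.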

codetrotter · on Sept 15, 2021

> they added .mov in their list of file types and had aliased it to .mp4 format

That’s weird. Why’d they do that? They should make a separate entry for mov and associate it with video/quicktime.

Guess it might be something related to https://stackoverflow.com/a/44785870 but like they point out, mov is a container format that can contain one of many different codecs used. And isn’t mp4 just a container too? Referring to mov files as video/mp4 seems straight up incorrect to me

robertony · on Sept 15, 2021

Modern mov files are just mp4 containers.

codetrotter · on Sept 15, 2021

I don’t think that’s quite right is it? As I understand, mp4 is based on mov. But they still very much are distinct container formats, and an implementation of the mp4 standard would likely not be able to correctly read mov files as is, would it?

banana_giraffe · on Sept 15, 2021

They're very close. The ISO base media file format was directly based off of QuickTime container format.

If you look at a .mov file and a .mp4 file in an ISO BMFF viewer, you'll generally see that the only difference is the ftyp box ("qt " for .mov, "isom" for .mp4). Indeed, if you ask ffmpeg to make a .mov file and a .mp4 file of the same content, literally the only difference is the contents of the "ftyp" box; every other byte is identical.
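For anyone who wants to check this on their own files, here is a rough sketch that reads the major brand out of a leading ftyp box. It assumes the file starts with an ftyp box (some older .mov files may not) and that the box uses the ordinary 32-bit size field. The brand is four bytes, so the QuickTime brand is "qt" padded with two spaces. `make_ftyp` is only a helper for building test input, not a real muxer.

```python
import struct


def major_brand(data: bytes) -> str:
    """Return the 4-character major brand from a leading 'ftyp' box."""
    if len(data) < 12:
        raise ValueError("too short to hold an ftyp box")
    # Box header: 32-bit big-endian size, then a 4-byte type code.
    _size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        raise ValueError("data does not start with an ftyp box")
    # The major brand is the first 4 bytes of the box payload.
    return data[8:12].decode("ascii")


def make_ftyp(brand: bytes) -> bytes:
    """Build a minimal 16-byte ftyp box: size, type, major brand, minor version."""
    return struct.pack(">I4s4sI", 16, b"ftyp", brand, 0)
```

To try it on a real file, something like `major_brand(open(path, "rb").read(12))` should be enough.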

gyan · on Sept 17, 2021

> literally the only difference is the contents of the "ftyp" box, every other byte is identical.

Not quite. There are some boxes acceptable in one but not the other. Strings inside MOOV are length-prefixed in MOV but null-terminated in ISOBMFF. There are a variety of differences like that.
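To illustrate the string difference, here is a small sketch of the two encodings in general. It does not parse any actual MOV or ISOBMFF box; the function names and the sample bytes are made up, and the length prefix is assumed to be a single byte.

```python
def read_length_prefixed(buf: bytes, offset: int = 0) -> str:
    """Read a string whose first byte gives its length (Pascal style)."""
    length = buf[offset]
    return buf[offset + 1 : offset + 1 + length].decode("utf-8")


def read_null_terminated(buf: bytes, offset: int = 0) -> str:
    """Read a string that runs until the first zero byte (C style)."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode("utf-8")
```

A parser written for one convention will misread the other: the length byte shows up as a stray leading character, or the reader runs past the end looking for a terminator.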

The set of codecs allowed also differs between the two.

wolfd · on Sept 15, 2021

They're pretty much interchangeable. Video frameworks like gstreamer just give you qtdemux to parse .mov, .mp4, and even .m4a or .m4v (which are just MP4s with different file extensions).

john_cogs · on Sept 15, 2021

A GitLab team member has opened a merge request to make the error message more clear: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/70374/...

splintercell · on Sept 15, 2021

Instead of saying with ‘a file extension’, it should say with ‘a reserved file extension’.

john_cogs · on Sept 15, 2021

Do you want to open a Merge Request to propose the change? Would love to have you contribute to GitLab.

dnsmichi · on Sept 15, 2021

Went ahead and created an MR to emphasize 'reserved file extension' in the error message: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/70427

Thanks for the suggestion :)

eyelidlessness · on Sept 15, 2021

Damn, and I was about to start registering usernames like json/bourne, but I guess that’s overly specific.

waspight · on Sept 15, 2021

That will be the name of my next child.

anonymousiam · on Sept 15, 2021

Better than Bobby Tables.

https://xkcd.com/327/

pulse7 · on Sept 15, 2021

Please don't do that... Elon Musk's son is named "X Æ A-12", don't follow in his footsteps...

ben_w · on Sept 15, 2021

Musk wasn’t the first to try to use weird names, but he didn’t quite succeed with that one. Quoth wiki:

"""the name would have violated California regulations as it contained characters that are not in the modern English alphabet,[321][322] and was then changed to "X Æ A-Xii". This drew more confusion, as Æ is not a letter in the modern English alphabet.[323] The child was eventually named "X AE A-XII", with "X" as a first name and "AE A-XII" as a middle name.[324]"""

bloak · on Sept 15, 2021

> with "X" as a first name and "AE A-XII" as a middle name

It sounds as though they allowed the middle name to contain a space, which doesn't match my mental model but perhaps that's how they do it in California. It invites the question: do they allow a first name to contain a space, so that ("X", "AE A-XII", "Last Name") and ("X AE", "A-XII", "Last Name") are different names?

EDIT: The funny thing isn't allowing spaces in a middle name; it's having a separate field for middle name(s) at all. Modern passports have just two name fields: "surname" and "given names". Both may contain spaces. But according to images on the web, Californian birth certificates really do have three name fields.

hollowcelery · on Sept 15, 2021

Any name can contain a space. For example "Ana Maria" is a common first name which contains a space. On official documents, generally a name will be separated into a given name and surname. In this case "<A> <B C>" and "<A B> C" are considered separate names.

Source: I have a space in my name and some of my different identity documents have the name as "<A> <B C>" or "<A B> C", which causes all sorts of administrative problems.
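A tiny sketch of why this causes administrative problems, assuming a record stores exactly two fields. The `Name` type and the sample names are invented for illustration.

```python
from typing import NamedTuple


class Name(NamedTuple):
    given: str
    surname: str

    def full(self) -> str:
        """The name as it would be printed on an envelope."""
        return f"{self.given} {self.surname}"


# The same printed name, split at two different places.
doc_a = Name(given="Ana", surname="Maria Silva")
doc_b = Name(given="Ana Maria", surname="Silva")
```

The two records print identically, but a system that matches on the (given, surname) pair treats them as different people, and the split cannot be recovered from the printed form.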

CydeWeys · on Sept 15, 2021

Leonardo da Vinci is another famous example of a last name with two words in it. It's very common in Romance languages. Plus, lots of people in the American South just flat out have two first names or two middle names.

addingnumbers · on Sept 15, 2021

Da Vinci isn't his last name, just like Jesus's last name isn't "of Nazareth" and Cato the Elder's last name isn't "the Elder"

an_ko · on Sept 15, 2021

Last names in many places evolved from that same need to disambiguate between people though. Attach some marker of connection to a place (common in Finland, e.g. Joensuu meaning "mouth of river"), profession (common in Germany and UK, e.g. Müller, Cooper, meaning mill worker and barrelmaker), lineage (common in Iceland, e.g. Grímsson meaning "son of Grímur"), or some other culturally relevant characteristic.

Nowadays the meanings of our last names have largely disappeared, so you have countless Coopers who have never touched a barrel in their lives, whose children will be called Cooper also, despite that. I think it's a little sad that so much of what people call us is semantically equivalent to a random UUID with tons of namespace collision. With that in mind, I'd say "da Vinci" is more a last name than most of us have.

Ajef · on Sept 15, 2021

Having a name with "of Region/city/former kingdom/..." is often their last name. Unless you want to claim that these people do not have a last name.

I can understand that in some cultures this might seem weird or antiquated, but here in Germany these names are reality. Sometimes people with such names are descendants of royalty, and sometimes someone's last name "from family-name" is their last name and happens to historically correspond to one of Germany's state names or city names or just a little town.

On a side note: In Germany in 1919-1920 royalty was no longer a legal aspect that changed how laws applied to you[1]. When that happened, titles that were reserved for ruling functions (king, grand duke) were removed and all other titles were moved to be part of the person's name (such as prince etc.) and could not be decreed on anyone new. These titles still exist in Germany but are simply naming "conventions" in a formerly royal family.

[1] https://de.wikipedia.org/wiki/Adelsrecht

[edit] In Leonardo's case perhaps not, but still I wish to elaborate a little on the situation here.

bityard · on Sept 15, 2021

When speaking about him in English, why do we say "Da Vinci" instead of "of Vinci"?

nitrogen · on Sept 15, 2021

We'd probably have to say Leonard of (Anglicized form of 'Vinci') for maximum consistency in that case. Lenny Vince for short.

IncRnd · on Sept 15, 2021

c'est la vie /s

Seriously, though, it just feels apropos.

IncRnd · on Sept 15, 2021

Leonardo da Vinci's name was Leonardo, and "da Vinci" refers to Leonardo's birthplace.

IntrepidWorm · on Sept 15, 2021

Correct: his full name as given was "Lionardo di ser Piero da Vinci," meaning more or less Piero's son Lionardo from Vinci. Deriving a child's name from their lineage was incredibly common.

IncRnd · on Sept 15, 2021

Lineage names still exist in the West, but lineage naming is not as common. See Ken Thompson, or even Johnson & Johnson's vaccine!

hunter2_ · on Sept 15, 2021

Documents aside, do you consider <B> to be a given name (parent came up with it) or a surname (parent already had it)? If given, do you consider it optional (i.e., middle rather than first)? Sorry in advance for any shortsighted assumptions about the possibilities here!

mdaniel · on Sept 15, 2021

Relevant and almost as horrifying as the address one: https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...

bloak · on Sept 15, 2021

While we're on the topic...

A UK passport has a "surname" field and a "given names" field. But a recent UK birth certificate has a single name field in which the "surname" part is distinguished by being written in capitals, so "Peter James ADAM SMITH" would have a surname consisting of two words. But what if a word consists of a single letter? For example, some Irish surnames look like "O Briain", and I think there is a Vietnamese name that consists of a single vowel, so presumably you can't always tell from the birth certificate which part is the surname.

occamrazor · on Sept 15, 2021

In Italy there is First Name, Other Names, and Family Name on the birth certificate. All fields can contain spaces (as well as hyphens, apostrophes, and some other diacritics).

The legal name however consists of First and Family names only, without the other names. Therefore many people have two names in the First Name field, usually separated by a space. The disadvantage is that in all official forms they have to spell out all the first name(s), no omissions or abbreviations are generally admitted.

jhugo · on Sept 15, 2021

One nice side-effect of this is that you can have a bit of fun with the Other Names since they're "unofficial". If the child wants to use those names later, they can, if they don't, they can pretend they don't exist. Some friends of ours put "Danger" in their kid's "other names", for example.

codetrotter · on Sept 15, 2021

Multiple middle names are common in Sweden.

Pinus · on Sept 15, 2021

The Swedish situation is... confusing. :-) There are first names ("förnamn"), possibly more than one. The one in daily use (not necessarily the first one!) is called "tilltalsnamn" (something like "addressing name"), and is traditionally marked in official paperwork by underlining or with an asterisk. (You know someone has not done their homework when you get junk mail that starts with "Dear <Wrongname>!") Then there used to be middle names ("mellannamn"), which were something put between the first and last names. These were typically used e.g. by people who wanted to have both their own and their spouse's name. These are no longer issued, though those who have them can keep them. Instead, you can now have a double last name, which used to be impossible. (People have sported "double-barrelled surnames" for ages, but they have not, as far as I can understand, been officially recognised, but functioned more like "stage names").

pjmlp · on Sept 15, 2021

Not really, it is quite similar to southern countries.

Typical Portuguese names have around 5 names, two first names and three surnames from both parents.

In fact your description fits quite well how they are used in Portugal, plus a few other nuances.

Pinus · on Sept 15, 2021

The confusing bit is that the rules about middle names and double surnames have changed at least twice in... my lifetime, which is becoming much longer than I care to think about. :-) I didn't do a very good job of conveying that in my comment.

hibbelig · on Sept 15, 2021

Is "middle name" a separate concept in Sweden? In Germany, there is no middle name, but people can have more than one first name.

So, Heinrich August Schmidt has a last name / family name (Schmidt) and two first names / given names.

wink · on Sept 15, 2021

Small nitpick: if you are* in Germany but have, say, a Spanish name (e.g. Hector Garcia Gonzalez) then... that's 2 surnames without a dash. I have no idea what happens if you marry or have children, but your example is just the -most basic- version.

*"you are" meaning you'd be a German citizen with a German passport.

rglullis · on Sept 15, 2021

> I have no idea what happens if you marry or have children

When you marry, you get to define what is going to be the family name and then children born from that marriage get to be registered with that family name.

In the case of foreign-born people, they can keep the naming rules from their original country. In the case of a marriage between foreigners from two different countries, you have to choose which rules you are going to follow, but the family name stays fixed.

To us (Brazilian marrying a Greek) it was a very interesting process. I have two last family names, and Greek names are gender-conjugated (i.e., the last name changes depending on whether you are a boy or a girl). In the end the simplest thing to do was to just keep only one of my last family names.

jannes · on Sept 15, 2021

My German children have a last name with a space in it (ie. 2 last names, but not with a dash). This also shows up in their German passport.

This is possible because you can make a “name declaration” where you choose to apply the naming law of another EU country.

n99 · on Sept 15, 2021

There are two different definitions of "middle name". Almost every American middle name is a given name or a surname used in the middle.

0x000000001 · on Sept 15, 2021

Had a set of twins in my town whose parents were Greek and they had two middle names and I was so fascinated about it as a kid I'll never forget.

samhw · on Sept 15, 2021

Is it that rare? I have two middle names (William Howard) and I never thought much of it. As a kid, I was more amused by a friend of my sister's who had a quadruple-barrelled surname...

tatersolid · on Sept 15, 2021

George Herbert Walker Bush wasn’t a familiar name in your youth?

oriolid · on Sept 15, 2021

Is there a limit that a person can have only one middle name?

denton-scratch · on Sept 15, 2021

In practice, yes.

I have two middle names; most financial service providers decline to recognise my second middle-name. Same goes for the taxman and my pension provider. It seems that a "full name" is no longer a canonical identifier for a person, just really a kind of nickname; the canonical identifier is now an account-number, employee-ID or whatever.

michaelt · on Sept 15, 2021

Thanks to marriage, names were non-canonical before the computer was even invented.

IIsi50MHz · on Sept 15, 2021

And thanks to people moving house, or even changing profession, or acquiring some defining trait or accomplishment. With most people being undocumented throughout history, names could be much more fluid, like the way a person's "nickname" may change many times with or without consent.

IncRnd · on Sept 15, 2021

No. There is no limit. That can be proven by seeing people with multiple middle names or by naming your child with multiple middle names. How could there possibly be a limit? Even in the case where systems cannot store or recognize the length of such a name, the middle names themselves are not limited. People's names are not entries in a compiler's symbol table.

bityard · on Sept 15, 2021

My son has two middle names (one for each grandfather), but he's still very much a minor and so not in very many "systems" yet.

(Although probably a lot more than I know about.)

pulse7 · on Sept 15, 2021

<sarcasm>He must really love his baby more than his weird ideas...</sarcasm>

acheron · on Sept 15, 2021

> as Æ is not a letter in the modern English alphabet

Someone tell the Encyclopædia Britannica.

xqyf · on Sept 15, 2021

Tell them that their name is spelled in Latin?

Or that æ is a character formed by combining two separate letters?

Both support the idea that Æ is not a letter in the modern English alphabet.

ta988 · on Sept 15, 2021

I wish good luck to their kid for their visa applications and forms in general especially if they travel to Europe.

dahfizz · on Sept 15, 2021

I'm sure his billions of dollars will be able to straighten out the visa application.

ben_w · on Sept 15, 2021

The EU — which doesn’t include all European nations — has 24 official languages and 3 official character sets. The Danish name for the EU contains an æ (“Den Europæiske Union”). Prior heads of state of current members include Μακάριος Γ΄, with the gamma translating as “III” (i.e. “the third”). The UK has a very relaxed attitude to name changing, hence the story of Mr. Yellow-Rat Foxysquirrel Fairydiddle.

Unless Musk tries to name future kids in Emoji, European nations can probably already cope with any of this sort of thing.

sva_ · on Sept 15, 2021

I think single-letter names were already a thing in the US before that. X Musk is probably fairly recognizable.

Aeolun · on Sept 15, 2021

What compels people to do stuff like this? And what wife would possibly agree to this insanity?

pizza234 · on Sept 15, 2021

Narcissism; the child's name is part of his show.

Good question about the partner; I guess that a narcissist can pair well with a very passive person (or maybe another narcissist).

wccrawford · on Sept 15, 2021

There's actually an increasingly common naming pattern that I think is very weird, but in the end, I just let them do whatever. In the grand scheme of things, having a unique name can really help you stand out, and they can always go by a nickname. It doesn't hurt anything, and it could very well help.

anchpop · on Sept 15, 2021

he has a bunch of children with normal names too. I wonder if this was Grimes’ idea.

greymalik · on Sept 15, 2021

Grimes.

speedgoose · on Sept 15, 2021

Some people are eccentric. Elon Musk is the husband of Claire Boucher, a musician.

oblio · on Sept 15, 2021

Do you know that joke about crazy people being regular people and eccentric people being rich?

JasonFruit · on Sept 15, 2021

I'm glad California regulators are focusing on the critical issue of what characters may appear in children's names. Does it ever occur to people in government that some things may not be their problem?

scbrg · on Sept 15, 2021

That's definitely a reasonable thing for governments to care about, since governments generally need to keep track of people's names in some way or other.

It's probably a good idea if government computers can keep track of the name, so a minimum requirement would probably be "can be represented by Unicode glyphs". So attempts like Prince's should probably be disqualified. And even in the Unicode set there are characters that may be problematic: Record Separators, Zero Width Space, and the Pile of Poo emoji spring to mind as examples, even if the latter might be doable. It's hard to address a letter to a person whose name just consists of a mix of different whitespace characters (especially when their neighbor is a different mix of whitespace characters). So, yes, government probably should care about this, at least a little bit.
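A rough sketch of how a registry could flag the characters mentioned here, using Unicode general categories from Python's standard `unicodedata` module. The policy itself (flag control, format and other "C" categories, any separator other than a plain space, and "So" symbols) is an assumption chosen to cover these three examples, not any government's actual rule.

```python
import unicodedata
from typing import List


def suspicious_characters(name: str) -> List[str]:
    """Return the Unicode names of characters a registry might refuse."""
    flagged = []
    for ch in name:
        category = unicodedata.category(ch)
        is_odd_separator = category.startswith("Z") and ch != " "
        # "C*" covers control and format characters; "So" covers most emoji.
        if category.startswith("C") or is_odd_separator or category == "So":
            # Control characters have no name, so fall back to the code point.
            flagged.append(unicodedata.name(ch, f"U+{ord(ch):04X}"))
    return flagged
```

Under this policy "X Æ A-12" comes back clean, since Æ is an ordinary uppercase letter as far as Unicode is concerned.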

That said, Æ should probably be allowed. It's a standard character in several living languages.

CaptainZapp · on Sept 15, 2021

I totally agree with your take, except maybe:

> That said, Æ should probably be allowed

This may cause all sorts of issues, from typing a letter to issuing a passport.

Other than that, exactly what I was thinking.

scbrg · on Sept 15, 2021

The reason I think that maybe Æ should qualify is that it's not entirely unlikely to occur in immigrants' names anyway (or at least the lower case version). There are systems for transforming such characters to something that appears in passports - my own name has an umlaut, and I've obviously got a passport. I'm not American, so my passport is of course not issued in USA, but I suspect few countries would see a name diversity as great as USA so it can't be a new problem for the authorities there.
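A minimal sketch of one such transformation, assuming a simple approach: replace a few letters that have no Unicode decomposition using a small table, then strip combining accents. The `SPECIAL` table is illustrative only and is not any passport authority's official transliteration table; real schemes differ (German practice, for instance, often writes ü as "ue" rather than "u").

```python
import unicodedata

# Illustrative replacements for letters that NFKD cannot break apart.
SPECIAL = str.maketrans({
    "Æ": "AE", "æ": "ae",
    "Ø": "OE", "ø": "oe",
    "Þ": "TH", "þ": "th",
    "ß": "ss",
})


def to_basic_latin(name: str) -> str:
    """Approximate a name using unaccented Latin letters."""
    replaced = name.translate(SPECIAL)
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Drop the combining marks (accents) left behind by the decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

This loses information, so a system would normally store the original spelling alongside the simplified one rather than instead of it.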

klyrs · on Sept 15, 2021

Does it occur to you that changing a database schema across an entire state government might incur significant costs and take multiple years to implement? Every courthouse, DMV, hospital, etc. might be running different, decades-old proprietary software, and who knows if the disparate, original contractors are still around.

panzagl · on Sept 15, 2021

As long as you don't expect it to be the government's problem when they can't get a SSN, driver's license, etc.

JasonFruit · on Sept 15, 2021

Why is it my problem to make my children conform to government tracking and enumeration, rather than their problem that my children are hard for them to easily assimilate? Government exists for my children, not my children for government. Their technical problems are their problems, not mine.

And practically, in a world where people cross borders, where people come to this country for refuge and opportunity, how does it make sense to force them all to have only basic English characters in their names?

michaelt · on Sept 15, 2021

> Their technical problems are their problems, not mine.

You’ve never interacted with the government or a large corporation, have you?

JasonFruit · on Sept 15, 2021

I'm not asking what is, but why it should be that way. Why should I name my children for the government's convenience? Why are Jürgen, Hafþór, Renée, or Noël unacceptable names?

IIsi50MHz · on Sept 16, 2021

The government(s) will be happy to receive & evaluate your generous donation of upgrades to the myriad existing record keeping systems, and further generous donations of your time to train all staff on how to enter and search for the additional written characters, so long as the process does not significantly disrupt operations. "Patches welcome!" (^o^)/

But in seriousness: it is their concern because changes that seem minor can require major changes to electronic systems, and allowing special cases that necessitate lookup in ways the electronic systems can't handle (perhaps even requiring manual search through hardcopy!) really throws a star-mangled spanner in the works.

Naming is already a special case problem in Japan, where people are accustomed to having no idea how to pronounce most people's names because the parents used non-standard readings of the kanji characters and/or used kanji from a special exempted list of archaic kanji that almost nobody can read. If you're wondering why the government doesn't require people to use only the official "common use" kanji, at least just for this one problem, then I should tell you it has been tried: enough people raised a fuss about being unable to register their child's "perfect name" that gradually a list of "allowed only in names" kanji was created and expanded.

And if I recall correctly, when registering a name, you can specify a totally unrelated pronunciation using the simpler non-kanji phonetic characters, so even just "common use" kanji are almost "all bets off". A relative few kanji have such common pronunciation in names that they can 'usually' be guessed.

And the problem is the same with many Japanese place-names, having little or no correlation between written form and pronunciation or meaning.

So, why is the spelling of a chosen name any concern to a government? It gets crazy out there, in Name Land. How mäný variatǐons of spelliñg cån ße ællowèd before people give up on pronouncing it?

JasonFruit · on Sept 16, 2021

Why is everyone trying to convince me that handling names is hard for computer systems? That's obvious to everyone, even me.

What's fascinating to me is that none of you consider that government might not need to have a list of everyone it governs, or that such a list might not need to be centralized or computerized. When faced with a facet of humanity that's too complex to be easily reduced to consistent data, private groups either do their best and work with what data they can extract, or else have people handle it, with our flexible, tolerant minds. When government is involved, though, the immediate response is, "We have to force people to be less complex!"

I'm beginning to see technocracy as the biggest threat to a diverse, human, and humane society.

crysin · on Sept 15, 2021

This is actually probably good. My wife has a hyphen in her first name, and many government systems, including Social Security itself, will most of the time return some validation error when the hyphen is included. It even confuses employees in an office when their ancient software doesn't report an error at all: it just straight up doesn't work and gives them no context on what could be wrong.

nkingsy · on Sept 15, 2021

My wife and I changed our last names in California when we got married.

There are quite specific rules:

The new name must be in the format {substring of original name 1}{?-}{substring of original name 2}, or you have to go through the much more arduous and expensive full name change process (though you get a 2 for 1 discount).

frenchy · on Sept 15, 2021

Can't you also just take the surname of your partner? In Canada that's the most common practice.

wnoise · on Sept 15, 2021

That follows the template he provided
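A sketch of that template as a check, reading it literally: the new name is a (possibly empty) piece of name 1, an optional hyphen, then a (possibly empty) piece of name 2. This is only an interpretation of the description above, not the actual California rule, and `fits_template` is a made-up name.

```python
def fits_template(new_name: str, name1: str, name2: str) -> bool:
    """Check new_name == {substring of name1}{optional '-'}{substring of name2}."""
    new_name, name1, name2 = new_name.lower(), name1.lower(), name2.lower()
    # Try every split point; the empty string counts as a substring of anything.
    for i in range(len(new_name) + 1):
        head, tail = new_name[:i], new_name[i:]
        if head not in name1:
            continue
        if tail in name2:
            return True
        if tail.startswith("-") and tail[1:] in name2:
            return True
    return False
```

Taking a partner's surname unchanged passes because the piece taken from the first name is empty and the piece from the second is the whole name.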

IncRnd · on Sept 15, 2021

> as Æ is not a letter in the modern English alphabet.

That's specious, meaning appearing true but actually false. It is a diphthong expressed by a ligature of two letters. The claim was never made that a diphthong is a single letter, and it is easily expressed in modern English, in a computer using ASCII and outside on a piece of paper.

frenchy · on Sept 15, 2021

In Latin, æ or ae was a (di)graph for the /ae/ diphthong. In English, it is typically pronounced using the /i/ phoneme, which isn't a diphthong. I have heard it pronounced as the /ai/ diphthong, but only rarely.

In English, I'm pretty sure the ae digraph is always pronounced as /i/, so calling it a diphthong would probably confuse most people.

IncRnd · on Sept 15, 2021

Agreed. That is a very good point about it possibly being just a digraph and not always a diphthong. This depends, however, on the use, since it does sometimes slide and other times is just a ligature. The issues seem to stem from the pronunciation changes and the great vowel shift. As you point out, language is a messy business.

SippinLean · on Sept 15, 2021

A ligature is a character composed of two or more graphemes, it is not a letter

IncRnd · on Sept 15, 2021

You're correct about that, sort of but not in the sense of language - only for computer definitions. A ligature in language is "a printed or written character (such as æ or ﬀ) consisting of two or more letters or characters joined together" [1]

[1] merriam-webster.com/dictionary/ligature
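For what it's worth, Unicode draws its own line between the two examples in that definition, which can be checked with Python's standard `unicodedata` module. This only shows how Unicode classifies the characters; it says nothing about the English alphabet or California regulations.

```python
import unicodedata
from typing import Tuple


def describe(ch: str) -> Tuple[str, str, str]:
    """Return (Unicode name, general category, NFKD form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.normalize("NFKD", ch),
    )
```

`describe("æ")` reports a letter named LATIN SMALL LETTER AE that does not decompose, while `describe("ﬀ")` reports LATIN SMALL LIGATURE FF, which compatibility normalization splits into two plain f characters.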

kingofpandora · on Sept 15, 2021

.... so in other words, it's not a letter in the modern English alphabet. Seems pretty accurate to me.

createunderrate · on Sept 15, 2021

And hence, not a letter.

bierjunge · on Sept 15, 2021

Yes, don't do that, name the kid "null" and see the world burning.

nonameiguess · on Sept 15, 2021

Need to be careful with that. There was a guy in California a few years back who decided the license plate "NULL" would be a fun joke, and he ended up being charged for every ticket issued in the state where the license plate wasn't entered.

hunter2_ · on Sept 15, 2021

I recall pondering the mechanics of this back when I first read about it. Is some software actually replacing an input of "" (empty string) with "NULL"? Or is some comparison so loosely typed that a value of type null is considered equal enough to the string "NULL"?

shrikant · on Sept 15, 2021

Maybe they were using an Oracle database, where (shockingly) empty strings are treated as equivalent to NULLs.

Edited to add: A "famous" wtf-worthy explainer from Oracle: https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUEST...

Relevant snippet:

  '' when assigned to a char(1) becomes ' ' (char types are blank padded strings).

  '' when assigned to a varchar2(1) becomes '' which is a zero length string and a zero length string is NULL in Oracle (it is no long '')

hunter2_ · on Sept 15, 2021

The conflation of empty string with a real null isn't great, but it doesn't also imply conflation of the string "NULL" which is what I'm trying to figure out. The four-character sequence shouldn't ever be considered something other than a four-character string, except for particular non-user-facing situations such as when actually writing code. Even spreadsheet software, which does all kinds of heuristics to find numbers in strings for example, doesn't treat the string "NULL" as anything other than a four-character string, to my knowledge.
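One plausible mechanism, offered purely as speculation: somewhere in the pipeline a missing value gets written out as the text NULL in a flat file, and the next system reads it back as an ordinary string. The sketch below uses Python's `csv` module with made-up function names and data; nothing here is known about the actual ticketing system.

```python
import csv
import io
from typing import List, Optional, Tuple


def export_rows(rows: List[Tuple[Optional[str], int]]) -> str:
    """Write (plate, fine) rows to CSV, spelling a missing plate as NULL."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for plate, fine in rows:
        writer.writerow(["NULL" if plate is None else plate, fine])
    return buf.getvalue()


def import_rows(text: str) -> List[Tuple[str, int]]:
    """Read the CSV back; every plate is now just a string."""
    return [(plate, int(fine)) for plate, fine in csv.reader(io.StringIO(text))]
```

After a round trip, a ticket with no plate recorded and a ticket for the real plate "NULL" are indistinguishable, so a later join on the plate column bills both to the same owner.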

occamrazor · on Sept 15, 2021

Somewhere some system depends on CSV files in Lotus 1-2-3 format transferred over a non 8-bit safe proprietary UUCP variant.

nradov · on Sept 15, 2021

A similar problem happened to a man with "NO PLATE".

https://www.latimes.com/archives/la-xpm-1986-06-23-vw-20054-...

dolmen · on Sept 15, 2021

https://www.wired.com/story/null-license-plate-landed-one-ha...

belter · on Sept 15, 2021

A common name...

https://www.facebook.com/public/James-Null

as well as Abcde

https://www.facebook.com/public/Abcde

samhw · on Sept 15, 2021

Relatedly, there's the classic of this woman with the surname 'True', whose iCloud didn't work as a result: https://appleinsider.com/articles/21/03/06/coding-error-lock...

yellow_lead · on Sept 15, 2021

See your kid have trouble getting any official documentation :/

tomalpha · on Sept 15, 2021

Or booking flights, filing taxes, signing up for shifts at work: https://www.bbc.com/future/article/20160325-the-names-that-b...

rPlayer6554 · on Sept 15, 2021

Poor Bobby tables....

https://xkcd.com/327/

zodiakzz · on Sept 15, 2021

The word they're looking for is "file extension name".

qwerty456127 · on Sept 15, 2021

There is no such thing as a file extension name, it's a file name extension. It's called this way because file names could only be 8-byte strings initially and then this was extended with 3 extra byte places.

Neither part of the MIME type format even has to match any existing (commonly used) file name extension anyway. E.g. it can be `text/plain`. Even when it does, it is just a coincidence (although a very common one); it actually references the format name (IIRC `image/jpeg` was used even when almost nobody was using `jpeg` for the extension and the convention was to use `jpg`).
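The extension-to-type relationship is a lookup table, not a naming rule, which is easy to see with Python's standard `mimetypes` module. The file names below are made up, and the results reflect Python's built-in table (a system's own MIME database could in principle add to it).

```python
import mimetypes
from typing import Optional


def mime_for(filename: str) -> Optional[str]:
    """Look up the MIME type registered for a file name's extension."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type
```

Both `mime_for("photo.jpg")` and `mime_for("photo.jpeg")` map to `image/jpeg`, and `mime_for("notes.txt")` maps to `text/plain`, where the subtype shares nothing with the extension.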

kps · on Sept 15, 2021

> file names could only be 8-byte strings initially and then this was extended

At least part of that is not true. The ‘popular’ MS-DOS 8.3 form derives via CP/M from DEC OSes which used 6 character file names and 3 character file types, due to their use of RAD50¹ to fit 3 characters in a 16- or 18-bit word. The type field always existed, so the word ‘extension’ most likely refers to its presentation on the end of the file name, rather than an addition to a previous format.

¹ https://en.wikipedia.org/wiki/DEC_RADIX_50
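A short sketch of the packing trick, assuming the PDP-11 flavour of the character set (space, A-Z, $, ., one spare code shown here as %, then 0-9); other DEC systems ordered the 40 characters differently. Since 40 × 40 × 40 = 64000, three characters fit below 65536 and therefore in one 16-bit word.

```python
# 40 symbols; position in the string is the character's code (assumed PDP-11 order).
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_encode(text: str) -> int:
    """Pack up to three characters into one 16-bit value (base 40)."""
    if len(text) > 3:
        raise ValueError("at most three characters fit in one word")
    value = 0
    for ch in text.upper().ljust(3):
        value = value * 40 + RAD50_CHARS.index(ch)
    return value


def rad50_decode(value: int) -> str:
    """Unpack a 16-bit RADIX-50 value back into three characters."""
    chars = []
    for _ in range(3):
        value, code = divmod(value, 40)
        chars.append(RAD50_CHARS[code])
    return "".join(reversed(chars))
```

Two such words give a 6-character file name and a third gives the 3-character type, which matches the 6.3 layout described above.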

pastage · on Sept 15, 2021

This is interesting because compounds are mostly written together in other languages.

Never thought about it but file extension name really is a word. Someone replied saying this is three words, but it is not is it? It's an open compound word or maybe a "set phrase", I wanted to call it an idiomatic expression but that was clearly wrong.

hunter2_ · on Sept 15, 2021

The Wikipedia articles for "Set phrase" [0] and "Compound (linguistics)" [1] actually don't offer much of a distinction, so it's hard to say which of those is correct. Regardless, a compound with spaces ("open," as you said) is multiple words, not one word.

[0] https://en.m.wikipedia.org/wiki/Set_phrase

[1] https://en.m.wikipedia.org/wiki/Compound_(linguistics)

iamtedd · on Sept 15, 2021

Who are you, Kath Day-Night?

iamtedd · on Sept 17, 2021

I wanted to find an example from Kath & Kim, but couldn't find any on Youtube that wasn't just the "Look at moi" set-up to the actual joke. I'm sorry.

pwdisswordfish8 · on Sept 15, 2021

> I'm a little confused about the issue description

Not helped by the fact that even with explicit cajoling* to give steps to reproduce, the reporter wrote:

> Steps to reproduce

>Try to login/create an user in Gitlab (on-premises/Gitlab.com) where the username ends with a MIME type format

Goddamit, those are not steps to reproduce! Has the current era of "social coding" and its terrible software development practices completely turned people's brains to mush?

(In fact, such cajoling shouldn't even be required. If you don't know you need to provide STR without someone going to the lengths of coddling you by going out of their way to create a template, you don't have any business using a bug tracker.)

svrtknst · on Sept 15, 2021

ok boomer

IIsi50MHz · on Sept 16, 2021

The provided step-to-reproduce (not steps, since there's only one) seems to be regarded by the reporter as an atomic operation (indivisible; a single step).

The other person is, I think, trying to express that it can potentially be done more than one way, so additional steps are required.

From a QA perspective, I prefer not to guess what the reporter might have intended. It's much better to have tons of detail, but I can sympathise with being the user and _thinking_ you have been totally clear, and I know I've sometimes done this myself.

(( Unrelatedly, "ok boomer"? Since that seems such a non sequitur, I'll take it in another unrelated direction and raise you an "ok athena" and "waddaya hear, starbuck?". ))

pwdisswordfish8 · on Sept 16, 2021

You have been trolled. The entire point of saying "ok boomer" here is to, as economically as possible, be dismissive while getting the other side to spend disproportionate energy defending the thing being dismissed.

> From a QA perspective, I prefer not to guess what the reporter might have intended

There's not really any other perspective. That thing which you say you "prefer" is much more than a preference. It is the only reason why the bug tracker is configured to even ask for STR. If you have a "no solicitors" sign on your front door and a solicitor comes knocking anyway, or you have your door locked and a burglar climbs through a broken window and takes all your stuff, you wouldn't respond by saying, "from my perspective, it's really disruptive and time-consuming to deal with solicitors who ignore the sign" or, "try to understand that from my perspective, it's really inconvenient when you take my things, because I have to work and spend money to replace them when I could have used that time and money doing something else." To do so tacitly legitimizes an illegitimate position held by the other.

Please look up "DARVO" to understand why explaining yourself like this is a bad idea.

pwdisswordfish8 · on Sept 15, 2021

What, did that make you mad? Why're you mad?

sytse · on Sept 15, 2021

This is a very confusing issue description which is caused by a confusing error message. This comment https://news.ycombinator.com/item?id=28540665 does a great job explaining the context. TLDR; you can't have a username end in .filetype because it might cause the user profile page to not load. The limitation is _not_ related to injection attack prevention, that would be concerning (bobby tables xkcd https://xkcd.com/327/ ).

duskwuff · on Sept 15, 2021

And it was exacerbated by another bug which was causing the absence of a period to be ignored, so any username ending in a recognized filetype was blocked (e.g. "AsiMOV" in the example, or "MaasTIFF" in the comments).

I initially suspected that a regex was involved and someone forgot to escape the period, but it looks like that wasn't even the case -- the erroneous code was literally checking if the username ended in any recognized extension.

https://gitlab.com/gitlab-org/gitlab/-/merge_requests/65954/...
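
The difference between the two checks described above can be shown with a minimal sketch. This is not GitLab's Ruby code; the extension list and function names are hypothetical, and only the leading-dot distinction is taken from the comment.

```python
# Hypothetical extension list; the real one comes from a MIME-type library.
KNOWN_EXTENSIONS = {"mov", "tiff", "html", "json"}


def ends_with_extension_buggy(username: str) -> bool:
    """Matches any recognized suffix, with or without a preceding dot."""
    lowered = username.lower()
    return any(lowered.endswith(ext) for ext in KNOWN_EXTENSIONS)


def ends_with_extension_fixed(username: str) -> bool:
    """Matches only when the suffix follows a literal dot."""
    lowered = username.lower()
    return any(lowered.endswith("." + ext) for ext in KNOWN_EXTENSIONS)
```

With the first version, "AsiMOV" and "MaasTIFF" are both rejected; with the second, only names such as "clip.mov" are, and "isaac.asimov" passes because its final dot is followed by "asimov", not "mov".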

a-dub · on Sept 15, 2021

it seems pretty obvious that they mean any file extension registered to a known MIME type.

gpvos · on Sept 15, 2021

It's easy, basic, and very important for clarity to use the correct terminology in this case. A MIME type is really something different.

a-dub · on Sept 15, 2021

"ERROR: The Gitlab server has rejected the proposed username, as it ends in the same suffix as a file extension that is registered to a MIME type in the Ruby runtime under which the server runs. This list is quite long, and possibly difficult to retrieve, so we will not list it here, but you can find a list of extensions commonly used if you Google for MIME. Alternatively, if this makes no sense at all, ask a local alpha geek and they should be able to help. We understand this is weird, but the reasons for doing so are currently embargoed as they potentially have wide ranging security or stability consequences for a large number of installs. If you wouldn't mind, please do us a small one and keep quiet while we have a chance to prepare and distribute a patch without forcing anyone to forego any nights of sleep 18 months into a global pandemic complete with associated societal fracturing and potential economic collapse. Thank you for your cooperation on this easy, basic and very important matter."

fixed it!

p49k · on Sept 15, 2021

It’s unrealistic to expect everyone to be able to know and use terminology perfectly. The description itself was well-written and that’s what’s important in terms of finding/fixing the issue.

DonHopkins · on Sept 15, 2021

It's foolish to use a standard well defined very precise but incorrect term of art like "MIME Type" that is actively misleading to a technical audience of github users who are more likely than most people to know the standard official definition, and which is LESS commonly known than a correct widely understood vernacular term like "file name extension", if your goal is to be understood by everyone.

I'd rather a technically oriented site like github be "unrealistic" and correct and instructive, than foolish and wrong and misleading. Are you really saying it's better to use the wrong term because github users might not understand the more widely known correct term?

Just how does leading the user on a wild goose chase looking up the definition of "MIME Type," causing them to waste their time and misunderstand the error message, when it's really a file name extension (a term which more people understand anyway), help the user achieve their goals?

The bottom line is that github disallowing "MIME Types" or "file name extensions" in user names, just like a bank disallowing "select" and "drop" and "from" and "null" and "delete" and "bobby" and "tables" in passwords, is a symptom of a much larger more terrible problem, and whoever wrote that stupid error message instead of fixing the underlying bug that caused it has much worse problems than poor English language writing skills.

carl_dr · on Sept 15, 2021

Muphry's law strikes again!

The bug affected Gitlab, not GitHub.

p49k · on Sept 15, 2021

Sorry, I thought you were referring to the bug reporter rather than the author of the error message itself, in the latter case I agree.

rjmunro · on Sept 15, 2021

The bug reporter merely copies the incorrect error that gitlab gave him; The issue here is that someone working in login / security of gitlab doesn't know what a mime type is. That is extremely worrying - it's not a part of the code where you can afford to be sloppy.

It's a very odd error. Apparently .nro is a file extension used by the Nintendo Switch video game console; .o is obviously the output of compilers, so I'm not sure why my username wasn't rejected. Maybe it would be if I tried to register now.

jfrunyon · on Sept 15, 2021

No file extension <-> MIME type "registry" exists. File extensions exist completely outside of MIME. Many file extensions do not correspond to a registered MIME type, or in many cases even a de-facto one (other than application/octet-stream or text/plain).

It seems pretty obvious - based on the failed username in question, and to someone with fairly deep technical knowledge - that they mean anything they consider a file extension. Which is not an excuse for this marvel of awful UI slapped on top of a poorly-thought-out workaround (for some unknown vuln that's been patched for over 2 months and is still private; quite strange for an "open" company, eh?).

defanor · on Sept 15, 2021

FWIW, the IANA media type registry [1] lists "File extension(s)" under "Additional information" for some of the media types, so it may make sense to speak of filename extensions associated with registered media types. Though given the initial odd wording, it could be anything.

Edit: as for what's actually used, looks like [2] it's the ruby mime-types gem [3], which is based on both IANA registry and various other recommendations [4], AIUI.

[1] https://www.iana.org/assignments/media-types/media-types.xht...

[2] https://gitlab.com/gitlab-org/gitlab/-/merge_requests/65954/...

[3] https://rubygems.org/gems/mime-types

[4] https://github.com/mime-types/mime-types-data

jfrunyon · on Sept 16, 2021

Uhh, I'm not sure where you see the word "extension" on your [1], but I don't.

Registrants can include info like file extensions with their registration, but those extensions are not registered (not by IANA, anyway).

defanor · on Sept 16, 2021

That "[1]" was for the registry, the extensions are listed for particular media types, such as application/xml [1]. Also mentioned in RFC 6838 section 4.12 [2].

And indeed, what IANA registers there is media types, so I've only mentioned associations. I think it still works for a-dub's argument, that it's obvious (though I'd word it differently, such as it being a reasonable guess) that if "MIME type format" is mentioned and "mov" is shown as an example, what's actually meant is filename extensions associated with registered and/or otherwise known media types (which turned out to be the case).

[1] https://www.iana.org/assignments/media-types/application/xml

[2] https://www.rfc-editor.org/rfc/rfc6838.html#section-4.12

jfrunyon · on Sept 17, 2021

But "mov" is not shown as an example. The particular requestor in this issue wanted to use mov, however no example was shown in the error message. Moreover even in the requestor's case the user has to already be savvy enough to guess that "mov" is the part of "isaac.asimov" that it had a problem with, which is not at all obvious.

I don't know exactly what list GitLab is using (because, as has been mentioned, there's no central registry of file extensions), but there are many extensions more obscure and less obvious than even mov, which would further obfuscate the actual problem.

yuliyp · on Sept 15, 2021

Officially no such registry exists. In practice, Apache does have such a registry by default: https://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf... and other systems do use that mapping or a similar one.

jfrunyon · on Sept 16, 2021

> File extensions exist completely outside of MIME.

Yes. I am well aware that certain software, such as Windows, MimeMagic, or Apache, do include their own lists. That is not a "registry" and in fact you will likely find that every such list is either identical to, a fork of, or incompatible with every other such list.

yuliyp · on Sept 17, 2021

I agree with you, I think?

Whether you call such a list a registry or not is fairly trivial, I think. Their web server uses such a mapping, and for some reason usernames with suffixes on that list confused something. Call it a registry, or a mapping, or a list, the meaning of the statement is the same, and I don't think it prevents most people from understanding what's going on.

jfrunyon · on Sept 17, 2021

registry (noun): a place or office where registers or records are kept; an official list or register.

In this case, a registry would be a mapping, however a mapping is not a registry. A registry is the single, centralized place from which you can look up registrations, to guarantee that there are no conflicting records. Literally none of that is (or can be, at this point) true of file extensions.

hnlmorg · on Sept 15, 2021

In fairness, it is Gitlab's wording the issue reporter is using. Check the error message.

Grollicus · on Sept 15, 2021

Gitlab recently exchanged the "WIP" prefix for merge requests (Work in Progress = started to do something but didn't complete it yet) for "Draft", which has connotations of throwing the draft/sketch away to build the final product.

Which is definitively not what is meant there. But I think it shows that Gitlab is not a company I'd go to if I wanted linguistic precision.

gls2ro · on Sept 15, 2021

I am not a native English speaker (and I don't work at Gitlab) but I am curious and interested in using proper terms, as I think as developer that naming is a very important skill.

So, I find strange this meaning that you give for a "draft" = that it is something that has connotation of throwing away when building the final product.

I think you are confusing a "draft" with a "sketch" and they are not the same.

I googled the term "draft" and here is what I found:

> "a version of something (such as a document) that you make before you make the final version" [1]

> "A preliminary version of a piece of writing." [2]

While "sketch" means:

> "a rough drawing representing the chief features of an object or scene and often made as a preliminary study" [3]

> "A rough or unfinished version of any creative work." [3]

> "A rough or unfinished drawing or painting, often made to assist in making a more finished picture." [4]

In the case of sketch I see some keywords like "preliminary study" or "assist" or "unfinished version" that indicates that the sketch will not be the final product.

So while it is true that a Draft could be thrown away if someone has new/better/different ideas while working on it, it does not seem to imply that a Draft should be thrown away when building the final product.

As far as I understand it is more that a draft will evolve into a final product or might be abandoned.

[1] https://www.merriam-webster.com/dictionary/draft

[2] https://www.lexico.com/definition/draft

[3] https://www.lexico.com/definition/sketch

[4] https://www.merriam-webster.com/dictionary/sketch

d1sxeyes · on Sept 15, 2021

Unfortunately, here, I think the problem is that English is not very clear.

As a native English speaker, the first time I encountered the word 'draft' was at primary school. We used it to describe a piece of writing where presentation was not the focus, instead, content and accuracy in terms of spelling, grammar, and punctuation would be the focus. Once the drafts were complete, we would 'copy these up' in our neatest handwriting.

However, if I prepare a 'draft' of some document or other for my boss, I expect it to be essentially an unapproved version of a final document, perhaps needing some minor modifications before release, but also perhaps not.

Although personally, I would use 'sketch' to mean something disposable which illustrates a more perfect version, my grandmother is an artist, and she refers to the initial drawings she makes on the canvas as 'sketches', which she then paints over in more detail.

Essentially, I don't think there's a big difference between draft and sketch - both could (in my opinion) represent either a version which will be discarded or which will be developed further.

Overall, I think here, the 'Work In Progress' label is the clearest and least likely to be interpreted differently by different users.

avianlyric · on Sept 15, 2021

It's also interesting that in English we also have Drafters or draughtsmen/draughtswomen, whose job is drafting, which is the process of creating technical drawings for manufacture.[1]

In a highly technical environment this is the kind of work I associate with drafting, but not necessarily with the word draft.

The Wikipedia page for "Draft" has a veritable smorgasbord of different things that are considered "Drafts"[2], which kind of illustrates that getting pedantic about the definition of the word is a losing proposition.

But as a native English speaker as well, I agree with you that the everyday colloquial definition of the word "draft" is an incomplete piece of work that needs further refinement before it can be considered complete or final.

[1] https://en.wikipedia.org/wiki/Drafter

[2] https://en.wikipedia.org/wiki/Draft

DonHopkins · on Sept 15, 2021

But in the days of digital files, unlike the days of typewriters that put ink on paper, it's much easier to edit a draft into the final product.

Grollicus · on Sept 15, 2021

> I am not a native English speaker (and I don't work at Gitlab) but I am curious and interested in using proper terms, as I think as developer that naming is a very important skill.

That is it for me too and why I think this change is so annoying.

WIP means something is being worked on. [0]

Draft can mean the same, but it also has a bunch of other possible meanings. Note that if you search in both of your sources, you'll find "sketch" as an explanation. It can mean the same, but it can also mean a bunch of other things. It's a strictly worse name.

[0] https://en.wiktionary.org/wiki/WIP

capableweb · on Sept 15, 2021

Most likely the rename is in order to get more people to understand what it is without having to look up abbreviations. "Draft" is clear to most people who know English, while "WIP" puts a lot more burden on the reader in terms of what they already need to know in order to understand.

DonHopkins · on Sept 15, 2021

But they can just google WIP at work! ...Until Cardi B comes out with another song about her hoo haa haa with that title.

dmurray · on Sept 15, 2021

Yeah, I think this is fair.

A draft in the visual arts (drawing, painting), is typically abandoned, like a sketch. Perhaps it uses a different medium to the final version, and in any case can't easily be adapted.

A draft in writing is typically incrementally improved, or at least we think of it that way now that we write on computers. We don't need to start again even for substantial changes like adding a new paragraph.

mewpmewp2 · on Sept 15, 2021

On the internet, in most places I've seen "draft" used for something that is not yet published. For example, in a lot of blogging software like WordPress, you can save your post and it will be a "Draft", and it will only be visible to others when you publish, so it's not thrown away in most contexts where I know it to be used on the Internet.

dmurray · on Sept 15, 2021

Isn't it the same connotation here? The work is shared with your collaborators, but not yet "published" to the master branch or to customers or wherever?

mewpmewp2 · on Sept 15, 2021

Yeah, so in my opinion "draft" works well as a descriptor.

gerdesj · on Sept 15, 2021

Draft is also the depth of water displaced by a boat/ship.

Draught (sounds the same as draft) is the wind through a crack or a type of beer. A door or window might be draughty but a beer won't.

Drought looks similar but sounds like "drowt" and is what you get when there is a long period of time without rain.

Be careful with draft, draught and drought!

capableweb · on Sept 15, 2021

> which has connotations of throwing the draft/sketch away to build the final product

I think it's only software developers who urge others to first write a draft, then throw it away and start working on the real thing. Usually, a draft precedes the "real" version and it's a status attached to something. Eventually, a "draft" becomes "published" or something similar.

Authors, scientists, email writers, report creators, movie/music producers all create drafts that (maybe) eventually become the real thing. I don't think many of them throw away the draft but rather work on the draft until it's not a draft anymore.

justinclift · on Sept 15, 2021

> ... which has connotations of throwing the draft/sketch away to build the final product.

Interesting. What industry does it that way?

eurasiantiger · on Sept 15, 2021

GitHub is using ”draft”, maybe that is the reason.

hibbelig · on Sept 15, 2021

> it seems pretty obvious that they mean any file extension registered to a known MIME type.

I guess I'm dense, but I actually thought it's about users such as Mr. Joe R Text/Plain. When I read about "mov" in the actual issue, then it became clear.

DonHopkins · on Sept 15, 2021

Since file extensions can be any three or fewer (or even more) valid characters, then no string ending with zero to three characters, in other words, no user names are valid.

AlfeG · on Sept 15, 2021

I guess the author of the ticket was referring to the internal type name mime_type

id5j1ynz · on Sept 15, 2021

This seems on par with the general GitLab-style. Is anybody else getting a bit frustrated with them?

They keep on having high-severity security bugs being fixed every month (e.g. auth checks not being done everywhere). Then there's all these odd edge case bugs everywhere.

As an outsider, it just seems to me that GitLab isn't being engineered in a principled way: on sound abstractions and with separation of concerns (e.g. auth should be some universal middleware, not ad-hoc per call). Just really basic stuff.

john_cogs · on Sept 15, 2021

GitLab team member here. I'd like to add some additional context to my previous comments [1][2].

Due to a security concern in which a profile containing a file extension would not load [3], we do not allow usernames that end with file extensions (ex: .mov). As noted by many folks here, these are associated with a MIME type but are not MIME types themselves. It is not related to preventing an injection or any such attack vector.

The error message for this check incorrectly included MIME type rather than file extension. This has been updated [4].

Additionally, there was an issue with the actual check, as it did not include the leading dot. The leading dot was added to the check in a subsequent MR [5].

Thanks for all the feedback.

1 - https://news.ycombinator.com/item?id=28535739

2 - https://news.ycombinator.com/item?id=28538166

3 - https://gitlab.com/gitlab-org/gitlab/-/issues/26295

4 - https://gitlab.com/gitlab-org/gitlab/-/merge_requests/70374/...

5 - https://gitlab.com/gitlab-org/gitlab/-/merge_requests/65954

Arnavion · on Sept 15, 2021

>Due to a security concern in which a profile containing a file extension would not load [3], we do not allow usernames that end with file extensions (ex: .mov).

Why did you not fix your routing engine to not consider file extensions where the username / group name should go?

bogwog · on Sept 15, 2021

gitlab profile URLs are `gitlab.com/<username>`, so a user with the name "dashboard.html" would have the URL `gitlab.com/dashboard.html`, which obviously conflicts with the existing dashboard.html file.

Besides blacklisting certain usernames or breaking a bunch of links to profiles, how would you fix that?

EDIT: IIRC, github has the same issue, but they have profiles as lower priority. So if your username conflicts with an existing URL, your profile page doesn't work.

smallbizdev420 · on Sept 15, 2021

Isn't the problem arising because GitLab's files are in the global namespace? If the user is the namespace for all their files, and GitLab files were under a GitLab user, this wouldn't be a problem. Under the current implementation, every time you add a file, you have to make sure its name doesn't conflict with an existing profile. And a username has to avoid conflict with all present and future filenames. Mutual pain doesn't seem like a good way forward.

jrochkind1 · on Sept 15, 2021

I can think of a few possible ways to fix it.

1. deny-list only usernames that are actually existing conflicts

2. Change the URL for only usernames that have conflicts, to `https://gitlab.com/u/<username>`.

3. Change the URL for _all_ usernames to `gitlab.com/u/<username>` as this collision points out the flaw in the original URL design in the first place, because of possible collisions. 301 redirects could of course be used for any non-colliding usernames.

I am now wondering how _github_ takes care of it though. Github also has `github.com/<username>` urls. What does it do with collisions? Github pages don't even all end in `.html` or contain a `.` at all, so gitlab's particular solution would not work. For instance, there is a page `https://github.com/topics`. What happens if you try to create a github user called "topics"?

If I try to create one, it says "Username 'topics' is unavailable." Same for say `marketplace` or `trending`. Perhaps they've deny-listed only actually-existing github urls? That does seem tricky, whenever they want to create a new top-level /page on github, they can only do it if there isn't already a github account with that name?

But if as someone else says `/dashboard.html` is just a weird non-canonical alternate for `/dashboard`, which already had to be reserved, maybe gitlab is already doing (1) anyway? Then why do they need to also deny any username with ending in any valid extension? Unclear.

It still makes me wonder if they have a routing precedence problem, which they worked around by just forbidding any username that triggered it, instead of fixing the actual issue.
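
Options 1 and 3 from the list above can be sketched roughly as follows. Everything here is hypothetical: the reserved-path set, the function names, and the `/u/` prefix handling are illustrations of the idea, not a description of how GitLab or GitHub route requests.

```python
from typing import Optional, Set

# Hypothetical set of top-level paths that the site itself serves.
RESERVED_PATHS = {"dashboard", "dashboard.html", "help", "explore", "topics"}


def username_allowed(username: str) -> bool:
    """Option 1: reject only names that collide with an existing path."""
    return username.lower() not in RESERVED_PATHS


def profile_url(username: str) -> str:
    """Option 3: put every profile under /u/ so collisions cannot occur."""
    return f"/u/{username}"


def legacy_redirect(path: str, known_users: Set[str]) -> Optional[str]:
    """Return a redirect target for an old-style /<username> URL, if any."""
    name = path.lstrip("/")
    if name.lower() in RESERVED_PATHS or name not in known_users:
        return None  # site page or unknown name: no redirect
    return profile_url(name)
```

The trade-off raised in the comment still applies to option 1: every new top-level page has to be checked against existing accounts, whereas the prefixed scheme removes the shared namespace entirely.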

jhugo · on Sept 15, 2021

In what situation would someone be requesting `https://gitlab.com/dashboard.html`? When I go there, I get the exact same page as I get at `https://gitlab.com/`, why was it necessary to support both URLs? Now they're stuck with it of course, if anyone actually uses /dashboard.html, but surely they could just special-case filenames that actually exist, just like they presumably special-case URLs they use like /help already. It doesn't seem necessary to blanket-ban anything with a file extension.

remram · on Sept 15, 2021

Or gitlab.com/dashboard. GitLab, GitHub, etc already have a need to reserve specific usernames (like `org`, `settings`, `projects`, `new`, `explore`, `marketplace`, `help`, ...). Since you already have to blocklist specific names not containing extensions, I really don't see how banning extensions help them.

Hopefully we'll know more once their security ticket becomes public.

Arnavion · on Sept 15, 2021

For dashboard.html, sure. Fixing that requires making a breaking change to URLs.

My comment, and the issue that was submitted here, is about *.mov

boleary-gl · on Sept 15, 2021

GitHub doesn't allow a "." or really any special characters besides "-" and "_" in usernames

symlinkk · on Sept 15, 2021

Band aids on top of band aids. Respect for being honest and open about it though

stephenr · on Sept 15, 2021

I prefer the term “lipstick on a pig”.

bob1029 · on Sept 15, 2021

Why should the username matter? In my systems, I could have an insane URL like...

  https://myservice.com/my.super.duper.crazy-ass.user.name.pdf.exe

and still have it return a proper HTML document that covers that user's profile page. Hell, the username could be some insane zalgo-tier shit and still function properly.

I see some comments defending arbitrary "bandaid" architecture and I think that this is not defensible for something the scale of GitLab. This is basic HTTP stuff.

boleary-gl · on Sept 15, 2021

Of note, GitHub doesn't allow periods in usernames either. I'm not a Ruby expert but I wonder if file extensions give some specific Ruby gotchas that means both GitLab and GitHub operate this way.

gargron · on Sept 15, 2021

It's not a Ruby gotcha, it's a Rails routing gotcha. You can specify alternative formats on any path, e.g. you can access /path to get HTML (or whatever the "default" format for that controller method is) or you can access /path.json or /path.xml and if the controller method specifies handlers for those formats, you get that format. So if you allow a username like "john.doe" and the route is something like /john.doe then Rails will interpret the "john" as the ID part of the path and "doe" as the format part of the path. You can override this in your routes to support periods but then you do lose the capability of accessing alternative formats which sometimes can be useful.
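
The ambiguity described above can be illustrated with a small sketch. This is not Rails code and does not reproduce its matching rules; it only mimics, as a simplification, a router that treats whatever follows the last dot of the final path segment as a format. The function name is made up.

```python
from typing import Optional, Tuple


def split_id_and_format(segment: str) -> Tuple[str, Optional[str]]:
    """Split a final path segment at its last dot, the way a router
    with optional format suffixes might."""
    name, dot, suffix = segment.rpartition(".")
    if not dot or not name or not suffix:
        return segment, None  # no usable suffix: whole segment is the ID
    return name, suffix
```

Under this rule a request for `/john.doe` is read as ID "john" with format "doe", so the user "john.doe" is never looked up, while `/dnsmichi.keys` conveniently resolves to user "dnsmichi" with format "keys".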

dnsmichi · on Sept 15, 2021

Thanks. Now I also get a better understanding of why .patch for MRs or .keys for user names as file extension work on both Rails platforms. I always found these file extension hacks very useful for quick access and automation.

Examples:

https://gitlab.com/gitlab-org/gitlab/-/merge_requests/70427....

https://gitlab.com/dnsmichi.keys

e12e · on Sept 15, 2021

Ruby? No. Rails? Perhaps. (Both github and gitlab are built with ruby on rails).

For an old post on a somewhat related topic, see:

https://ryanbigg.com/2009/04/how-rails-works-2-mime-types-re...

I could imagine the mix of rails #respond_to and "file extensions" at the end of urls might make a mess (think /users/profile/smith.html vs /smith vs /smith.json vs /smith.txt - essentially what might have been /smith?format=json etc).

Ed: current documentation: https://apidock.com/rails/v6.1.3.1/ActionController/MimeResp...

boleary-gl · on Sept 15, 2021

I meant to say Rails.

Note to self: Never say Ruby when you mean Rails on HN

thrwn_frthr_awy · on Sept 15, 2021

Some apps like to allow usernames to be used as sub domains and so periods are not allowed.

dnsmichi · on Sept 15, 2021

Hi, how would you address this in GitLab's code? Maybe you'd like to create a merge request with suggested fixes :)

https://news.ycombinator.com/item?id=28540665 has all URLs and issues available to get started in the code.

vultour · on Sept 15, 2021

It’s not his job to fix a paid product.

bob1029 · on Sept 15, 2021

I would begin by familiarizing myself with the Content-Type HTTP header:

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Co...

As for actually performing this change myself, it would take me quite some time to grok the codebase.

Maybe I get bored tonight and see how hard this would be to resolve.

pechay · on Sept 15, 2021

Once had my Australian production system go down because a js plugin we were using had added .au to its list of media types. Integration had a different TLD. Response I got from the author was 'LOL!'. :)

CodesInChaos · on Sept 15, 2021

Why is the application treating a TLD like a file extension?

detaro · on Sept 15, 2021

probably because some broken code matched on the end of the URL, without checking if it was just the naked domain?

hinkley · on Sept 15, 2021

Every programming language has a library to parse urls.

Please, for the love of your own mothers, people: stop pretending like you know how to parse urls like they're strings. You don't. And even if you can, you won't do it right every time.

It's been demonstrated over and over again.

Just use the already tested url facilities to give you the host/path/query parameters
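
As a concrete illustration of that advice, here is a minimal sketch using Python's standard `urllib.parse`. The example URLs and helper names are hypothetical; the point is only that a raw string match sees ".au" at the end of a bare domain, while a parsed URL keeps host, path, and query apart.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit


def url_extension_naive(url: str) -> str:
    """Treats whatever follows the last dot in the raw string as an extension."""
    return url.rsplit(".", 1)[-1] if "." in url else ""


def url_extension(url: str) -> str:
    """Looks for an extension in the path component only."""
    path = urlsplit(url).path
    return posixpath.splitext(path)[1].lstrip(".")


def split_url(url: str):
    """Return (host, path, query parameters) using the library parser."""
    parts = urlsplit(url)
    return parts.hostname, parts.path, parse_qs(parts.query)
```

For `https://example.com.au` the naive version reports "au" as a file extension, whereas the parsed version reports none, because the path is empty.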

cmpb · on Sept 15, 2021

I'm going to have to find a way to work "naked domain" into my lexicon today, thanks!

detaro · on Sept 15, 2021

On second thought, I actually used it wrong above: usually, naked domain specifically refers to the registered domain, i.e. without any subdomains.

quesera · on Sept 15, 2021

Yet another reason Australia should have kept TLD .oz!

ncann · on Sept 15, 2021

Looks like the fix restricted the check to "usernames that end with dot and MIME type"

Still, what is the attack vector?

anentropic · on Sept 15, 2021

Probably: some web frameworks do content negotiation by appending a content type like .json to the end of the url

Not sure if it's an attack vector per se, or just that the behaviour is incompatible with allowing usernames containing . and then having urls where the username is the last segment of the url

seems like a badly designed url scheme :)
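
A rough Python sketch of why that clashes with dotted usernames. The format list and the `split_format` helper are hypothetical and are not how Rails or GitLab actually implement it; they only show the ambiguity when the username is the last path segment.

```python
# Hypothetical set of formats a framework might recognise as a URL suffix.
KNOWN_FORMATS = {"json", "xml", "html"}


def split_format(path: str):
    """Split a trailing '.<format>' off a URL path, if it is a known format.

    Returns (resource_path, format), where format is None when no known
    suffix is present.
    """
    head, dot, tail = path.rpartition(".")
    if dot and tail in KNOWN_FORMATS:
        return head, tail
    return path, None
```

Under these assumptions `/toto.json` resolves to the resource `/toto` rendered as JSON, so a user literally named `toto.json` would have no profile URL of their own.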

jfrunyon · on Sept 15, 2021

> some web frameworks do content negotiation by appending a content type like .json to the end of the url

This has always disturbed me, considering that HTTP has had content negotiation for ... oh, basically its entire history [https://www.w3.org/Protocols/HTTP/1.0/spec.html#Accept].

Macha · on Sept 15, 2021

For non-programmatic usage and verification, the extensions can be easier?

On a similar topic, HIBP allowed people to request a version via a custom HTTP header, an Accept content type, or a version segment in the URL path, and approximately everyone went with option 3.

jfrunyon · on Sept 16, 2021

"For non-programmatic usage and verification," it should be served in a human-readable manner, typically including things like formatting, and maybe even syntax coloring if you're feeling nice.

Take my `Accept: text/html` and give me my HTML-ified JSON, dangit! ;)

Macha · on Sept 16, 2021

I've actually seen APIs with "pretty=true" or "indent=4" type query parameters to emit formatted JSON for people before, though it doesn't go as far as HTML with syntax colouring.

CoffeeOnWrite · on Sept 15, 2021

Yes, the content negotiation you describe is a very longstanding default behavior of Rails. It should probably be made opt-in rather than opt-out in the next major version.

walty8 · on Sept 15, 2021

But in the screen capture in the article, the username is actually 'issac.asimov', i.e. the MIME type does not immediately follow the dot.

nobody9999 · on Sept 15, 2021

>But in the screen capture in the article, the username is actually 'issac.asimov', i.e. the MIME type does not immediately follow the dot.

A variation on the Scunthorpe Problem[0] then, eh?

[0] https://en.wikipedia.org/wiki/Scunthorpe_problem

whizzter · on Sept 15, 2021

Somebody probably put in a regexp like .mov$ . In a regexp an unescaped dot (.) matches any single character (and $ matches the end), so the dot consumes the i in asimov and the rest of the match succeeds.
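
Whether or not that is what actually happened here, the unescaped-dot behaviour itself is easy to demonstrate. A small Python sketch (the patterns are my own guess at what such a check could look like, not anything from GitLab's code):

```python
import re

# Unescaped dot: matches ANY single character followed by "mov" at the end.
unescaped = re.compile(r".mov$")

# Escaped dot: matches only a literal "." followed by "mov" at the end.
escaped = re.compile(r"\.mov$")


def matches(pattern: re.Pattern, username: str) -> bool:
    # search() looks for the pattern anywhere; "$" pins it to the end.
    return pattern.search(username) is not None
```

With the unescaped pattern, `issac.asimov` matches because the dot consumes the "i" before "mov". With the escaped pattern only names that really end in `.mov` match.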

chippiewill · on Sept 15, 2021

You can see the fix they made in the linked MR.

It wasn't a regex, they just did a generic "ends with" check.
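
Roughly the difference, sketched in Python rather than GitLab's Ruby. The extension list and function names are invented for the example and this is not the code from the MR:

```python
# Hypothetical list of blocked file extensions.
EXTENSIONS = ("mov", "mp4", "json")


def rejected_before(username: str) -> bool:
    # Generic "ends with" check: any name whose last letters happen to
    # spell an extension is rejected, dot or no dot.
    return username.lower().endswith(EXTENSIONS)


def rejected_after(username: str) -> bool:
    # Restricted check: only names ending with a dot plus the extension.
    dotted = tuple("." + ext for ext in EXTENSIONS)
    return username.lower().endswith(dotted)
```

Here `issac.asimov` is rejected by the first check and accepted by the second, while `toto.mov` is rejected by both.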

iechoz6H · on Sept 15, 2021

Perhaps the sub-clause is redundant there?

'The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, Lincolnshire, England, from creating accounts with AOL, because the town's name contains the substring "cunt".'

nobody9999 · on Sept 15, 2021

>'The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, Lincolnshire, England, from creating accounts with AOL, because the town's name contains the substring "cunt".'

Right. Regardless of the specific pattern matching function, in both cases, the results were both incorrect and unwanted. Which is why I consider this instance to be a variation on the same issue.

ajkjk · on Sept 15, 2021

That was before the fix.

boomskats · on Sept 15, 2021

This doesn't look like a security issue, unless I'm missing something.

paxys · on Sept 15, 2021

Definitely a security issue.

- The merge request which originally added this check is inaccessible (https://gitlab.com/gitlab-org/security/gitlab/-/merge_reques...)

- In the issue comments the Gitlab employee says "Sorry, I cannot go into details right now. I will link the issue here once it goes public, is it ok?"

nine_k · on Sept 15, 2021

It could potentially be exploited in a very interestingly crafted email, where there's a link to download something (e.g. the source tarball, or a build artifact) with a URL containing the username, or being otherwise close by, so that the downloaded file would be interpreted differently. But I'm not creative enough at this hour to suggest a working exploit.

dolmen · on Sept 15, 2021

I suspect a case of impersonating a user whose name doesn't have the suffix. E.g. create user "toto.mov" to take over some resources of user "toto".

amjd · on Sept 15, 2021

Maybe it's something to do with a MIME sniffing attack. The user profile URL may be detected as a different MIME type by the browser based on the extension: https://gitlab.com/myname.js

I'm not sure how one could exploit it though...