Null bytes are handled inconsistently #3110
Comments
Meanwhile,
jq 1.6 is an old version; I tried your example and I get a parse error:
$ printf '1\r\x00\n\x00\n1\n\x00 \x00' | jq
1
jq: parse error: Invalid numeric literal at line 2, column 0
So, if NUL is supposed to be whitespace as you are saying (have not checked), that is wrong; but it does not return
@SOF3 That is just standard JSON as specified in https://json.org. You cannot have literal ASCII control characters in JSON strings (with the exception of DEL, U+007F, as mentioned in the RFC).
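For anyone who wants to check the string rule against a second parser, here is a minimal sketch using Python's standard `json` module in its default strict mode. This is only an independent reference point, not a statement about jq's behavior; the helper name `accepts` is made up for illustration.

```python
import json


def accepts(text: str) -> bool:
    """Return True if Python's strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A literal NUL (U+0000) inside a string literal is a control character.
nul_in_string = accepts('"a\x00b"')

# A literal DEL (U+007F) is outside the U+0000..U+001F control range.
del_in_string = accepts('"a\x7fb"')

# The escaped form \u0000 is the valid way to carry a NUL in a string.
escaped_nul = accepts('"a\\u0000b"')

print(nul_in_string, del_in_string, escaped_nul)
```

The literal NUL is rejected, while DEL and the `\u0000` escape are accepted, which matches the reading of the RFC above.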
But the parser does seem to get confused by NUL when it is used as whitespace in the input:
$ printf '1\0 2 ' | jq # stops parsing after NUL
1
$ printf '1\0 2\n' | jq # treats NUL as whitespace
1
2
$ printf '1\r\x00\n\x00\n1\n\x00 \x00' | jq
1
jq: parse error: Invalid numeric literal at line 2, column 0
$ printf '1\x00\n\x00\n1\n\x00 \x00' | jq
1
jq: parse error: Invalid numeric literal at line 3, column 0
$ printf '1\x00\x00\n1\n\x00 \x00' | jq
1
1
Describe the bug
NUL bytes delimited by whitespace are sometimes parsed as the number 0 and sometimes not.
To Reproduce
(Btw, U+000D is a valid whitespace character according to RFC 8259, but does not seem to be included in the lexer. I am not familiar with flex so I don't know if there's some magic going on there)
jq/src/lexer.l
Line 133 in ed8f715
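As a reference for the whitespace question, the sketch below spells out the four whitespace code points from RFC 8259 (section 2) and checks them against Python's `json` module. The names `RFC8259_WS` and `is_json_ws` are hypothetical, and this says nothing about how the flex rules in `lexer.l` actually treat U+000D.

```python
import json

# The four whitespace code points RFC 8259 allows around structural tokens.
RFC8259_WS = {
    0x20: "space",
    0x09: "horizontal tab",
    0x0A: "line feed",
    0x0D: "carriage return",
}


def is_json_ws(byte: int) -> bool:
    """Return True if the byte is insignificant whitespace per RFC 8259."""
    return byte in RFC8259_WS


cr_is_ws = is_json_ws(0x0D)   # carriage return is in the set
nul_is_ws = is_json_ws(0x00)  # NUL is not

# Python's json module, used only as an independent reference parser,
# also skips CR around a value.
value_with_cr = json.loads("\r1\r\n")

print(cr_is_ws, nul_is_ws, value_with_cr)
```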
Expected behavior
To be honest, I don't know what to expect for null bytes, but I would expect them to be something more consistent.
RFC 8259 does not permit NUL bytes as input, so it is reasonable (although probably unnecessary) to treat them, when outside string literals, either as invalid characters or whitespace. But magically creating a Number(0) value does not look right.
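To make the "invalid character" option concrete, here is a rough sketch of a pre-check that rejects any raw NUL byte before the input reaches a parser. It is not how jq works or should be patched; `nul_offsets` and `reject_nul` are hypothetical helpers. Since a raw NUL is not valid inside a string literal either, the check does not need to track whether it is inside a string.

```python
def nul_offsets(data: bytes) -> list[int]:
    """Return the byte offsets of every raw NUL in the input."""
    return [i for i, b in enumerate(data) if b == 0]


def reject_nul(data: bytes) -> bytes:
    """Return the input unchanged, or raise if it contains a raw NUL."""
    offsets = nul_offsets(data)
    if offsets:
        raise ValueError(f"NUL byte at offset {offsets[0]}")
    return data


# One of the inputs from this issue, as raw bytes.
sample = b"1\r\x00\n\x00\n1\n\x00 \x00"
found = nul_offsets(sample)

# Input without NUL bytes passes through untouched.
clean = reject_nul(b"1 2\n")

print(found, clean)
```

With a policy like this, every input in the thread that contains a NUL would fail in the same way, instead of depending on where the NUL sits relative to newlines.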
Environment (please complete the following information):
Additional context