Fix int overflow in cmdbuf number parser #243

kpcyrd · 2022-01-09T15:13:05Z

Checking for an overflow after it has occurred is too late, so the overflow check needs to happen before the calculation.

Reproduce

Compile with -fsanitize=undefined

CFLAGS='-fsanitize=undefined' LDFLAGS='-fsanitize=undefined' sh configure

:18446744073709551616<enter>

Shoutout to @c3h2_ctf

Checking for an overflow after it has occurred is too late, so the overflow check needs to happen before the calculation. Reproduce: Compile with -fsanitize=undefined CFLAGS='-fsanitize=undefined' LDFLAGS='-fsanitize=undefined' sh configure :18446744073709551616<enter>

gwsw · 2022-01-09T18:11:10Z

The problem with this solution is it limits the number to INT_MAX, while POSITION (that is, off_t) may be larger than an int. Is there an actual problem with the functionality of the current code, other than the warning when compiled with -fsanitize=undefined?

stoeckmann · 2022-01-09T20:57:13Z

I quote the wikipedia article to integer overflows regarding the C language:

In the C programming language, overflow of unsigned integer types results in wrapping, however overflow of signed integer types is undefined behavior; consequently a C compiler is free to assume that the programmer has ensured that signed overflow cannot possibly occur and thus it may silently optimise out any check subsequent to the calculation that involves checking the result to detect it without giving the programmer any warning that this has been done.

An aggressively optimizing compiler might remove the current check.

We had a discussion about the INT_MAX limit as well and took the smaller one (64 bit vs 32 bit), assuming that the limit is large enough for realistic use cases. But there are two possibilities.

The first one is the easiest: A preprocessor macro, calling sizeof(LINENUM) == 8 ? INT64_MAX : INT_MAX instead of the fix INT_MAX version.

The second one is a bit more complex but has other advantages as well:

Signed overflows also exist in lstrtoi and lstrtopos. We started to implement checks there as well and created a macro which retrieves the maximum size of LINENUM aka off_t. But we were not sure what the impact of our changes would be, so it would take a bit longer for us to come up with a patch.

So if lstrtopos (or newly added lstrtolinenum) has a proper check, we could remove the duplicated code and call lstrtopos/lstrtolinenum here as well.

gwsw · 2022-01-11T05:45:34Z

I like the second option, fixing the overflows in lstrto* and using lstrtolinenum in cmd_int. cmd_int is somewhat overloaded: the return value is sometimes used as a POSITION and sometimes as a LINENUM as well as as an int, so proper type handling will be a bit complicated.

Reuse fixed lstrto* functions where possible to reduce duplicated code.

stoeckmann · 2022-01-14T20:41:14Z

I have added a commit which address more integer overflows and fixed the lstrto* functions as good as possible. It assumes that "unsigned long long" will be the largest unsigned type. Or at least large enough to cover off_t. If casts reveal that truncation occurs then -1 is returned. Since negative values cannot be valid (minus sign is not supported) the callers can check for such error condition.

The commit contains many changes, unfortunately. At least it is always the same pattern, so review should be manageable. I am not sure if I have manually tested all callers though.

gwsw · 2022-01-23T19:10:46Z

One problem with this solution is the assumption that the compiler supports "long long" (and LLONG_MAX). Less currently does not use the long long type anywhere. I am pretty sure this will cause less to fail to build in some environments in which it currently builds.

We can check for long long during configure call. If it is available then use it for string to integer conversion. Otherwise fall back to long, which is also default behavior if off_t is not available.

stoeckmann · 2022-01-23T21:01:28Z

I have added a check to configure.ac to see if long long is available. If it is, then use long long. Otherwise use long. I guess this is the most portable solution we can have without introducing "arbitrary" limits...

eggert · 2023-01-29T21:00:51Z

Inspired by this PR, I created PR #331 that I think does everything this PR does, and in addition addresses the above comments and fixes some other integer overflows as well, as well as replacing less's only use of floating point with exact integer arithmetic that does not overflow. Although PR #331 doesn't fix all the integer-overflow issues in less I hope it fixes the most likely ones.

gwsw · 2023-01-30T18:19:28Z

Superseded by #331.
@eggert, I made some small style changes and fixed a few places where C99 syntax was used. Let me know if you see any issues with my changes.

eggert · 2023-01-30T18:47:36Z

Thanks, looks good.

Fix more integer overflows

a7851a4

Reuse fixed lstrto* functions where possible to reduce duplicated code.

Explicitly check for long long availability

f019ff2

We can check for long long during configure call. If it is available then use it for string to integer conversion. Otherwise fall back to long, which is also default behavior if off_t is not available.

eggert mentioned this pull request Jan 29, 2023

improved integer overflow fixes #331

Merged

gwsw closed this Jan 30, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix int overflow in cmdbuf number parser #243

Fix int overflow in cmdbuf number parser #243

kpcyrd commented Jan 9, 2022

gwsw commented Jan 9, 2022

stoeckmann commented Jan 9, 2022

gwsw commented Jan 11, 2022

stoeckmann commented Jan 14, 2022

gwsw commented Jan 23, 2022

stoeckmann commented Jan 23, 2022

eggert commented Jan 29, 2023

gwsw commented Jan 30, 2023

eggert commented Jan 30, 2023

Fix int overflow in cmdbuf number parser #243

Fix int overflow in cmdbuf number parser #243

Conversation

kpcyrd commented Jan 9, 2022

Reproduce

gwsw commented Jan 9, 2022

stoeckmann commented Jan 9, 2022

gwsw commented Jan 11, 2022

stoeckmann commented Jan 14, 2022

gwsw commented Jan 23, 2022

stoeckmann commented Jan 23, 2022

eggert commented Jan 29, 2023

gwsw commented Jan 30, 2023

eggert commented Jan 30, 2023