Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix int overflow in cmdbuf number parser #243

Closed
wants to merge 3 commits into from

Conversation

kpcyrd
Copy link

@kpcyrd kpcyrd commented Jan 9, 2022

Checking for an overflow after it has occurred is too late, so the overflow check needs to happen before the calculation.

Reproduce

Compile with -fsanitize=undefined

CFLAGS='-fsanitize=undefined' LDFLAGS='-fsanitize=undefined' sh configure
:18446744073709551616<enter>

Shoutout to @c3h2_ctf

Checking for an overflow after it has occurred is too late, so the overflow
check needs to happen before the calculation.

Reproduce:

Compile with -fsanitize=undefined

    CFLAGS='-fsanitize=undefined' LDFLAGS='-fsanitize=undefined' sh configure

:18446744073709551616<enter>
@gwsw
Copy link
Owner

gwsw commented Jan 9, 2022

The problem with this solution is it limits the number to INT_MAX, while POSITION (that is, off_t) may be larger than an int. Is there an actual problem with the functionality of the current code, other than the warning when compiled with -fsanitize=undefined?

@stoeckmann
Copy link
Contributor

I quote the wikipedia article to integer overflows regarding the C language:

In the C programming language, overflow of unsigned integer types results in wrapping, however overflow of signed integer types is undefined behavior; consequently a C compiler is free to assume that the programmer has ensured that signed overflow cannot possibly occur and thus it may silently optimise out any check subsequent to the calculation that involves checking the result to detect it without giving the programmer any warning that this has been done.

An aggressively optimizing compiler might remove the current check.

We had a discussion about the INT_MAX limit as well and took the smaller one (64 bit vs 32 bit), assuming that the limit is large enough for realistic use cases. But there are two possibilities.

The first one is the easiest: A preprocessor macro, calling sizeof(LINENUM) == 8 ? INT64_MAX : INT_MAX instead of the fix INT_MAX version.

The second one is a bit more complex but has other advantages as well:

Signed overflows also exist in lstrtoi and lstrtopos. We started to implement checks there as well and created a macro which retrieves the maximum size of LINENUM aka off_t. But we were not sure what the impact of our changes would be, so it would take a bit longer for us to come up with a patch.

So if lstrtopos (or newly added lstrtolinenum) has a proper check, we could remove the duplicated code and call lstrtopos/lstrtolinenum here as well.

@gwsw
Copy link
Owner

gwsw commented Jan 11, 2022

I like the second option, fixing the overflows in lstrto* and using lstrtolinenum in cmd_int. cmd_int is somewhat overloaded: the return value is sometimes used as a POSITION and sometimes as a LINENUM as well as as an int, so proper type handling will be a bit complicated.

Reuse fixed lstrto* functions where possible to reduce duplicated code.
@stoeckmann
Copy link
Contributor

I have added a commit which address more integer overflows and fixed the lstrto* functions as good as possible. It assumes that "unsigned long long" will be the largest unsigned type. Or at least large enough to cover off_t. If casts reveal that truncation occurs then -1 is returned. Since negative values cannot be valid (minus sign is not supported) the callers can check for such error condition.

The commit contains many changes, unfortunately. At least it is always the same pattern, so review should be manageable. I am not sure if I have manually tested all callers though.

@gwsw
Copy link
Owner

gwsw commented Jan 23, 2022

One problem with this solution is the assumption that the compiler supports "long long" (and LLONG_MAX). Less currently does not use the long long type anywhere. I am pretty sure this will cause less to fail to build in some environments in which it currently builds.

We can check for long long during configure call. If it is available
then use it for string to integer conversion. Otherwise fall back to
long, which is also default behavior if off_t is not available.
@stoeckmann
Copy link
Contributor

I have added a check to configure.ac to see if long long is available. If it is, then use long long. Otherwise use long. I guess this is the most portable solution we can have without introducing "arbitrary" limits...

@eggert
Copy link
Contributor

eggert commented Jan 29, 2023

Inspired by this PR, I created PR #331 that I think does everything this PR does, and in addition addresses the above comments and fixes some other integer overflows as well, as well as replacing less's only use of floating point with exact integer arithmetic that does not overflow. Although PR #331 doesn't fix all the integer-overflow issues in less I hope it fixes the most likely ones.

@gwsw
Copy link
Owner

gwsw commented Jan 30, 2023

Superseded by #331.
@eggert, I made some small style changes and fixed a few places where C99 syntax was used. Let me know if you see any issues with my changes.

@gwsw gwsw closed this Jan 30, 2023
@eggert
Copy link
Contributor

eggert commented Jan 30, 2023

Thanks, looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants