Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tarantool JIT engine #17

Open
wants to merge 550 commits into
base: tarantool
Choose a base branch
from

Conversation

Brugarolas
Copy link
Owner

No description provided.

Mike Pall and others added 30 commits March 10, 2023 17:45
Thanks to Patrick Galizia.

(cherry picked from commit b33e3f2)

Constant rematerialization must not use other registers that contain
constants, if the register is in use. When we have the high register
pressure we can face the following issue:

The assembly of an IR instruction allocates a constant into a free
register. Then it spills another register (due to high register
pressure), which is rematerialized using the same constant (which it
assumes is now in the allocated register). In case when the first
register also happens to be the destination register, the constant value
is modified before the rematerialization.

For the code in the test for this commit we get the following register
allocation order (read from top to bottom (DBG RA reversed)):
| current IR | operation | IR ref | register
|  0048         alloc       0038     x0
|  0048         remat       K038     x0
|  0048         alloc       K023     x4

Which leads to the following asembly:
| ...
| add   x4, x4, x0    # x4 modified before x0 rematerialization
| ldrb  w4, [x4, LuaJIT#24]
| add   x0, x4, LuaJIT#24   # constant x0 rematerialization
| ...
As a result, the value register x0 holding is incorrect.

This patch moves allocation of constants earlier to be sure that the
rematerialization can not make use of the same constant as one of the
sources of the IR instruction.

After the patch register allocation order is the following:
| current IR | operation | IR ref | register
|  0048         alloc       K023     x4
|  0048         alloc       0038     x0
|  0048         remat       K038     x0

Also, this patch fixes the `asm_fusexref()` logic for the `IR_STRREF` in
case, when both operands don't fit in 32-bit constants (`asm_isk32()`
fails). We want to use the IR operand holding the referenced value in
`ra_alloc1()` as one having the hint set (`ra_hashint()` check passes).
It is set for the operand with a non constant value (`irref_isk()`
fails). The code assumes that it is always the `ir->op1` operand, so for
the case when this value holds `ir->op2` operand the register allocator
misses the aforementioned hint in `ir->op2`. As the result the wrong
register is selected. This patch adds the corresponding `irref_isk()`
check for the `ir->op1` to detect which operand contains the value with
the hint.

After the patch the resulting assembly is the following:
| ...
| add   x4, x0, x4
| ldrb  w4, [x4, LuaJIT#24]
| add   x0, x1, LuaJIT#112
| ...

As we can see the constant is rematerialized from another, non-modified
register.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8069

Reviewed-by: Sergey Ostanevich <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
(cherry-picked from commit 90e6551)

The C library loader uses `dlopen` under the hood, which fails, if
provided library path is longer than PATH_MAX. PATH_MAX is
4096 bytes by default, so a corresponding check is added to
`ll_loadfunc`.

Maxim Kokryashkin:
* added the description and the test for the problem

Part of tarantool/tarantool#8069

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
(cherry picked from commit 7e662e4)

The accessing of memory address for some operation `emit_rma()` may be
encoded in one of the following ways:
 a. If the offset of the address being accessed from the dispatch table
    (pinned to r14 that is not changed during trace execution) fits into
    32-bit, then encode this as an access to 32-bit displacement
    relative to r14.
 b. If the offset of the address being accessed from the mcode (i.e.
    rip) fits into 32-bit, then encode this as an access to 32-bit
    displacement relative to rip (considering long mode specifics and
    `RID_RIP` hack).
 c. If the address doesn't fit into 32-bit one and we use `mov` or
    `movsd`, then encode 64-bit load from this address.
 d. Elsewhere, encode it as an access to 32-bit (the address should fit
    into 32-bit one) displacement (the only option for non-GC64 mode).

So, each instruction in GC64 mode that differs from `mov` or `movsd`
should be encoded via the last option. But if we get a 64-bit address
with a big enough offset, it can't be encoded, so the assertion in
`ptr2addr()` fails.

There are several cases, when `emit_rma()` is used with a non-`mov`
instruction:
* `IR_LDEXP` with `fld` instruction for loading constant
   number `TValue` by address.
* `IR_OBAR` with the corresponding `test` instruction on
  `marked` field of `GCobj`.
All these instructions require an additional register to store value by
address. We can't truly allocate a register here due to the possibility
to break IR assembling which depends on specific register usage. So, we
use and restore r14 here for emitting.

Also, this patch removes `movsd` from condition from the `x86Op` type
check, as far as it is never used for the `emit_rma()` routine (see also
`emit_loadk64()` for details).

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8069

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
It is impossible to run gdb on M1 devices, the only available
debugger is lldb. The luajit-gdb extension doesn't work with
lldb, so this patch introduces the luajit-lldb extension,
which re-implements exactly the same functionality.

It is worth noting that the extension is called
`luajit_lldb.py` instead of `luajit-lldb.py` because lldb
imports it as a Python module, and it is prohibited to use
dashes in a python module name.

Part of tarantool/tarantool#4808

Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Yichun Zhang.

(cherry picked from commit 850f8c5)

The `ASMREF_L` reference is defined as `REF_NIL`, so it isn't considered
as a 64-bit address. On GC64 mode it may lead to the following assembly:
| mov eax, edi
so, the high 32 bits of the reference are lost.

This patch adds `IRT_NIL` to `IRT_IS64` mask, to consider `ASMREF_L`
64 bit long. Now the resulting assembly is the following:
| mov rax, rdi

False-positive `if` condition in <src/lj_asm.c> is OK, since `op12`
already initialized as 0.

False-positive `if` condition in <src/lj_opt_sink.c>, <src/lj_opt_split.c>,
<src/lj_record.c> is OK, since `REF_NIL` is the last reference before
`REF_BASE` and this iteration of a cycle is still the last one.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8516

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Ostanevich <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
(cherry-picked from commit a32aead)

This patch introduces handling of errors from internal helper
functions on traces. FFI C++ exception interoperability is
not yet implemented.

For each throwing trace, its mcode entry is augmented with a
DWARF2 frame description entry and a common information entry.
After that, a dynamic DWARF2 frame info is registered based on
that entry with `__register_frame()`[1][2], which is just an
adapter to `__unw_add_dynamic_fde`[3] on OSX. Because the ARM32
architecture lacks the `__register_frame`, unwinding is not
supported on it.

It is important to notice, that both the CIE and FDE are
used on Linux, and only the FDE is used on OSX. The
CIE is unnecessary on OSX, which can be clearly seen in
the sources[3] of Apple's libunwind: there is an attempt
to parse it, however its data is unused. In the same time,
the CIE is required on Linux[4] to perfrom dynamic frame
registration.

For each throwing function call, a snapshot is allocated.
When we have a parent trace, our side trace head requires
an additional snapshot allocation, so the additional
`asm_snap_prev()` call is added.

The `lj_err_trace()` is introduced to use instead
`lj_err_run()` for throwing the error on trace.

The following fields were added to the ASMState structure:
* `snapalloc` -- flag showing whether the current snapshot
needs allocation.
* `mctoporig` -- holds the pointer to the top of the
generated mcode, including the DWARF entries, if present.

And the following fields were added to the SnapShot structure:
* `mcofs` -- offset into machine code in  MCode units,
needed to skip the DWARF entries, if present.
* `exitcode` -- exit code from unwound trace.

The following registers were chosen to act as EHRAREG
(Exception Handler Return Address Register) on each platform:
* X86 `eip` (8)
* X64 `rip` (16)
* ARM `lr` (14)
* ARM64 `lr` (30) maps to x30
* PPC `lr` (65) maps to SPR8
* MIPS `$ra` (31) maps to $31

Also, introduction of `lj_err_trace` changes the semantics of
`lj-603-err-snap-restore.test.lua`, since now those errors are handled
on trace. The test was modified corresponding to the updates.

Maxim Kokryashkin:
* added the description and the test for the problem

Part of tarantool/tarantool#7745
Part of tarantool/tarantool#8069

[1]: https://github.com/gcc-mirror/gcc/blob/ce83c3e492c2fa5a08c15b5f4619d58f42a5dcd0/libgcc/unwind-dw2-fde.c#L149
[2]: https://opensource.apple.com/source/libunwind/libunwind-201/libunwind/src/UnwindLevel1-gcc-ext.c.auto.html
[3]: https://opensource.apple.com/source/libunwind/libunwind-201/libunwind/src/libunwind.cpp.auto.html
[4]: https://github.com/gcc-mirror/gcc/blob/ce83c3e492c2fa5a08c15b5f4619d58f42a5dcd0/libgcc/unwind-dw2-fde.c#L711

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Broken on Fedora/ARM64. Reported by Yichun Zhang.

(cherry-picked from commit e957737)

This patch disables the assertion that failed because of
incorrectly constructed unwind information. That debug
info generation was fixed in the scope of
tarantool/tarantool#6096 with commit
b731df1 ("ARM64: Reorder
interpreter stack frame and fix unwinding.")
This patch is backported only for consistency.

Maxim Kokryashkin:
* added the description for the problem

Part of tarantool/tarantool#7745
Part of tarantool/tarantool#8069
Relates to tarantool/tarantool#6096

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
(cherry-picked from commit be251d9)

`_Unwind_Find_FDE()` will locate the FDE if the pc is in some
function that has an associated FDE. Note, Mac OS X 10.6 and
later, introduces "compact unwind info" which the runtime uses in
preference to DWARF unwind info. This function will only work if
the target function has an FDE but no compact unwind info.
The LuaJIT VM produces compact unwind info entries, so
DWARF's FDE is not found and that assertion fails.

Maxim Kokryashkin:
* added the description for the problem

Part of tarantool/tarantool#7745
Part of tarantool/tarantool#8069

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Victor Bombi, analyzed by XmiliaH. Thanks!

(cherry-picked from commit bf51d35)

If the `snapalloc` flag is set, then the allocation has not
occurred yet, meaning that rename is applied to the next
snapshot. Otherwise, refs are already allocated and rename
is applied to the current snapshot.

Maxim Kokryashkin:
* added the description and the test for the problem

Part of tarantool/tarantool#7745
Part of tarantool/tarantool#8069

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Prior to the patch using LUAJIT_USE_GDBJIT CMake option led to the build
failure due to the typo made in src/CMakeLists.txt in the very first
commit b8b0c41 ("build: replace GNU
Make with CMake") introducing CMake infrastructure. As a result of this
patch, everything works fine now.

Signed-off-by: Igor Munkin <[email protected]>
When we set `LUA_TEST_ENV_MORE` variable to be used in the additional
env command for run testing if `"` is used to wrap the `LD_LIBRARY_PATH`
value the content of this environment variable is literally
`"/abs/path1:/abs/path2:...:"`. So, the first entry is treated as the
relative path starting with `"`. In that case if we need the library to
be loaded via FFI for this particular test, that loading fails with the
error "cannot open shared object file", since the path to it is
incorrect.

This patch removes `"` wrapping for the aforementioned variables.

Without this patch, the rest patchset will lead to persistent failing of
the <tarantool-tests/lj-flush-on-trace.test.lua> as far as its
subdirectory (with helper C library for it) mentioned as the last one.

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
We need an instrument to write tests in plain C for LuaJIT, to be able:
* easily test LuaC API
* test patches without usage of plain Lua
* write unit tests
* startup LuaJIT with custom memory allocator, to test some GC issues
* maybe, in future, use custom hashing function to test a behavior
  of LuaJIT tables and so on.

The <test.c> module serves to achieve these goals without too fancy
features.

It's functionality inspired by CMocka API [1], but only TAP14 [2]
protocol is supported (Version of TAP set to 13 to be compatible with
old TAP13 harnesses).

The group of unit tests is declared like the following:

| void *t_state = NULL;
| const struct test_unit tgroup[] = {
| 	test_unit_def(test_base),
| 	test_unit_def(test_subtest),
| };
| return test_run_group(tgroup, t_state);

`test_run_group()` runs the whole group of tests, returns
`TEST_EXIT_SUCCESS` or `TEST_EXIT_FAILURE`.

If a similar group is declared inside unit test, this group will be
considered as a subtest.

This library provides an API similar to glibc (3) `assert()` to use
inside unit tests. `assert_[true,false]()` are useful for condition
checks and `assert_{type}_[not_,]_equal()` are useful for value
comparisons. If some assertion fails diagnostic is set, all test
considered as failing and finished via `longjmp()`, so these assertions
can be used inside custom subroutines.

Also, this module provides ability to skip one test or all tests, mark
test as todo, bail out all tests. Just use `return skip("reason")`,
`skip_all("reason")` or `todo("reason")` for early return. They should
be used only in the test body to make skipping clear.
`skip_all("reason")` may be used both for the parent test and for a
subtest. `bail_out("reason")` prints an error message and exits the
process.

As a part of this commit, tarantool-c-tests directory is created with
the corresponding CMakeLists.txt file to build this test library.
Tests to be rewritten in C with this library in the next commit and
placed as unit tests are:
* lj-49-bad-lightuserdata.test.lua
* misclib-getmetrics-capi.test.lua
* misclib-sysprof-capi.test.lua

For now the tarantool-c-tests target just build the test library without
new tests to run.

The library itself is tested via some primitive tests for `ok` case,
`skip` and `todo` directives. The TAP13 format is tested via prove, that
we are using for running our tests. TAP14 format is compatible with
TAP13, so there are no other tests required.

Also, .c_test suffix is added to the <.gitignore>.

[1]: https://github.com/clibs/cmocka
[2]: https://testanything.org/tap-version-14-specification.html

Part of tarantool/tarantool#7900

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
This header contains generic init and close functions for tests and
helpers for loading auxiliary Lua script with functions to run inside a
C test. See more information in the <README.md> file.

It will be properly used in the next commit.

Part of tarantool/tarantool#7900

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
This patch rewrites the aforementioned test with the usage libtest,
recently introduced. Since we still stand in need of a Lua helper script
for generation of traces, the original test file is reworked as a
standalone script, which returns the table with helper functions. Each
helper function is named the same as the test where it will be used. Now
single quotes are used according to our Lua code style.

In C part all asserts from glibc are replaced with the corresponding
assert_{cond} from the libtest. Now tests return `TEST_EXIT_SUCCESS` at
the finish. Also, the stack check for the amount of return results from
the helper function is slightly changed, since there is one more stack
slot in use (table with these functions). `snap_restores()` C function
duplicates 4 times for each subtest. Common helper
`check_snap_restores()` is used for each of them. Each error throwing is
replaced with `bail_out()` call.

NB: `lua_pop()` to clear the Lua stack after a call should be done before
any possible assertion, which would exit from the test leaving the stack
uncleaned.

All skipconds use macros defined in <lj_arch.h>, so it is included in
the test. As far as this test initializes the LuaJIT VM manually, there
is no need to check `jit.status()` result; checking `LJ_HASJIT` is
enough. Now, only JIT-related tests are skipped, when compiled without
JIT. Nevertheless, all tests are skipped for *BSD arches.

Also, this patch sets the new CMake variable named `LUAJIT_LIBRARY`
equal to `LUAJIT_LIB` in `PARENT_SCOPE` to be used in tarantool-c-tests
linking.

Follows up tarantool/tarantool#7900

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
This patch rewrites the aforementioned test with the usage libtest
recently introduced. The approach is similar to the previous patch.
`c_payload()` function to profile is now set up globally on script
loading. The payload for enabled and disabled JIT is the same. As far as
for sysprof testing is necessary to replace backtrace with libunwind
first, the payload tests are marked with `todo()`.

Nevertheless, glibc `assert()` still necessary to check the correctness
of the profile writer, so it is still used there.

Follows up tarantool/tarantool#7900
Relates to tarantool/tarantool#781

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
This patch rewrites the aforementioned test with the usage libtest
recently introduced. The approach is similar to the previous patch.

Nevertheless, glibc `assert()` is used to check the correctness
of the `mmap()` usage.

Follows up tarantool/tarantool#7900

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
The next patch introduces a separate JIT-related module with convenient
utils for JIT engine testing. Considering this change it looks vital to
make a structured utils distributed module instead of "all in one" Lua
chunk. As a result, the original utils.lua is split into the several
modules per subsystem to be tested (e.g. GC, frontend, profilers, etc.).

Lazy loading of the introduced submodules allows to use this utils in
all test chunks regardless LuaJIT configuration (e.g. with the JIT
engine disabled, without FFI support, etc) and do not spoil utils table
with the excess helpers.

Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
This commit adds a simple parser for traces to be analyzed in the test.
For now nothing too fancy at all -- just start `jit.dump` to a temporary
file and parse and remove this file when dump is stopped. The array with
resulting traces is returned.

For now, just find a single IR by pattern and return ref number and IR,
if they exist. More functionality may be added if it is necessary for
tests.

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Sergey Kaplun.

(cherry picked from commit 96fc114)

This commit is a follow-up for the commit
f067cf6 ("Fix narrowing of unary
minus."). Since this commit -0 IR constant is stored as well as the +0
constant on the trace. Since IR NEWREF keys aren't canonicalizied for -0
as opposed to IR HREFK, it may lead to inconsistencies during trace
recording.

In particular, since -0 and 0 are different IR constants, alias analysis
declares that they can't be aliased during folding optimization.
Therefore:
1) For the IR TNEW we have non-nil value to lookup from the table via
   HLOAD, when only nil lookup is expected due to alias analysis.
2) For the IR TDUP we have non-nil value to lookup from the table via
   HLOAD, but the template table has no 0 field initiated as far as -0
   isn't folding to 0 during parsing (see `bcemit_unop()` in
   <src/lj_parse.c>).
These cases lead to assertion failures in `fwd_ahload()`.

This patch adds the aforementioned canonicalization.

Sergey Bronnikov:
* reported the original issue for the TDUP IR

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8516

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Philipp Kutin.
Fix contributed by Peter Cawley.

(cherry-picked from commit 03a7ebc)

In a situation where a variable shift left bitwise rotation that
has a 64-bit result is recorded on an x86 64-bit processor and
the result is supposed to end up in the `rcx` register, that value
could be written into the `ecx` instead, thus being truncated into
32 bits. This patch fixes the described behavior, so now that
value is written into the `rcx`.

Resulting assembly changes from the following before the patch:
| rol rsi, cl
| mov ecx, esi

to the following after the patch:
| rol rsi, cl
| mov rcx, rsi

Importantly, the same behavior is impossible with the right
rotation on machines with BMI2 support because there is a
BMI2 instruction for it, so it is handled differently.

Maxim Kokryashkin:
* added the description and the test for the problem

Part of tarantool/tarantool#8516

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Thanks to Dimitry Andric.

(cherry picked from commit 7dbf0b0)

Function `bcsave.lua:bcsave_elfobj()` produced an object file in ELF
format with a wrong size size of `.strtab`. Wrong .strtab size causes
lld to show an error message:

```
$ luajit -b -n "module_name" -e "print()" xxx.obj
$ ld.lld xxx.obj
ld.lld: error: xxx.obj: SHT_STRTAB string table section [index 3] is non-null terminated
```

Starting null byte (\0) was forgotten to count, the patch fixes the
counting.

Sergey Bronnikov:
* added the description and the test for the problem

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Jason Carr.

(cherry picked from commit dd5032e)

When we call `lua_yield()` from the C hook, an additional continuation
frame is added. This frame contains a continuation function, PC where we
should return, a thread GC object to continue, and the frame type and
size (see details in <src/lj_frame.h>). For non-GC64 mode, when we set
the GC thread on the Lua stack, the stack top isn't incremented, so the GC
thread overwrites the PC to return. For the GC64 mode the increment is
missing before setting frame type and size.

This patch fixes the behaviour by adding the missing slot
incrementation. Also, it hardens the conditions of using
`lj_err_throw()`, according to the availability of external unwinder.

The behaviour for the GC64 mode is still wrong due to a miscalculation
of the slot of the GC thread object. This will be fixed in the next
commit.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8516

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Jason Carr.

(cherry picked from commit dd0f09f)

This commit is the follow up for the previous commit ("Fix lua_yield()
from C hook."). In GC64 mode stack slot for a GC thread object is still
miscalculated during the creation of a continuation frame for
`lua_yield()`. This happens due to the tricky usage of the previous slot
instead of the given one in `setframe_gc()` macro.

This patch changes the semantics of `setframe_gc()` macro to use the
slot given as argument as the destination to store GC value. Also, it
fixups all usages of this macro to match the new semantics, i.e. in:
* <src/lj_ccallback.c>
* <src/lj_err.c>
* <src/lj_meta.c>

Sergey Kaplun:
* added the description for the problem

Part of tarantool/tarantool#8516

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
(cherry-picked from commit 646148e)

Before the patch `mmap_probe` only checked if the allocated chunk
start was within the 2^LJ_ALLOC_MBITS bytes region. However, if the
chunk is big enough, its end can reach outside of that region. This
patch adds the corresponding check, to avoid such situations.

Maxim Kokryashkin:
* added the description and the test for the problem

Part of tarantool/tarantool#8516

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
(cherry picked from commit d4c0c6e)

During loop unrolling IR instructions are copied, substituted and
re-emitted for the FOLD/CSE/etc. pipeline. Assume, we have the following
loop-carried dependency: we create a new table or load the stored one
depending on the iteration number. If we copy the emitted ALOAD IR
instruction (from the stored table), we copy it as is, including load
type. But the ALOAD from the new table created (on the previous
iteration) should have nil type, so the assertion in `fwd_ahload()` is
failed.

This patch replaces the assertion with `return 0` to avoid the assertion
failure and stop forwarding in such situations.

But the emitted type-guarded ALOAD will lead to the persistent failure
of the assertion guard on the trace. Also, TDUP IR is affected. See also
the issue mentioned in the test.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8516

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Ryan Lucia.

(cherry-picked from commit 2801500)

Use-def analysis for BC_VARG has too strong limit for the top/maxslot,
so no slots may be considered as used. This leads to an additional
SLOAD on the trace with an incorrect value used later. This patch
disables the use-def analysis for BC_VARG as NYI.

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#8718
Part of tarantool/tarantool#8516

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Reported by Shmuel Zeigerman.

(cherry-picked from commit 0e53a31)

Use-def analysis for BC_FUNCV may consider slots greater than the amount
of non-vararg parameters as dead slots due to the early return (see case
`BCMlit`) for BC_RET or BC_JMP emitted before usage of BC_VARG. This
patch restricts the maxslot to be analyzed in such case with the amount
of parameters for the prototype of the current function being recorded.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8516
Relates to tarantool/tarantool#8718

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
This commit is the follow-up for
5570eb8 ("test: add utility for parsing
`jit.dump`")

This test is flaky on GC64 build due to some unpredictable JIT
compilation hotcount and trace recording.

This patch adds `jit.off()`/`jit.on()` calls to be sure that the JIT is
enabled only for the particular part of the test, where we collect the
trace dump.

Reviewed-by: Igor Munkin <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
As a result of branch renaming in LuaJIT repository, the existing job
concurrency group definition rule has become outdated. This patch
updates the condition according to the new branch naming policy.

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Contributed by XmiliaH.

(cherry-picked from commit 93a65d3)

Patch fixes a problem when LuaJIT generates a wrong bytecode with a
missed BC_UCLO instruction. When some of BC_RET bytecode instructions are
not fixup-ed, due to an early return, if UCLO is obtained before, those
leads to VM inconsistency after return from the function.

Patch makes the following changes in bytecode (thats it, emits extra
BC_UCLO instruction that closes upvalues):

@@ -11,11 +11,12 @@
 0006 => LOOP     1 => 0012
 0007    ISF          0
 0008    JMP      1 => 0010
-0009    RET1     0   2
+0009    UCLO     0 => 0014
 0010 => FNEW     0   0      ; uclo.lua:56
 0011    JMP      1 => 0006
 0012 => UCLO     0 => 0001
 0013 => RET0     0   1
+0014 => RET1     0   2

NOTE: After emitting the bytecode instruction BC_FNEW fixup is not
required, because FuncState will set a flag PROTO_CHILD that will
trigger emitting a pair of instructions BC_UCLO and BC_RET (see
<src/lj_parse.c:2355>) and BC_RET will close all upvalues from a base
equal to 0.

JIT compilation of missing_uclo() function without a patch with fix is failed:
src/lj_record.c:135: rec_check_slots: Assertion `((((((tr))>>24) & IRT_TYPE) - (TRef)(IRT_NUM) <= (TRef)(IRT_INT-IRT_NUM)))' failed.
(Thanks to Sergey Kaplun for discovering this!)
Thus second testcase in a test covers a case with compilation as well.

Sergey Bronnikov:
* added the description and the test for the problem

Part of tarantool/tarantool#8825

Helped-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Bronnikov <[email protected]>
Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
ligurio and others added 5 commits April 1, 2024 21:17
The patch replaces the main test runner `prove(1)` with CTest, see [1]
and [2].

Build and test steps look the same as before:

$ cmake -S . -B build
$ cmake --build build --parallel
$ cmake --build build --target [LuaJIT-test, test]

CMake targets lua-Harness-tests, LuaJIT-tests, PUC-Rio-Lua-5.1-tests,
tarantool-tests and tarantool-c-tests are still operating.

The CMake custom target `test` was replaced by a target with the same
name that is automatically generated by CMake. It is not possible to
attach targets to this automatically generated `test` target.
It means that target `test` is now executing `LuaJIT-test`, but not
`LuaJIT-lint`. To mitigate this, a new target `LuaJIT-check-all` is
introduced. It has the same behaviour as an old `test` target: it runs
all functional tests and linters.

One could use `ctest` tool:

$ ctest --print-labels
Test project <snipped>
All Labels:
  LuaJIT-tests
  PUC-Rio-Lua-5.1-tests
  lua-Harness-tests
  tarantool-c-tests
  tarantool-tests
$ ctest                          # Run all tests.
$ ctest -L tarantool-tests       # Run tests by label.
$ ctest -L tarantool-c-tests     # Ditto.
$ ctest -L lua-Harness-tests     # Ditto.
$ ctest -L LuaJIT-tests          # Ditto.
$ ctest -L PUC-Rio-Lua-5.1-tests # Ditto.
$ ctest --rerun-fail             # Run only the tests
                                 # that failed previously.
$ ctest --verbose                # Print details to stdout.
$ ctest --output-on-failure      # Print details to stdout
                                 # on test failure only.

The benefits are:

- less external dependencies, `prove(1)` is gone, `ctest` is a part of
  CMake
- running tests by test title
- extended test running options in comparison to prove(1)
- unified test output (finally!)

Note that the testsuites in `test/LuaJIT-tests` and
`test/PUC-Rio-Lua-5.1` directories have their own test runners written
in Lua, and currently CTest runs these testsuites as single tests. In
the future, we could change these test runners to specify tests from
outside, and after this, we could run tests separately by CTest in these
test suites.

Note that it is not possible to add dependencies to `add_test()` in
CMake, see [3]. CMake 3.7 introduces FIXTURES_REQUIRED [4] and
FIXTURES_SETUP [5], but these test properties cannot be used - the
currently supported CMake version is lower. This is workarounded by the
introduced macro `add_test_suite_target` that adds a CMake-test for each
testsuite that executes a target that builds test dependencies.

The commit introduces the CMake option LUAJIT_TEST_DEPS that sets
dependencies to be used for running and building LuaJIT tests.

1. https://cmake.org/cmake/help/latest/manual/ctest.1.html
2. https://cmake.org/cmake/help/book/mastering-cmake/chapter/Testing%20With%20CMake%20and%20CTest.html
3. https://gitlab.kitware.com/cmake/cmake/-/issues/8774
4. https://cmake.org/cmake/help/latest/prop_test/FIXTURES_REQUIRED.html
5. https://cmake.org/cmake/help/latest/prop_test/FIXTURES_SETUP.html

Reviewed-by: Igor Munkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Sergey Kaplun.

(cherry picked from commit cae3611)

Assume we have the root trace that uses some spill slots with the
corresponding stack adjustment. Then its side trace will restore the
stack only at its tail. It may look like the following:

| ---- TRACE 4 mcode 1247
| 55557f7df953  mov rax, [r14-0xe28]
| 55557f7df95a  mov rax, [rax+0x30]
| 55557f7df95e  sub rax, rdx
| 55557f7df961  cmp rax, +0x68
| 55557f7df965  jb 0x55557f7d004c       ->0
| 55557f7df96b  add rsp, -0x10
| ...
| 55557f6efa71  cmp dword [rdx+0x4], -0x05
| 55557f6efa75  jnz 0x55557f6e004c      ->0
| ...
| 55557f7dfe29  add rsp, +0x10
| 55557f7dfe2d  jmp 0x5555556fe573
| ---- TRACE 4 stop -> stitch
|
| ---- TRACE 5 start 4/0
| ---- TRACE 5 mcode 101
| 55557f6ef9d4  mov dword [0x40000518], 0x5
| ...
| 55557f6efa30  add rsp, +0x10
| 55557f6efa34  jmp 0x55557f6ef9d4
| ---- TRACE 5 stop -> down-recursion

Such side traces have no stack adjustment at their heads since their
stack adjustment is inherited from the parent trace. The issue occurs if
the side trace has a down-recursion, as mentioned above. Before any
exit, we can jump back to the start of the trace several times with
growing `rsp`. In that case, the `rsp` is restored incorrectly after
exiting from the trace.

This patch forbids down-recursion for non-root traces.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9595

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
…e().

Thanks to Sergey Kaplun.

(cherry picked from commit 88ed9fd)

When we use the aforementioned functions to set a metatable for types
with one shared metatable, we must flush all traces since they are
specialized to base metatables. If we have enabled vmevent handlers,
they invoke a callback on trace flushing. This callback may reallocate
the Lua stack. Thus invalidates the reference to the `TValue *` object
`o` by the given index in the `lua_setmetatable()` and leads to a
heap-use-after-free error.

This patch fixes the behaviour by recalculating the address by the given
index after possible stack reallocation.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9595

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Prior to this patch, there was no error-checking in profilers,
which resulted in raw Lua errors being reported in cases of
non-existing paths or corrupt file structures. This patch
makes use of the event reader module introduced in commit
df5a072 ("profilers: introduce
event reader module") to add error handling, so all parsing
errors are now reported in a user-friendly manner.

Tool CLI flag tests are adapted correspondingly to error message
changes.

Resolves tarantool/tarantool#9217
Part of tarantool/tarantool#5994

Reviewed-by: Sergey Kaplun <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Reported by Yichun Zhang.

(cherry picked from commit b279117)

This patch is a follow-up to the commit
5f0a43a ("bugfix: fixed assertion
failure "lj_record.c:92: rec_check_slots: Assertion `nslots <= 250'
failed" found by stressing our edgelang compiler."), which is identical
to the commit e0388e6 ("Fix stack check
when recording BC_VARG.)" from the upstream. The error is raised too
late, when buffer overflow of `J->slot` has already occurred and data in
the `jit_State` structure is corrupted.

This patch moves the corresponding check before using the `J->slot`
buffer. The `J->maxslot` may overflow the buffer only in cases where the
amount of the vararg results is unknown. The check is used only in this
case since the trace recording for the undefined-on-trace varargs is not
yet implemented for an unknown amount of varargs.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Copy link

coderabbitai bot commented Apr 24, 2024

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

ligurio and others added 24 commits June 15, 2024 08:53
The patch defines `_TARANTOOL` as a global in the luacheck configuration
file and removes inline suppressions in test files.

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Reported by Eric Gouyer.

(cherry picked from commit 36b2962)

According to C++ Standard (5.3.6.3) [1], the `alignof()` for the
reference should be the same as for the referenced type. This patch
fixes the behaviour by following the reference to get a child id for
`ffi.alignof()`.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

[1]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4594.pdf#subsection.5.3.6

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
(cherry picked from commit 899093a)

According to C++ Standard, the `alignof()` (5.3.6.3) [1] and `sizeof()`
(5.3.3.2) [2] for the reference should be the same as for the referenced
type. This patch fixes the behaviour by following the reference to get a
child id for `alignof()` and `sizeof()` while parsing C definitions via
`ffi.cdef()`.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

[1]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4594.pdf#subsection.5.3.6
[2]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4594.pdf#subsection.5.3.3

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Reported by Eric Gouyer.

(cherry picked from commit 27a6fee)

Before the patch, `ffi.metatype()` raises an error when calling it on
ctype with attributes. This patch fixes the behaviour by using
`ctype_raw()` instead of `ctype_get()` to follow child ctypes unless
there are no attributes on the ctype.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Reported by 999pingGG.

(cherry picked from commit 4c35a42)

This patch is a follow-up to the previous commit. Since the child id is
used for the index in the lookup table, it has no effect, and metatype
methods are not overloaded. This patch fixes the behaviour by using the
correct index.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Peter Cawley.

(cherry picked from commit f72c19e)

Instructions with strong guards that are sometimes emitted with a guard
and sometimes emitted without a guard (like HREFK, CONV, or SLOAD) may
be eliminated from the IR chain and replaced with the NOP IR. If the
next IR of the same kind on the trace is not eliminated, it may
reference the IR NOP instead of an instruction of the same type. This
may lead to the corresponding assertion failure in the `rec_check_ir()`.

This patch unconditionally links the IRs during chain maintenance in
DCE.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
This option enables table bump optimization if sink optimization is
enabled. The table bump optimization patches the bytecodes with a table
allocation on the trace recording if the recorded trace exceeds the size
of the allocated table. This optimization still has some bugs, so it is
disabled by default. For more details, see the comment in
<CMakeLists.txt>.

Needed for tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
This flavor enables the LUAJIT_ENABLE_TABLE_BUMP option to test table
bump optimization.

Needed for tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Now information about the abort of the trace is saved in the
`abort_reason` field of the corresponding structure. The
`jit.parse.finish()` returns now the second table containing aborted
traces. Each table key is a trace number containing an array of
potentially traces with this number, which was aborted.

Needed for tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Sergey Kaplun and Peter Cawley.

(cherry picked from commit d06beb0)

This commit is a follow-up for the commit
1b82160 ("Throw any errors before stack
changes in trace stitching."). The patch prepends failures for the
specific error to be thrown. Nevertheless, the error may be thrown due
to retrying trace recording in the case when table bump optimization
is enabled or when OOM is observed during reallocation of the snapshot
or IR buffers.

This patch adds the corresponding protected frame and rethrows the error
after a fixup of the stack.

This patch also tests the correctness of copying the error message to
the top of the stack to get a valid "abort" reason in the `jit.dump`
utility.

Also, this patch fixes a non-ASCII space character in the comment for
<lj-720-errors-before-stitch.test.lua>.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Sergey Kaplun.

(cherry picked from commit b8b49bf)

The previous commit doesn't handle the case when the error code is
`LUA_ERRMEM`. This patch adds a workaround by using the generic error
message.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Peter Cawley.

(cherry picked from commit 6585305)

The `lj_opt_fwd_wasnonnil()` skips the check for HREF and HREFK that may
alias. Hence, the guard for the non-nil value may be skipped, and the
`__newindex` metamethod call is omitted too.

This patch adds the aforementioned check for different reference types
(HREF vs. HREFK), which were not detected by the previous analysis.
Also, the helper macro `irt_isp32()` is introduced to check that the IR
type is `IRT_P32` (KSLOT type).

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
The commit 049e296 ("test: run LuaJIT
tests via CMake") introduced CMake build infrastructure and added
`cmake_minimum_required` to each CMakeLists.txt in the project. However,
it is not required: `cmake_minimum_required` specified in a root
CMakeLists.txt is more than enough. The patch leaves
`cmake_minimum_required` in a root CMakeLists.txt and removes it in
other CMakeLists.txt below in a source tree.

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Since CMake 3.27 compatibility with versions of CMake older than
3.5 is deprecated [1]. CMake produces an annoying warning on the
configuration stage:

| CMake Deprecation Warning at src/CMakeLists.txt:7 (cmake_minimum_required):
|  Compatibility with CMake < 3.5 will be removed from a future version of
|  CMake.

We cannot bump a minimum required CMake version without bumping it in
the Tarantool build system. However, we can set a <policy_max>
(introduced in CMake 3.12, see [1]) and suppress a warning. <policy_max>
means that this CMake version is known to work properly and will not
result in any build errors right away for higher versions.

Note that the current CMake minimum required version in Tarantool is
equal to 2.8, but <policy_max> is introduced in CMake 3.12 [1].
However, according to [1] it is not a problem because if CMake is "older
than 3.12, the extra ... dots will be seen as version component
separators, resulting in the ...<max> part being ignored and preserving
the pre-3.12 behavior of basing policies on <min>".

<policy_max> is set to 3.18 because compatibility with versions of CMake
older than 2.8.12 is deprecated. Calls to
cmake_minimum_required(VERSION) that do not specify at least 2.8.12 as
their policy version (optionally via ...<max>) will produce a
deprecation warning in CMake 3.19 and above [2]. Compatibility with
2.8.12 is needed for CMP0002 [3], see commit 049e296 ("test: run
LuaJIT tests via CMake").

1. https://cmake.org/cmake/help/latest/command/cmake_minimum_required.html
2. https://cmake.org/cmake/help/latest/command/cmake_minimum_required.html#policy-settings
3. https://cmake.org/cmake/help/latest/policy/CMP0002.html

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
This commit adds a workflow for building and testing with AVX512
enabled.

Needed for tarantool/tarantool#9924
Relates to tarantool/tarantool#6787

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
The test `lj-366-strtab-correct-size.test.lua` has a test helper
`read_file` that reads a file's content and returns it.
This helper will be useful for a test upcoming in the next commit,
so it is moved to test tools.

Needed for tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Carlo Cabrera.

(cherry picked from commit 3065c91)

The Mach-O FAT object constructed by LuaJIT had an incorrect format.
The problem is reproduced when the target hardware platform has
AVX512F and LuaJIT is compiled with enabled AVX512F instructions.

The problem arises because LuaJIT FFI code for Mach-O file
generation in `bcsave.lua` relies on undefined behavior for
conversions to `uint32_t`. AVX512F has the `vcvttsd2usi`
instruction, which converts `double`/`float` to `uint32_t/uint64_t`.
Earlier architectures (SSE2, AVX2) are sorely lacking such
an instruction, as they only support signed conversions. Unsigned
conversions are done with a signed convert and range shifting -
the exact algorithm depends on the compiler. A side effect of
these workarounds is that negative `double`/`float` often
inadvertently convert 'as expected', even though this is invoking
undefined behavior. Whereas `vcvttsd2usi` always returns
0x80000000 or 0x8000000000000000 for out-of-range inputs.

The patch fixes the problem, however, the real issue remains unfixed.

Sergey Bronnikov:
* added the description, the test for the problem and
  flavor with AVX512 to `exotic-builds-testing` workflow

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Carlo Cabrera.

(cherry picked from commit b98b372)

Mach-O FAT object files generated by LuaJIT for ARM64 had
an incorrect format due to the usage of the 32-bit version of
the FFI structure. This patch adds the 64-bit structure definition
and uses it for ARM64.

Sergey Bronnikov:
* added the description and the test for the problem

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Kaplun <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
This patch adds Undefined Behaviour Sanitizer [1] support. It enables
all checks except several that are not useful for LuaJIT. Also, it
instruments all known issues to be fixed in future patches (except
`kfold_intop()` since cdata arithmetic relies on integer overflow).

[1]: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html

Resolves tarantool/tarantool#8473

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Also, this patch sorts the corresponding flags in the CI workflow
alphabetically for better readability.

Relates to tarantool/tarantool#8473

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
This patch is a follow-up to the
6e3aad8 ("OSX/iOS/ARM64: Fix generation
of Mach-O object files."). During the backporting of this patch, the
<build> directory was added to the git tree. This patch removes it and
adds the <build> directory to the .gitignore to avoid such mistakes in
the future.

Reviewed-by: Maxim Kokryashkin <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Thanks to Sergey Kaplun.

(cherry picked from commit 4a22050)

When saving FPR registers during while from a trace and restoring data
from a snapshot, UB sanitizer produces the following warning:
| lj_snap.c:804:32: runtime error: index 23 out of bounds for type 'intptr_t [16]'

due to indexing `ex->gpr` with a fpr register, whose number is >=
`RID_MAX_GPR`. The situation itself is harmless since this is read from
`spill[256]` array and is rewritten in the next if branch.

This patch fixes the out-of-bounds access to read from `ex->gpr` only
conditionally. Also, it removes the corresponding UBSAN suppression.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924
Relates to tarantool/tarantool#8473

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Reported by minoki.
Recent C compilers 'take advantage' of the undefined behavior.
This completely changes the meaning of expressions like (k == -k).

(cherry picked from commit 8a5e398)

This patch changes all possibly dangerous -x operations on integers to
the corresponding two's complement. Also, it removes all related UBSAN
suppressions, since they are fixed.

Also, this patch limits the `bit.tohex()` result by 254 characters.

There is no testcase for `strscan_oct()`, `strscan_dec()` or/and
`STRSCAN_U32` format since first the unary minus is parsed first and
only after the number itself is parsed during parsing C syntax. So the
error is raised in `cp_expr_prefix()` instead. For parsing the exponent
header, there is no testcase, since the power is limited by
`STRSCAN_MAXEXP`.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9924
Relates to tarantool/tarantool#8473

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Suggested by Sergey Kaplun.

(cherry picked from commit d2fe2a6)

This patch replaces the numeric value of NYI bytecodes that can't be
compiled with their names in the `jit.dump()` and -jv outputs. Since the
functionality is the same, only `jit.dump()` is tested as most popular.

Sergey Kaplun:
* added the description and the test for the feature

Part of tarantool/tarantool#9924

Reviewed-by: Maxim Kokryashkin <[email protected]>
Reviewed-by: Sergey Bronnikov <[email protected]>
Signed-off-by: Sergey Kaplun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants