Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verilator simulation (RocketConfig default example) crashes on MacOS #1886

Open
3 tasks done
Frank-Zeyda opened this issue May 28, 2024 · 4 comments
Open
3 tasks done
Labels

Comments

@Frank-Zeyda
Copy link

Frank-Zeyda commented May 28, 2024

Background Work

Chipyard Version and Hash

Main/HEAD
Commit: 3a6677b

OS Setup

MacBook Air running Sonoma 14.5 (Apple M2).

Output of clang --version is:

Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: arm64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

(I am using clang for compilation, not any version of gcc.)

All software is up-to-date as of 27 May 2024. The RISC-V toolchain and tools have been installed from Homebrew:

==> Formulae
riscv-software-src/riscv/riscv-gnu-toolchain ✔
riscv-software-src/riscv/riscv-isa-sim ✔
riscv-software-src/riscv/riscv-openocd ✔
riscv-software-src/riscv/riscv-pk ✔
riscv-software-src/riscv/riscv-tools ✔

I am using Verilator 5.024 2024-04-05 rev UNKNOWN.REV albeit with a slight hack to get around Issue 5031 - yes, it is currently troublesome to get verilator working under the latest Sonoma and Clang updates on MacOS 😒.

Other Setup

Step 0: Install RISC-V tools via Homebrew (see OS Setup section). Also, install gnu-sed, jq and help2man. Possible other packages that escape me now ...

Step 1: Setting up environment variables. I am using my own handwritten script here due to installing RISC-V tools and toolchain from Homebrew and using a custom-compiled verilator.

# Avoids issues due to different options for sed on MacOS.
# Requires gnu-sed to be installed via homebrew.
export PATH="/opt/homebrew/opt/gnu-sed/libexec/gnubin:$PATH"

# Use RISC-V toolchain installed via homebrew.
# Requires riscv-gnu-toolchain to be installed via homebrew.
export RISCV="/opt/homebrew/opt/riscv-gnu-toolchain/bin"

# Sets additional include paths
export CPPFLAGS="-I/opt/homebrew/opt/riscv-isa-sim/include"

# Need to link verilator simulation executable.
export LDFLAGS="-L/opt/homebrew/opt/riscv-isa-sim/lib"

Step 2: Patch common-sim-flags.mk by removing -std=c++17 from SIM_CXXFLAGS. Long story short, we need to compile verilator code with -std=c++20 in order to circumvent Issue 5031 . My patched verilator does that, but this is for a different issue.

Step 3: Enter sims/verilator and run make.

cd sims/verilators
make

Step 4: Run the verilator simulation executable. I.e.,

./simulator-chipyard.harness-RocketConfig rv64ui-p-simple`

assuming that rv64ui-p-simple (from riscv-tests is located in the current directory.

Current Behavior

On my system, this results in an immediate Segmentation Fault.

./simulator-chipyard.harness-RocketConfig $RISCV_TESTS/rv64ui-p-simple
[UART] UART0 is here (stdin/stdout).
zsh: segmentation fault  ./simulator-chipyard.harness-RocketConfig $RISCV_TESTS/rv64ui-p-simple

Attached is a relevant extract from the MacOS Diagnostic report.

image

Expected Behavior

Perform simulation and do not crash.

Other Information

I have already debugged and somewhat fixed the issue. The culprit, as can be seen from the stack trace, is

generators/src/main/resources/testchipip/csrc/SimTSI.cc

The code:

  *in_valid = tsi->in_valid();
  *in_bits = tsi->in_bits();
  *out_ready = tsi->out_ready();

(see https://github.com/ucb-bar/testchipip/blob/104df6a81fd989cd4cad69b699894664fcf93c05/src/main/resources/testchipip/csrc/SimTSI.cc#L59-L61) is unsafe. tsi->in_bits(); implicitly calls the std::queue::front() method/function whose behavior is undefined if the queue is empty. Hence, replace the above, for instance, by

  *in_valid = tsi->in_valid();
  *in_bits = *in_valid ? tsi->in_bits() : 0;
  *out_ready = tsi->out_ready();

to avoid the crash. (The above code is not thread-safe though, perhaps it does not need to be ...)

Note: I have not tried if this is an issue under Ubuntu/Linux. My suspicion is that there, the program would just continue with some arbitrary value for *in_bits. Semantically, anything can happen once we enter undefined behavior, so the above fix is advised either way!

PS: I apologize for just realizing that this bug report may have been better submitted in the testchipip repository.

@jerryz123
Copy link
Contributor

Thanks for debugging this! I think your diagnosis is correct. Can you open a PR with the fix in testchipip?

@Frank-Zeyda
Copy link
Author

Frank-Zeyda commented May 28, 2024

Thanks Jerry for taking care of this.

There are potentially two ways to fix the issue:

  *in_valid = tsi->in_valid();
  *in_bits = *in_valid ? tsi->in_bits() : 0;
  *out_ready = tsi->out_ready();

(as noted above) or else

  *in_valid = tsi->in_valid();
  if (*in_valid) {
    *in_bits = tsi->in_bits();
  }
  *out_ready = tsi->out_ready();

A key assumption for both fixes is that the underlying queue in tsi does not change while the three statements execute. Otherwise, we may end up with a race condition and need to wrap the code into a mutex.

Which of the two fixes is more appropriate depends on the assumed specification of the int tsi_tick(...) function in SimTSI.cc. I'd gladly leave those details to you, and thanks once again or the quick reply! 👍

@jerryz123
Copy link
Contributor

I previously assumed that all the tick functions would be called from the same thread, which is true for all simulation environments EXCEPT verilator with multithreading, as #1885 is uncovering.

However, I believe inheritors of htif_t (including tsi_t) are generally not expected to be thread-safe, so the first fix should be ok.

@a0u
Copy link
Member

a0u commented Jul 29, 2024

I have not tried if this is an issue under Ubuntu/Linux. My suspicion is that there, the program would just continue with some arbitrary value for *in_bits.

The same issue triggers a failed assertion on Gentoo Linux (glibc 2.40, verilator 5.024).

/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14/bits/stl_deque.h:1446: std::deque<_Tp, _Alloc>::reference std::deque<_Tp, _Alloc>::front() [with _Tp = unsigned int; _Alloc = std::allocator<unsigned int>; reference = unsigned int&]: Assertion '!this->empty()' failed.

Thread 1 "simulator-chipy" received signal SIGABRT, Aborted.
0x00007ffff70bd8cc in ?? () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff70bd8cc in ?? () from /lib64/libc.so.6
#1  0x00007ffff70685a6 in raise () from /lib64/libc.so.6
#2  0x00007ffff70508fa in abort () from /lib64/libc.so.6
#3  0x00007ffff72dac1f in std::__glibcxx_assert_fail(char const*, int, char const*, char const*) () from /usr/lib/gcc/x86_64-pc-linux-gnu/14/libstdc++.so.6
#4  0x00005555556229c5 in tsi_tick ()
#5  0x0000555555b13629 in VTestDriver___024unit____Vdpiimwrap_tsi_tick_TOP____024unit(unsigned int, unsigned char, unsigned char&, unsigned int, unsigned char&, unsigned char, unsigned int&, unsigned int&) ()
#6  0x0000555555a1ff72 in VTestDriver___024root___nba_sequent__TOP__4680(VTestDriver___024root*) ()
#7  0x0000555555672985 in VTestDriver___024root___eval_nba__11(VTestDriver___024root*) ()
#8  0x000055555565f6b1 in VTestDriver___024root___eval_nba(VTestDriver___024root*) ()
#9  0x0000555555672f12 in VTestDriver___024root___eval(VTestDriver___024root*) ()
#10 0x000055555565f50a in VTestDriver::eval_step() ()
#11 0x000055555561fcb8 in main ()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants