Illegal instruction (core dumped) #55

Open
Davidyao99 opened this issue Aug 10, 2024 · 3 comments

Davidyao99 commented Aug 10, 2024

I am trying to run GLOMAP on a cluster using a Singularity image, and I get the following error when running the glomap mapper command on the sample gerrard-hall dataset.

I0810 06:28:25.486933 14316 relpose_estimation.cc:24] Estimating relative pose for 1317 pairs
 Estimating relative pose: 0%*** Aborted at 1723289305 (unix time) try "date -d @1723289305" if you are using GNU date ***
PC: @                0x0 (unknown)
*** SIGILL (@0x5614bc45ffb0) received by PID 14316 (TID 0x2b2c7bf256c0) from PID 18446744072573288368; stack trace: ***
    @     0x2b2c7a355046 (unknown)
    @     0x2b2c7a82b520 (unknown)
    @     0x5614bc45ffb0 poselib::Camera::focal()
    @     0x5614bc4639ac poselib::estimate_relative_pose()
    @     0x5614bc1ead9c _ZN6glomap21EstimateRelativePosesERNS_9ViewGraphERSt13unordered_mapIjNS_6CameraESt4hashIjESt8equal_toIjESaISt4pairIKjS3_EEERS2_IjNS_5ImageES5_S7_SaIS8_IS9_SE_EEERKNS_29RelativePoseEstimationOptionsE._omp_fn.0
    @     0x2b2c7a47ea16 GOMP_parallel
    @     0x5614bc1ea899 glomap::EstimateRelativePoses()
    @     0x5614bc18b901 glomap::GlobalMapper::Solve()
    @     0x5614bc18250c glomap::RunMapper()
    @     0x5614bc17edc5 main
    @     0x2b2c7a812d90 (unknown)
    @     0x2b2c7a812e40 __libc_start_main
    @     0x5614bc180bb5 _start
Illegal instruction (core dumped)

I created my own Singularity image based on the Docker image here. However, due to some complications, I decided not to build COLMAP from source and instead installed it in a conda environment. Any ideas on how I should resolve this? Some possible causes of the issue are:

  1. I am using a COLMAP that was installed through a conda environment instead of being built from source.

  2. I am building this image on a cluster that does not have a GUI.

This is my singularity build file:

# This could also be another Ubuntu or Debian based distribution
BootStrap:docker
From: nvidia/cuda:11.7.1-base-ubuntu22.04

# Install dependencies
%post
export QT_XCB_GL_INTEGRATION=xcb_egl
export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get install --no-install-recommends -y \
git \
build-essential \
cmake \
ninja-build \
wget \
unzip \
libboost-program-options-dev \
libboost-filesystem-dev \
libboost-graph-dev \
libboost-system-dev \
libeigen3-dev \
libsuitesparse-dev \
libceres-dev \
libflann-dev \
libfreeimage-dev \
libmetis-dev \
libgoogle-glog-dev \
libgtest-dev \
libsqlite3-dev \
libglew-dev \
qtbase5-dev \
libqt5opengl5-dev \
libcgal-dev \
libcgal-qt5-dev \
libgl1-mesa-dri \
libunwind-dev \
xvfb \
clang-format-14 \
python3 \
python3-pip && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

wget https://github.com/Kitware/CMake/releases/download/v3.30.1/cmake-3.30.1-linux-x86_64.sh && \
chmod +x cmake-3.30.1-linux-x86_64.sh && \
./cmake-3.30.1-linux-x86_64.sh --skip-license --prefix=/usr/local

# Set up compiler environment
apt-get update && \
apt-get install -y \
clang-15 \
libomp-15-dev \
gcc-10 \
g++-10 \
nvidia-cuda-toolkit \
nvidia-cuda-toolkit-gcc && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
export CC=/usr/bin/gcc-10
export CXX=/usr/bin/g++-10
export CUDAHOSTCXX=/usr/bin/g++-10

# Build and install GLOMAP
git clone https://github.com/colmap/glomap.git && \
cd glomap && \
git fetch https://github.com/colmap/glomap.git main && \
git checkout FETCH_HEAD && \
mkdir build && \
cd build && \
cmake .. \
   -GNinja \
   -DCMAKE_BUILD_TYPE=Release \
   -DCMAKE_INSTALL_PREFIX=/glomap_installed \
   -DCMAKE_CUDA_ARCHITECTURES=86 \
   -DSuiteSparse_CHOLMOD_LIBRARY="/usr/lib/x86_64-linux-gnu/libcholmod.so" \
   -DSuiteSparse_CHOLMOD_INCLUDE_DIR="/usr/include/suitesparse" \
   -DTESTS_ENABLED=ON \
   -DASAN_ENABLED=false && \
ninja install
cp -r /glomap_installed/* /usr/local/

%environment
export CC=/usr/bin/gcc-10
export CXX=/usr/bin/g++-10
export CUDAHOSTCXX=/usr/bin/g++-10

@ahojnnes
Contributor

Are you compiling on a different machine than where you run the binaries?

Up until a few days ago, PoseLib enabled -march=native flags by default, which will create problems when redistributing binaries. Based on my feedback, this was changed here: PoseLib/PoseLib@d406c08.

We have an open PR in GLOMAP to consume those latest changes. Meanwhile, you can manually update the git commit hash for poselib in the cmake/FindDependencies.cmake file.
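
As a rough way to confirm the -march=native explanation above, one could compare the CPU feature flags of the build machine with those of the cluster node. The sketch below assumes x86 Linux hosts where /proc/cpuinfo exposes a "flags" line; the helper names cpu_flags and missing_flags and the file name build_flags.txt are hypothetical and not part of GLOMAP or PoseLib.

```shell
# Print the CPU feature flags of the current host as one space-separated line.
# Assumption: x86 Linux, where /proc/cpuinfo has a "flags : ..." line per core.
cpu_flags() {
    grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 | tr -s '[:space:]' ' '
}

# Print every flag from the first list (build host) that is absent from the
# second list (run host), one per line. Both arguments are whitespace-separated
# strings. The body runs in a subshell so no variables leak to the caller.
missing_flags() (
    have=" $(printf '%s' "$2" | tr -s '[:space:]' ' ') "
    for flag in $1; do
        case "$have" in
            *" $flag "*) ;;                 # run host also supports this flag
            *) printf '%s\n' "$flag" ;;     # build-only flag: possible SIGILL source
        esac
    done
)

# Example usage (not executed here):
#   on the build host:    cpu_flags > build_flags.txt
#   on the cluster node:  missing_flags "$(cat build_flags.txt)" "$(cpu_flags)"
```

If the second command prints flags such as avx2 or avx512f, a binary built with -march=native on the build host may use instructions that the cluster node cannot execute, which would be consistent with the SIGILL in the trace. An empty result does not rule out other causes.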

@Davidyao99
Author

Thanks for the prompt response. I changed the commit hash in FindDependencies.cmake and rebuilt the image, but the issue persists.

I believe it is compiled on a different machine: I build the image remotely, then import it and run it on the cluster node.
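
One possible way to check whether a rebuilt binary still contains wide SIMD instructions is to scan its disassembly for AVX (ymm) or AVX-512 (zmm) register references. This is only a sketch: the helper name count_wide_simd is hypothetical, it assumes objdump from binutils is available inside the image, and a non-zero count is a hint rather than proof, since some libraries select such code paths at runtime only when the CPU supports them.

```shell
# Count the disassembly lines read from stdin that mention AVX (ymm) or
# AVX-512 (zmm) registers. Works with both AT&T and Intel syntax output.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable under "set -e".
count_wide_simd() {
    grep -c -E '[yz]mm[0-9]' || true
}

# Example usage (not executed here; assumes glomap is on PATH in the image):
#   objdump -d "$(command -v glomap)" | count_wide_simd
```

Comparing the count before and after changing the PoseLib commit hash would show whether the rebuild actually picked up the change.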

@ahojnnes
Contributor

What camera model are you using? I am wondering whether it is caused by PoseLib not yet supporting the camera model.
