Properly support the Intel compiler with the CMake build #68

Open · wants to merge 6 commits into master
Conversation

@mortenpi (Member) commented Jun 2, 2021

(This only affects the CMake build)

Currently, we always pass -fno-automatic as a compiler flag, even if the user adds their own flags (by setting CMAKE_Fortran_FLAGS). This is a problem for e.g. ifort, which uses a different name for that flag.

With this change, if the user decides to customize the flags by passing their own CMAKE_Fortran_FLAGS, we no longer set -fno-automatic automatically, which solves that problem. The only thing to note is that the user then needs to pass -fno-automatic explicitly.

Question to anyone who might know this: do we actually need -fno-automatic for GRASP? It changes the way SAVE attributes are handled, but is there any part of GRASP that actually requires this flag?

Fix #68
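
A minimal sketch of the initial idea (later superseded by the per-compiler approach discussed below), assuming the flag was previously appended unconditionally in the top-level CMakeLists.txt; the exact variable handling in the real file may differ:

    # Hypothetical sketch: only fall back to -fno-automatic when the user
    # has not supplied their own CMAKE_Fortran_FLAGS on the command line.
    if(NOT CMAKE_Fortran_FLAGS)
        set(CMAKE_Fortran_FLAGS "-fno-automatic")
    endif()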

@mortenpi requested a review from jongrumer (June 2, 2021 05:28)
@mortenpi marked this pull request as draft (June 2, 2021 05:41)
@jongrumer (Member)
I'm pretty sure that flag is (or at least was) needed, but I don't remember off the top of my head why.

@cffischer (Member)
The -fno-automatic flag was needed because early FORTRAN codes always saved values when a routine was exited, whereas F90 does not. I suspect the need is reduced, but I am not sure it has been tested. Flags always depend on the compiler.

@mortenpi (Member, Author) commented Jun 6, 2021

Alright, new approach (since checking whether the user has modified CMAKE_Fortran_FLAGS wasn't reliable):

  • We still automatically append -fno-automatic to CMAKE_Fortran_FLAGS if we detect that it's gfortran.
  • With ifort we append -save instead.
  • Other compilers will print a warning and won't append anything automatically.
  • You can disable the automatic append completely by passing -DGRASP_DEFAULT_FLAGS=FALSE to cmake.

@jongrumer could you check that this does the right thing in a live ifort environment?
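
For reference, a rough sketch of what this logic might look like in CMakeLists.txt; the option name GRASP_DEFAULT_FLAGS is taken from the description above, while the exact warning text and placement are illustrative assumptions:

    option(GRASP_DEFAULT_FLAGS "Append compiler-specific default Fortran flags" TRUE)
    if(GRASP_DEFAULT_FLAGS)
        if(CMAKE_Fortran_COMPILER_ID STREQUAL "GNU")
            # gfortran: keep the old implicit-SAVE behaviour
            set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} -fno-automatic")
        elseif(CMAKE_Fortran_COMPILER_ID STREQUAL "Intel")
            # ifort: -save is the equivalent of gfortran's -fno-automatic
            set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} -save")
        else()
            message(WARNING "Unknown Fortran compiler: no default flags appended")
        endif()
    endif()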

@mortenpi marked this pull request as ready for review (June 6, 2021 02:36)
@jongrumer (Member) left a review comment

Seems fine to me!

@jongrumer (Member) commented Jun 8, 2021

Ok, I know this should be in a separate PR, but to speed things up a bit I added the mkdir fix we found in mpi90/sys_mkdir and also included the -mkl flag in the default ifort flags in CMakeLists.txt to turn on MKL. With the new freely available ifort, which now also includes MPI and MKL (!), this is of course the way to do it if one is using ifort. Just make sure you install both the Base kit and the HPC kit (the former contains MKL, the latter the compiler and MPI). Just remove these two commits if you (@mortenpi) think this is completely out of line. It will be interesting to see if there are any speedups when running with just Intel all the way. A quick test is given further below.
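
As a minimal sketch of the -mkl addition (assuming the compiler-specific default-flag logic from the earlier comment; the exact placement in CMakeLists.txt may differ):

    if(CMAKE_Fortran_COMPILER_ID STREQUAL "Intel")
        # -save mirrors gfortran's -fno-automatic; -mkl links against Intel MKL
        set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} -save -mkl")
    endif()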

Intel Ifort + MPI/MKL (HPC) instructions for Linux: https://software.intel.com/content/www/us/en/develop/documentation/installation-guide-for-intel-oneapi-toolkits-linux/top/installation/install-using-package-managers/apt.html#apt_PACKAGES

On Mac and Windows you have to download the installers (note that the Mac version does not seem to ship with MPI).

Compiling with CMake, using the new Intel ifort oneAPI kits (Base + HPC) and including the -mkl flag above via the addition to CMakeLists.txt, I get the following linked libraries for e.g. rmcdhf_mpi. Seems sort of fine, but I'm not entirely sure why e.g. libgfortran.so.4 and openblas are still in there; this needs further investigation.

ldd rmcdhf_mpi
	linux-vdso.so.1 (0x00007ffd2f969000)
	libmkl_intel_lp64.so.1 => /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_lp64.so.1 (0x0000151c141f3000)
	libmkl_intel_thread.so.1 => /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_thread.so.1 (0x0000151c108fe000)
	libmkl_core.so.1 => /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_core.so.1 (0x0000151c07363000)
	libiomp5.so => /opt/intel/oneapi/compiler/2021.2.0/linux/compiler/lib/intel64_lin/libiomp5.so (0x0000151c06f4c000)
	libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x0000151c06cf1000)
	liblapack.so.3 => /usr/lib/x86_64-linux-gnu/liblapack.so.3 (0x0000151c0646b000)
	libmpifort.so.12 => /opt/intel/oneapi/mpi/2021.2.0//lib/libmpifort.so.12 (0x0000151c060ad000)
	libmpi.so.12 => /opt/intel/oneapi/mpi/2021.2.0//lib/release/libmpi.so.12 (0x0000151c04de7000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000151c04be3000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x0000151c049db000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000151c047bc000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000151c0441e000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000151c0402d000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000151c03e15000)
	/lib64/ld-linux-x86-64.so.2 (0x0000151c14f58000)
	libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0 (0x0000151c01b6f000)
	libgfortran.so.4 => /usr/lib/x86_64-linux-gnu/libgfortran.so.4 (0x0000151c01790000)
	libfabric.so.1 => /opt/intel/oneapi/mpi/2021.2.0//libfabric/lib/libfabric.so.1 (0x0000151c0154a000)
	libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x0000151c01303000)

And this is what it looks like for a gfortran/OpenMPI build (no surprises, GNU all the way):

ldd rmcdhf_mpi
	linux-vdso.so.1 (0x00007ffe37bd4000)
	libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x0000149863537000)
	liblapack.so.3 => /usr/lib/x86_64-linux-gnu/liblapack.so.3 (0x0000149862cb1000)
	libmpi_mpifh.so.20 => /usr/lib/x86_64-linux-gnu/libmpi_mpifh.so.20 (0x0000149862a5a000)
	libgfortran.so.4 => /usr/lib/x86_64-linux-gnu/libgfortran.so.4 (0x000014986267b000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00001498622dd000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00001498620c5000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000149861cd4000)
	libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0 (0x000014985fa2e000)
	libmpi.so.20 => /usr/lib/x86_64-linux-gnu/libmpi.so.20 (0x000014985f73c000)
	libopen-pal.so.20 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.20 (0x000014985f48a000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x000014985f26b000)
	libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x000014985f024000)
	/lib64/ld-linux-x86-64.so.2 (0x0000149863a71000)
	libopen-rte.so.20 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.20 (0x000014985ed9c000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x000014985eb94000)
	libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x000014985e957000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x000014985e753000)
	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x000014985e550000)
	libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x000014985e345000)
	libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x000014985e13b000)

Quick test case -- IN PROGRESS!
A simple RMCDHF_MPI + RCI_MPI + TRANSITIONS_MPI (8 processes) run on O I 2p4, with SDT excitations from 2p4 only, first layer (3s,3p,3d,4f,5g,6h), with a reduced list in the RMCDHF run and the full list in RCI, gives the following timings for the two setups above (timestamps mark when each individual program starts, with the total execution time at the end).

ifort (-O3 -save -mkl) + MPI + MKL and using Intel mpirun
---------------
      LAYER: as1
 NEW SHELLS: 3s,3p,3d,4f,5g,6h
  OPTIMIZED: 3s* 3p* 3d* 4f* 5g* 6h*

 == Tue Jun  8 15:55:32 CEST 2021 == rcsfgenerate
 == Tue Jun  8 15:55:33 CEST 2021 == rangular
 == Tue Jun  8 15:55:33 CEST 2021 == rwfnestimate
 == Tue Jun  8 15:55:33 CEST 2021 == rmcdhf (Iteration number  11)
 == Tue Jun  8 15:55:35 CEST 2021 == rci
 == Tue Jun  8 15:55:46 CEST 2021 == jj2lsj
 == Tue Jun  8 15:55:47 CEST 2021 == rtransition
 == Tue Jun  8 15:55:56 CEST 2021 == done
 
Total Execution time - 0 hours 0 min 25 sec
 
gfortran-9 (-O3 -fno-automatic) + OpenMPI and using GNU mpirun
---------------------
      LAYER: as1
 NEW SHELLS: 3s,3p,3d,4f,5g,6h
  OPTIMIZED: 3s* 3p* 3d* 4f* 5g* 6h*

 == Tue Jun  8 15:50:11 CEST 2021 == rcsfgenerate + rcsfinteract
 == Tue Jun  8 15:50:12 CEST 2021 == rangular
 == Tue Jun  8 15:50:12 CEST 2021 == rwfnestimate
 == Tue Jun  8 15:50:12 CEST 2021 == rmcdhf (Iteration number  11)
 == Tue Jun  8 15:50:27 CEST 2021 == rci
 == Tue Jun  8 15:51:10 CEST 2021 == jj2lsj
 == Tue Jun  8 15:51:10 CEST 2021 == rtransition
 == Tue Jun  8 15:51:18 CEST 2021 == done
 
Total Execution time - 0 hours 1 min 8 sec

@mortenpi (Member, Author) commented Jun 9, 2021

Ok, this is actually cool. With a proper Intel ifort+MKL+MPI installation, just doing

FC=ifort BLA_VENDOR=Intel10_64lp_seq ./configure.sh

seems to automatically configure a CMake build that uses MKL (via FindBLAS) and also links against the Intel MPI.

I am not quite sure that adding -mkl is the right way to go. If you don't specify BLA_VENDOR=Intel10_64lp_seq, FindBLAS will still try to link against the system OpenBLAS (if available). Maybe the more correct thing would be to set BLA_VENDOR if we detect the Intel compiler?
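
A hedged sketch of that idea (untested; it assumes BLA_VENDOR is set as a CMake variable before find_package(BLAS) is called, and the real CMakeLists.txt may locate BLAS/LAPACK differently):

    # Prefer sequential MKL when building with the Intel compiler, unless the
    # user has already chosen a BLAS vendor explicitly.
    if(CMAKE_Fortran_COMPILER_ID STREQUAL "Intel" AND NOT DEFINED BLA_VENDOR)
        set(BLA_VENDOR "Intel10_64lp_seq")
    endif()
    find_package(BLAS REQUIRED)
    find_package(LAPACK REQUIRED)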

@mortenpi changed the title from "Allow overriding default compiler flag in CMake" to "Properly support the Intel compiler with the CMake build" (Jun 9, 2021)
@jongrumer (Member) commented Jun 9, 2021

Ok, great! I'll try setting BLA_VENDOR then, though it seems unreasonably complicated. But don't we still need to set -mkl to make sure MKL is used for everything else as well, or what are your thoughts there? I just remembered that there might be an mklvars.sh that should be sourced; at least there used to be something like that.

EDIT: Seems like -mkl should be enough, at least if you have properly sourced /opt/intel/oneapi/setvars.sh: https://software.intel.com/content/www/us/en/develop/articles/using-mkl-in-intel-compiler-mkl-qmkl-options.html
