
Is it possible to support parallel NetCDF I/O? #122

Open
ali-ramadhan opened this issue Feb 7, 2021 · 14 comments

@ali-ramadhan commented Feb 7, 2021

I don't know much about the subject, but from looking at the PnetCDF description (https://parallel-netcdf.github.io/) it sounds like there are two backend options for parallel I/O: PnetCDF and parallel HDF5?

It sounds like it might be possible to build NetCDF with parallel I/O support.

Out of curiosity, is parallel I/O something that NCDatasets.jl can feasibly support?

X-Ref: CliMA/Oceananigans.jl#590
X-Ref: CliMA/ClimateMachine.jl#2007

@Alexander-Barth (Owner)

Support for parallel NetCDF I/O would indeed be nice.
Do you have a complete, known-working C code example?
I tried the Ubuntu 20.04 package libnetcdf-mpi-dev with nc4_pnc_put.c

But this fails with:

sudo apt-get install libnetcdf-mpi-dev
wget https://cucis.ece.northwestern.edu/projects/PnetCDF/Examples/nc4_pnc_put.c
gcc -o nc4_pnc_put -I/usr/lib/x86_64-linux-gnu/netcdf/mpi/include/ nc4_pnc_put.c -L/usr/lib/x86_64-linux-gnu/netcdf/mpi/ -lnetcdf_mpi -I/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -I/usr/lib/x86_64-linux-gnu/openmpi/include -pthread -L/usr/lib/x86_64-linux-gnu/openmpi/lib -lmpi
mpiexec -n 2 ./nc4_pnc_put testfile.nc
# Error at line=100: NetCDF: Parallel operation on file opened for non-parallel access Aborting ...

It could be that I did something stupid.
I assume that the two packages have the same API, but maybe this is not the case.

I don't know much about this subject either, but if somebody could provide a full C example (as opposed to code fragments), that would help a lot.

@ali-ramadhan (Author)

Thanks for looking into this. I thought I was going to start playing around with parallel I/O sooner, but I'm still working on basic MPI infrastructure...

I can try getting nc4_pnc_put.c to run once I start looking at parallel NetCDF I/O.

@rafaqz (Contributor) commented Jun 18, 2021

Any luck with this, @ali-ramadhan? NetCDF could also be a sink for parallel DiskArrays/Dagger.jl processing, so this would be widely useful.

@kongdd commented Dec 8, 2022

Calling for this feature too.

@johnomotani

👀

@pgf commented Apr 22, 2024

I'm just chiming in to let interested people know that I've been working on this task over the last week or so. I managed to produce a working example on my laptop over the weekend. My plan is to consolidate the code changes first and then write and run a few meaningful tests on real HPC platforms over GPFS and (hopefully) Lustre parallel file systems. If everything goes well, I will update you to discuss how to proceed, opening a PR or whatever.

Just a note about parallel netcdf3 support. As of now, NetCDF_jll only supports parallel netcdf4. Support for parallel netcdf3 is provided through the parallel-netcdf library, which is not enabled in NetCDF_jll and not even available on Yggdrasil. While this is not a problem for this specific development (trying to access a netcdf3 file using parallel I/O will simply throw a "not supported" error), I think it would be useful for the package to also support parallel netcdf3. I have no previous experience with JLL packages and Yggdrasil, but if someone manages to add parallel-netcdf to Yggdrasil and enable support for parallel netcdf3 in NetCDF_jll, I will be happy to test that too.

@rafaqz (Contributor) commented Apr 22, 2024

JLL packages are not so hard to add with the wizard (see https://github.com/JuliaPackaging/BinaryBuilder.jl).

And it is very likely that you are currently the best-situated person to do this, probably the 100-to-1 favorite.

And code that relies on manually installed system binaries will likely not be widely used. The julia ecosystem has moved very strongly towards versioned dependencies managed by Pkg.jl.

So, I encourage you to give it a go :)

But feel free to ping any JLL problems here for help/feedback.

@johnomotani

And code that relies on manually installed system binaries will likely not be widely used. The julia ecosystem has moved very strongly towards versioned dependencies managed by Pkg.jl.

Just to chime in here (as a potential parallel NetCDF user!) - parallel NetCDF (or HDF5) is one case where system binaries are likely to be wanted. On an HPC cluster, we probably have to use the vendor-provided MPI to get the best performance (especially for inter-node communication), so the parallel NetCDF (and HDF5, which I mention as it'll be a dependency of parallel NetCDF for netcdf4 files, I assume) libraries will need to be linked to the system MPI, which the Julia-provided binaries will not be. At least, HPC users will want the option to do that, and I guess they are the main users of parallel NetCDF...

For comparison, see the setup for parallel HDF5.jl, which provides a utility function to link to the system binaries:
https://juliaio.github.io/HDF5.jl/stable/mpi/#using_parallel_HDF5

@rafaqz (Contributor) commented Apr 22, 2024

Yes, you're probably right generally; I was thinking of non-MPI use cases like Dagger.jl.

This is pretty nice syntax in HDF5.jl if we only have the system binaries:

HDF5.API.set_libraries!("/path/to/your/libhdf5.so", "/path/to/your/libhdf5_hl.so")

(but it would be very nice to have a JLL, and "not even available on Yggdrasil" probably means there is no one else to do it)

@Alexander-Barth (Owner) commented Apr 23, 2024

There is some initial work in 70ef683 on the MPI branch.
For now I am prioritizing the current HDF5/NetCDF4 format and the Yggdrasil JLLs.

Using a custom netCDF library, potentially linked against an optimized MPI (and HDF5) library, is possible using Preferences:

https://alexander-barth.github.io/NCDatasets.jl/stable/issues/#Using-a-custom-NetCDF-library
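
For reference, a minimal sketch of such an override, assuming the standard JLLWrappers preference key libnetcdf_path (the library path below is a placeholder for a system build linked against the vendor MPI):

using Preferences, NetCDF_jll

# Point NetCDF_jll at a system libnetcdf (placeholder path). This writes a
# LocalPreferences.toml entry for the active project; restart Julia afterwards.
set_preferences!(NetCDF_jll, "libnetcdf_path" => "/opt/netcdf-mpi/lib/libnetcdf.so")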

@Alexander-Barth (Owner) commented Apr 24, 2024

Windows currently fails with the error below (full logs); Linux and OS X work OK:

NetCDF: Parallel operation on file opened for non-parallel access (NetCDF error code: -114 = NC_ENOPAR)
   nc_create_par(path::String, cmode::UInt16, mpi_comm::MPI.Comm, mpi_info::MPI.Info)
   NCDataset(comm::MPI.Comm, filename::String, mode::String; info::MPI.Info, format::Symbol, share::Bool, diskless::Bool, persist::Bool, maskingvalue::Missing, attrib::Vector{Any})
  in expression starting at D:\a\NCDatasets.jl\NCDatasets.jl\test\test_mpi_script.jl:18
  in expression starting at D:\a\NCDatasets.jl\NCDatasets.jl\test\test_mpi_netcdf.jl:10

Parallel support seems to be missing from the Windows NetCDF_jll:
https://github.com/JuliaBinaryWrappers/NetCDF_jll.jl/releases/download/NetCDF-v400.902.211%2B0/NetCDF-logs.v400.902.211.x86_64-w64-mingw32-mpi+microsoftmpi.tar.gz

checking whether parallel io is enabled in hdf5... no
checking for library containing H5Dread_chunk... none required
checking for library containing H5Pset_fapl_ros3... none required
checking whether HDF5 allows parallel filters... yes
checking whether szlib was used when building HDF5... yes
checking whether HDF5 library is version 1.10.6 or later... yes
configure: WARNING: Parallel io disabled for netcdf-4 because hdf5 does not support
checking whether parallel I/O is enabled for netcdf-4... no
# Features
--------
Benchmarks:		no
NetCDF-2 API:		yes
HDF4 Support:		no
HDF5 Support:		yes
NetCDF-4 API:		yes
CDF5 Support:		yes
NC-4 Parallel Support:	no
PnetCDF Support:	no

It seems that upstream netcdf-c is not testing MPI on Windows (MSYS2, mingw):
https://github.com/Unidata/netcdf-c/actions/runs/8745640065/job/24001010636

# Features
--------
Benchmarks:		no
NetCDF-2 API:		yes
HDF4 Support:		no
HDF5 Support:		yes
NetCDF-4 API:		yes
CDF5 Support:		yes
NC-4 Parallel Support:	no
PnetCDF Support:	no

@Alexander-Barth (Owner) commented Apr 24, 2024

It is not clear whether HDF5_jll actually has MPI enabled on Windows:

https://github.com/JuliaBinaryWrappers/HDF5_jll.jl/releases/download/HDF5-v1.14.3%2B3/HDF5-logs.v1.14.3.x86_64-w64-mingw32-libgfortran3-cxx03-mpi+microsoftmpi.tar.gz

Features:
---------
                     Parallel HDF5: no
  Parallel Filtered Dataset Writes: no
                Large Parallel I/O: no

If somebody with an interest in Windows could have a look at this, that would be awesome :-).

@pgf commented Apr 24, 2024

There is some initial work in 70ef683 in the branch MPI.

Is it new? I don't remember seeing it when I checked last week.
Anyway, it's almost identical to my version, except for a few details. For example, I called the access method paraccess to be more explicit, but that's fine.

A couple of notes:

  • the access dataset method works only with netcdf3 files, while it throws an error with netcdf4 files because nc_var_par_access doesn't recognize NC_GLOBAL as a valid variable ID. AFAIK, when nc_var_par_access is called on a variable in a netcdf3 file, it sets the access mode globally for the file (see here). I think there's no need for a dataset method; the variable method already does the same for netcdf3 files (see the sketch after this list). Indeed, I too added the dataset method at first, but later changed my mind.

  • I think that the MPI communicator can be an optional argument in most cases, so the user just needs to ask for parallel access. I defined the NCDataset method this way:

    function NCDataset(filename::AbstractString,
                       mode::AbstractString = "r";
                       format::Symbol = :netcdf4,
                       parallel::Bool = false,
                       comm::MPI.Comm = MPI.COMM_WORLD,
                       info::MPI.Info = MPI.INFO_NULL,
                       ...

    I think this gives a cleaner call (similar to the Python netCDF4 package):

    ds = NCDataset(path,"c",parallel=true)

    leaving comm and info for more specific use cases.
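
For context, here is the sketch referenced in the first bullet: a minimal example of per-variable access-mode control in my version of the changes. paraccess and the :collective/:independent symbols are the names discussed above, not a released API; the file and variable names are placeholders.

using MPI, NCDatasets
MPI.Init()

# Keyword form proposed above: collective create over MPI.COMM_WORLD.
ds = NCDataset("out.nc", "c", parallel = true)
defDim(ds, "x", 10)
v = defVar(ds, "temperature", Float64, ("x",))

# Hypothetical per-variable call wrapping nc_var_par_access;
# :collective and :independent mirror NC_COLLECTIVE and NC_INDEPENDENT.
paraccess(v, :collective)

v[:] = collect(1.0:10.0)
close(ds)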

@Alexander-Barth (Owner) commented Apr 25, 2024

Yes, this is new. I started to work on this only a couple of days ago.
I did not know that you also worked on this.

Thank you for your close look at these changes.
Yes, I think paraccess is better (I was indeed looking for a better name, as access is probably too generic).

I also considered making the communicator (or parallel) a keyword argument. But as far as I know, this would mean that MPI becomes a (hard) dependency of NCDatasets, as we cannot dispatch on keyword arguments. I would think that netCDF with MPI makes a very good use case for a weak dependency.

Currently MPI is the only way to have parallel access to netCDF files. For me, MPI does not work so nicely (or at all :-)) for interactive sessions. But maybe in the future there will be other ways to do parallel access (threads, Julia workers?), which could all be extensions onto which we could dispatch.

In mpi4py, all MPI functions are methods of the communicator. So having the MPI communicator as the first argument of NCDataset, as the main argument for dispatch, does not seem too surprising to me.
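
To illustrate, a minimal sketch of the positional-communicator form (the method signature follows the error trace earlier in this thread; the file name is a placeholder):

using MPI, NCDatasets
MPI.Init()

# Dispatching on the positional MPI.Comm lets the MPI-specific code live in a
# package extension, keeping MPI a weak dependency of NCDatasets.
ds = NCDataset(MPI.COMM_WORLD, "parallel_test.nc", "c")
close(ds)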
