vlmarkov / Fault-Tolerance-Library (Stars: 5). MPI user-level checkpoint library. Topics: fault-tolerance, incremental, mpi, particles, checkpoint, recovery, redundancy, laplace-equation, nbody-simulation, delta-encoding, ulfm, fault-tolerance-library, jacobi-iteration. Updated Jul 11, 2020. Language: C++.
upperwal / EntangledMPI (Stars: 2). Fault Tolerance framework for High Performance Computing [Supports ULFM, replication and checkpointing]. Topics: replication, fault-tolerance, mpi, checkpoint, checkpoint-restart, fault-injection, ulfm. Updated Aug 31, 2018. Language: C.
lukashuebner / ft-raxml-ng (Stars: 1). RAxML-ng is able to handle hardware failures when run with a failure mitigating MPI implementation (e.g. ULFM). Topics: ulfm, raxml-ng. Updated Feb 29, 2024. Language: C++.