Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ScaledReordered interface #1059

Merged
merged 9 commits into from
Nov 3, 2022
Merged

ScaledReordered interface #1059

merged 9 commits into from
Nov 3, 2022

Conversation

fritzgoebel
Copy link
Collaborator

This PR adds an interface for wrapping an inner operator (like a direct or iterative solver) into a (for now symmetrical, e.g. RCM) reordering via a reordering factory and diagonal scaling via one row and one column scaling diagonal matrix.

The ScaledReordered LinOp applies the given scaling and reordering to the system matrix and generates the inner operator on the modified problem. It manages three vectors, a right-hand side and a solution vector for the inner operator, and an intermediate vector used for the permutations.

With a permutation matrix P, a row scaling matrix R, and a column matrix C, the system matrix and vectors are scaled and permuted such that instead of the original system of equations
A*x = b
the inner operator solves
P*R*A*C*P^T*y = P*R*b.
Hence, when generating a ScaledReordered LinOp, the new system matrix P*R*A*C*P^T is computed. When it is applied to vectors b and x, the right-hand side vector is modified to P*R*b, and the solution y to the inner system gives the solution x to the original system via x = C*P^T*y.
When using an inner operator that uses an initial guess, currently the initial guess for the inner system is computed by y_0 = P*C^(-1)*x_0, which is the same as using x_0 for the system P*R*A*x = P*R*b. For the inverse diagonal scaling here I added an inverse_apply to the matrix::Diagonal. I still need to add tests for that.

Any feedback is welcome, especially on ideas regarding how to handle algorithms like RCM that generate a reordering, but also be able to handle MC64 that generates both a reordering and scaling coefficients and simpler equilibration techniques that only generate scaling coefficients.

@fritzgoebel fritzgoebel added is:new-feature A request or implementation of a feature that does not exist yet. mod:core This is related to the core module. 1:ST:WIP This PR is a work in progress. Not ready for review. 1:ST:need-feedback The PR is somewhat ready but feedback on a blocking topic is required before a proper review. labels Jun 21, 2022
@fritzgoebel fritzgoebel requested a review from a team June 21, 2022 16:59
@fritzgoebel fritzgoebel self-assigned this Jun 21, 2022
@ginkgo-bot ginkgo-bot added mod:all This touches all Ginkgo modules. reg:build This is related to the build system. reg:testing This is related to testing. type:matrix-format This is related to the Matrix formats type:reordering This is related to the matrix(LinOp) reordering labels Jun 21, 2022
@fritzgoebel
Copy link
Collaborator Author

format!

@fritzgoebel
Copy link
Collaborator Author

format!

Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about removing the ValueType and IndexType from the ScaledReordered.
set_cache_to you can use create_with_config_of to create cache.
some requirement of ValueType / IndexType can be done by run
Will it use ScaledReordered even when only need scale or reorder part?
Not a Scaled and Reordered class to handle them separately and ScaledReordered inherit from Scaled and Reordered.

GKO_ASSERT_EQUAL_DIMENSIONS(parameters_.row_scaling,
system_matrix_);
row_scaling_ = parameters_.row_scaling;
row_scaling_->apply(system_matrix_.get(), system_matrix_.get());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is dangerous in my opinion. it also destroys the __restrict__ property for GPU.
You can add the in place operation when detecting the input and output are the same pointer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#988 <---
for some LinOps, this is perfectly safe - in-place scaling is one of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at least in my imagine about what compiler works. TBH, I wouldn't like to challenge the compiler did the same thing as imagine.

Comment on lines 202 to 204
cache_.inner_b->copy_from(b);
cache_.inner_x->copy_from(x);
cache_.intermediate->copy_from(b);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need to copy the data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply_uses_initial_guess()?

omp/test/reorder/rcm_kernels.cpp Show resolved Hide resolved
include/ginkgo/core/matrix/diagonal.hpp Show resolved Hide resolved
@upsj upsj requested review from a team and upsj July 12, 2022 08:21
@upsj upsj linked an issue Jul 13, 2022 that may be closed by this pull request
@fritzgoebel fritzgoebel added 1:ST:ready-for-review This PR is ready for review and removed 1:ST:WIP This PR is a work in progress. Not ready for review. labels Jul 18, 2022
@tcojean tcojean added the 1:ST:high-importance This issue/PR is of high importance and must be addressed as soon as possible. label Aug 23, 2022
@tcojean tcojean added this to the Ginkgo 1.5.0 milestone Aug 24, 2022
@pratikvn pratikvn self-requested a review September 19, 2022 12:49
Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. LGTM!

#include <ginkgo/core/reorder/reordering_base.hpp>


namespace gko {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fits better in the reorder/ folder rather than in base/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, before merging this we should agree on naming and placement.

* inner operator is applied to the system matrix P*R*A*C*P^T instead of A,
* so the instead of A*x = b the inner operator attempts to solve the equivalent
* linear system P*R*A*C*P^T*y = P*R*b and retrieves the solution x = C*P^T*y.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe note that the input matrix is cloned and not modified.

LinOp* x) const override;

/**
* Prepares the intermediate right hand side, solution and intermediateb
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Prepares the intermediate right hand side, solution and intermediateb
* Prepares the intermediate right hand side, solution and intermediate

include/ginkgo/ginkgo.hpp Outdated Show resolved Hide resolved
Comment on lines 64 to 69
virtual std::shared_ptr<const gko::LinOp> get_permutation() const = 0;

virtual std::shared_ptr<const gko::LinOp> get_inverse_permutation()
const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to have these getters here ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the scaled reordered factory takes a ReorderingBaseFactory, I think these are necessary here rather than only in the algorithm implemenations.

@yhmtsai
Copy link
Member

yhmtsai commented Sep 26, 2022

AFAIR, combing scaled and reordered together allows some algorithms uses only one operation.
how do you do that when a new LinOp?

@fritzgoebel
Copy link
Collaborator Author

AFAIR, combing scaled and reordered together allows some algorithms uses only one operation. how do you do that when a new LinOp?

At the moment, not at all. So far the operations are not merged yet.

@yhmtsai
Copy link
Member

yhmtsai commented Oct 5, 2022

will the merged operations be in this class or another class inherited from this class?
if there's another class, the clone will only give the clone of ScaledReordered class not itself.

Copy link
Member

@thoasm thoasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an interface break at the moment (the only reason why I don't approve this PR).
Unfortunately, it will be quite difficult to find a solution that does not break the existing interface while being intuitive to use (for example, adding get_permutation_2 would not break the interface, but be quite confusing for the user).
Additionally, the actual reordering inside the constructor of ScaledReordered needs a bit more work.

core/base/reordered.cpp Outdated Show resolved Hide resolved
include/ginkgo/core/base/reordered.hpp Outdated Show resolved Hide resolved
include/ginkgo/core/reorder/reordering_base.hpp Outdated Show resolved Hide resolved
include/ginkgo/core/reorder/rcm.hpp Outdated Show resolved Hide resolved
Comment on lines 187 to 188
SR::build().with_row_scaling(this->diag).on(this->exec);
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (for AAA separation):

Suggested change
SR::build().with_row_scaling(this->diag).on(this->exec);
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);
SR::build().with_row_scaling(this->diag).on(this->exec);
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);

Comment on lines 277 to 279
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);

auto before_system_matrix = scaled_reordered->get_system_matrix();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Suggested change
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);
auto before_system_matrix = scaled_reordered->get_system_matrix();
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);
auto before_system_matrix = scaled_reordered->get_system_matrix();

Comment on lines 299 to 301
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);

auto before_system_matrix = scaled_reordered->get_system_matrix();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here:

Suggested change
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);
auto before_system_matrix = scaled_reordered->get_system_matrix();
auto scaled_reordered = scaled_reordered_fact->generate(this->mtx);
auto before_system_matrix = scaled_reordered->get_system_matrix();

Comment on lines 157 to 165
auto reordering = parameters_.reordering->generate(system_matrix_);
auto perm = reordering->get_permutation();
permutation_ =
clone(exec, as<matrix::Permutation<index_type>>(perm));
auto permutation_array =
make_array_view(exec, permutation_->get_size()[0],
permutation_->get_permutation());
system_matrix_ = as<Permutable<index_type>>(system_matrix_)
->permute(&permutation_array);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires your reordering to return a matrix::Permutation. However, this is not documented and not intuitive (your ReorderingBaseFactory needs get_permutation() to return a matrix::Permutation).
Is there a reason why you don't call the apply method on the permutation matrix? That would allow for more formats than just matrix::Permutation and would work a lot better with your new interface.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with an array now, that works for all kinds of permutations and does not assume we have a permutation matrix.

include/ginkgo/ginkgo.hpp Outdated Show resolved Hide resolved
include/ginkgo/core/base/reordered.hpp Outdated Show resolved Hide resolved
include/ginkgo/core/base/reordered.hpp Outdated Show resolved Hide resolved
Comment on lines 228 to 233
cache_struct() = default;
~cache_struct() = default;
cache_struct(const cache_struct&) {}
cache_struct(cache_struct&&) {}
cache_struct& operator=(const cache_struct&) { return *this; }
cache_struct& operator=(cache_struct&&) { return *this; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need one newline between these function

include/ginkgo/core/reorder/rcm.hpp Outdated Show resolved Hide resolved
void set_cache_to(const LinOp* b, const LinOp* x) const
{
if (cache_.inner_b == nullptr) {
const auto size = b->get_size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you also need to check the size. When the size is different, create the new cache

@fritzgoebel
Copy link
Collaborator Author

format!

Copy link
Member

@thoasm thoasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the improvements. I have only minor requests, after addressing these, you can feel free and merge it.

include/ginkgo/core/reorder/reordering_base.hpp Outdated Show resolved Hide resolved
reference/test/reorder/scaled_reordered.cpp Outdated Show resolved Hide resolved
include/ginkgo/core/reorder/scaled_reordered.hpp Outdated Show resolved Hide resolved
include/ginkgo/core/reorder/scaled_reordered.hpp Outdated Show resolved Hide resolved
Copy link
Member

@upsj upsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for what we can currently change - I think there is still a lot of work to be done in the Permutation, Diagonal and ScaledPermutation area, but that needs to happen in Ginkgo 2.0

Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. I have discussed with Fritz about the combining the permutation and scaling. It will change the implementation not a new class, so my above worry will not happen.

I put it is request changes because it may give error when inner_operator is nullptr (generated Identity)

Comment on lines 242 to 247
std::unique_ptr<matrix::Dense<value_type>> inner_b{};

std::unique_ptr<matrix::Dense<value_type>> inner_x{};

std::unique_ptr<matrix::Dense<value_type>> intermediate{};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::unique_ptr<matrix::Dense<value_type>> inner_b{};
std::unique_ptr<matrix::Dense<value_type>> inner_x{};
std::unique_ptr<matrix::Dense<value_type>> intermediate{};
std::unique_ptr<matrix::Dense<value_type>> inner_b{};
std::unique_ptr<matrix::Dense<value_type>> inner_x{};
std::unique_ptr<matrix::Dense<value_type>> intermediate{};

nit: these are data

inner_operator_ =
parameters_.inner_operator->generate(system_matrix_);
} else {
inner_operator_ = gko::matrix::Identity<value_type>::create(exec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to put the size, apply will check the size with b, x

reference/test/reorder/scaled_reordered.cpp Show resolved Hide resolved
Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. two concern: 1. public interface from using alias 2. move the get_permutation_array into ReorderBase.

public:
using index_type = IndexType;

virtual const array<index_type>& get_permutation_array() const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also provide the setter for permutation_array in protected and the permutation data in private.
without implement it in each class

/**
* Declares an Abstract Factory specialized for ReorderingBases
*/
using ReorderingBaseFactory =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public interface?

Copy link
Collaborator Author

@fritzgoebel fritzgoebel Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the IndexType to these two. But I think technically, the second one is still an interface break since in case someone used the PolymorphicBase, this would no longer work...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine if you break the interface at that point as I don't think a lot of people (my assumption is 0 outside of us) were using it. If it is compiled with c++17, the interface is no longer broken, so I would consider it as acceptable.

@fritzgoebel
Copy link
Collaborator Author

Depends on #1171 now.

@sonarcloud
Copy link

sonarcloud bot commented Nov 3, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 18 Code Smells

90.0% 90.0% Coverage
3.1% 3.1% Duplication

@fritzgoebel
Copy link
Collaborator Author

Ready to merge after #1171

@ginkgo-bot
Copy link
Member

Note: This PR changes the Ginkgo ABI:

Functions changes summary: 166 Removed, 168 Changed (256 filtered out), 1445 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

For details check the full ABI diff under Artifacts here

@fritzgoebel fritzgoebel merged commit 1fd207a into develop Nov 3, 2022
@fritzgoebel fritzgoebel deleted the reordered_interface branch November 3, 2022 16:36
tcojean added a commit that referenced this pull request Nov 12, 2022
Advertise release 1.5.0 and last changes

+ Add changelog,
+ Update third party libraries
+ A small fix to a CMake file

See PR: #1195

The Ginkgo team is proud to announce the new Ginkgo minor release 1.5.0. This release brings many important new features such as:
- MPI-based multi-node support for all matrix formats and most solvers;
- full DPC++/SYCL support,
- functionality and interface for GPU-resident sparse direct solvers,
- an interface for wrapping solvers with scaling and reordering applied,
- a new algebraic Multigrid solver/preconditioner,
- improved mixed-precision support,
- support for device matrix assembly,

and much more.

If you face an issue, please first check our [known issues page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues) and the [open issues list](https://github.com/ginkgo-project/ginkgo/issues) and if you do not find a solution, feel free to [open a new issue](https://github.com/ginkgo-project/ginkgo/issues/new/choose) or ask a question using the [github discussions](https://github.com/ginkgo-project/ginkgo/discussions).

Supported systems and requirements:
+ For all platforms, CMake 3.13+
+ C++14 compliant compiler
+ Linux and macOS
  + GCC: 5.5+
  + clang: 3.9+
  + Intel compiler: 2018+
  + Apple LLVM: 8.0+
  + NVHPC: 22.7+
  + Cray Compiler: 14.0.1+
  + CUDA module: CUDA 9.2+ or NVHPC 22.7+
  + HIP module: ROCm 4.0+
  + DPC++ module: Intel OneAPI 2021.3 with oneMKL and oneDPL. Set the CXX compiler to `dpcpp`.
+ Windows
  + MinGW and Cygwin: GCC 5.5+
  + Microsoft Visual Studio: VS 2019
  + CUDA module: CUDA 9.2+, Microsoft Visual Studio
  + OpenMP module: MinGW or Cygwin.


Algorithm and important feature additions:
+ Add MPI-based multi-node for all matrix formats and solvers (except GMRES and IDR). ([#676](#676), [#908](#908), [#909](#909), [#932](#932), [#951](#951), [#961](#961), [#971](#971), [#976](#976), [#985](#985), [#1007](#1007), [#1030](#1030), [#1054](#1054), [#1100](#1100), [#1148](#1148))
+ Porting the remaining algorithms (preconditioners like ISAI, Jacobi, Multigrid, ParILU(T) and ParIC(T)) to DPC++/SYCL, update to SYCL 2020, and improve support and performance ([#896](#896), [#924](#924), [#928](#928), [#929](#929), [#933](#933), [#943](#943), [#960](#960), [#1057](#1057), [#1110](#1110),  [#1142](#1142))
+ Add a Sparse Direct interface supporting GPU-resident numerical LU factorization, symbolic Cholesky factorization, improved triangular solvers, and more ([#957](#957), [#1058](#1058), [#1072](#1072), [#1082](#1082))
+ Add a ScaleReordered interface that can wrap solvers and automatically apply reorderings and scalings ([#1059](#1059))
+ Add a Multigrid solver and improve the aggregation based PGM coarsening scheme ([#542](#542), [#913](#913), [#980](#980), [#982](#982),  [#986](#986))
+ Add infrastructure for unified, lambda-based, backend agnostic, kernels and utilize it for some simple kernels ([#833](#833), [#910](#910), [#926](#926))
+ Merge different CUDA, HIP, DPC++ and OpenMP tests under a common interface ([#904](#904), [#973](#973), [#1044](#1044), [#1117](#1117))
+ Add a device_matrix_data type for device-side matrix assembly ([#886](#886), [#963](#963), [#965](#965))
+ Add support for mixed real/complex BLAS operations ([#864](#864))
+ Add a FFT LinOp for all but DPC++/SYCL ([#701](#701))
+ Add FBCSR support for NVIDIA and AMD GPUs and CPUs with OpenMP ([#775](#775))
+ Add CSR scaling ([#848](#848))
+ Add array::const_view and equivalent to create constant matrices from non-const data ([#890](#890))
+ Add a RowGatherer LinOp supporting mixed precision to gather dense matrix rows ([#901](#901))
+ Add mixed precision SparsityCsr SpMV support ([#970](#970))
+ Allow creating CSR submatrix including from (possibly discontinuous) index sets ([#885](#885), [#964](#964))
+ Add a scaled identity addition (M <- aI + bM) feature interface and impls for Csr and Dense ([#942](#942))


Deprecations and important changes:
+ Deprecate AmgxPgm in favor of the new Pgm name. ([#1149](#1149)).
+ Deprecate specialized residual norm classes in favor of a common `ResidualNorm` class ([#1101](#1101))
+ Deprecate CamelCase non-polymorphic types in favor of snake_case versions (like array, machine_topology, uninitialized_array, index_set) ([#1031](#1031), [#1052](#1052))
+ Bug fix: restrict gko::share to rvalue references (*possible interface break*) ([#1020](#1020))
+ Bug fix: when using cuSPARSE's triangular solvers, specifying the factory parameter `num_rhs` is now required when solving for more than one right-hand side, otherwise an exception is thrown ([#1184](#1184)).
+ Drop official support for old CUDA < 9.2 ([#887](#887))


Improved performance additions:
+ Reuse tmp storage in reductions in solvers and add a mutable workspace to all solvers ([#1013](#1013), [#1028](#1028))
+ Add HIP unsafe atomic option for AMD ([#1091](#1091))
+ Prefer vendor implementations for Dense dot, conj_dot and norm2 when available ([#967](#967)).
+ Tuned OpenMP SellP, COO, and ELL SpMV kernels for a small number of RHS ([#809](#809))


Fixes:
+ Fix various compilation warnings ([#1076](#1076), [#1183](#1183), [#1189](#1189))
+ Fix issues with hwloc-related tests ([#1074](#1074))
+ Fix include headers for GCC 12 ([#1071](#1071))
+ Fix for simple-solver-logging example ([#1066](#1066))
+ Fix for potential memory leak in Logger ([#1056](#1056))
+ Fix logging of mixin classes ([#1037](#1037))
+ Improve value semantics for LinOp types, like moved-from state in cross-executor copy/clones ([#753](#753))
+ Fix some matrix SpMV and conversion corner cases ([#905](#905), [#978](#978))
+ Fix uninitialized data ([#958](#958))
+ Fix CUDA version requirement for cusparseSpSM ([#953](#953))
+ Fix several issues within bash-script ([#1016](#1016))
+ Fixes for `NVHPC` compiler support ([#1194](#1194))


Other additions:
+ Simplify and properly name GMRES kernels ([#861](#861))
+ Improve pkg-config support for non-CMake libraries ([#923](#923), [#1109](#1109))
+ Improve gdb pretty printer ([#987](#987), [#1114](#1114))
+ Add a logger highlighting inefficient allocation and copy patterns ([#1035](#1035))
+ Improved and optimized test random matrix generation ([#954](#954), [#1032](#1032))
+ Better CSR strategy defaults ([#969](#969))
+ Add `move_from` to `PolymorphicObject` ([#997](#997))
+ Remove unnecessary device_guard usage ([#956](#956))
+ Improvements to the generic accessor for mixed-precision ([#727](#727))
+ Add a naive lower triangular solver implementation for CUDA ([#764](#764))
+ Add support for int64 indices from CUDA 11 onward with SpMV and SpGEMM ([#897](#897))
+ Add a L1 norm implementation ([#900](#900))
+ Add reduce_add for arrays ([#831](#831))
+ Add utility to simplify Dense View creation from an existing Dense vector ([#1136](#1136)).
+ Add a custom transpose implementation for Fbcsr and Csr transpose for unsupported vendor types ([#1123](#1123))
+ Make IDR random initilization deterministic ([#1116](#1116))
+ Move the algorithm choice for triangular solvers from Csr::strategy_type to a factory parameter ([#1088](#1088))
+ Update CUDA archCoresPerSM ([#1175](#1116))
+ Add kernels for Csr sparsity pattern lookup ([#994](#994))
+ Differentiate between structural and numerical zeros in Ell/Sellp ([#1027](#1027))
+ Add a binary IO format for matrix data ([#984](#984))
+ Add a tuple zip_iterator implementation ([#966](#966))
+ Simplify kernel stubs and declarations ([#888](#888))
+ Simplify GKO_REGISTER_OPERATION with lambdas ([#859](#859))
+ Simplify copy to device in tests and examples ([#863](#863))
+ More verbose output to array assertions ([#858](#858))
+ Allow parallel compilation for Jacobi kernels ([#871](#871))
+ Change clang-format pointer alignment to left ([#872](#872))
+ Various improvements and fixes to the benchmarking framework ([#750](#750), [#759](#759), [#870](#870), [#911](#911), [#1033](#1033), [#1137](#1137))
+ Various documentation improvements ([#892](#892), [#921](#921), [#950](#950), [#977](#977), [#1021](#1021), [#1068](#1068), [#1069](#1069), [#1080](#1080), [#1081](#1081), [#1108](#1108), [#1153](#1153), [#1154](#1154))
+ Various CI improvements ([#868](#868), [#874](#874), [#884](#884), [#889](#889), [#899](#899), [#903](#903),  [#922](#922), [#925](#925), [#930](#930), [#936](#936), [#937](#937), [#958](#958), [#882](#882), [#1011](#1011), [#1015](#1015), [#989](#989), [#1039](#1039), [#1042](#1042), [#1067](#1067), [#1073](#1073), [#1075](#1075), [#1083](#1083), [#1084](#1084), [#1085](#1085), [#1139](#1139), [#1178](#1178), [#1187](#1187))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:high-importance This issue/PR is of high importance and must be addressed as soon as possible. 1:ST:need-feedback The PR is somewhat ready but feedback on a blocking topic is required before a proper review. 1:ST:ready-for-review This PR is ready for review 1:ST:ready-to-merge This PR is ready to merge. 1:ST:run-full-test is:new-feature A request or implementation of a feature that does not exist yet. mod:all This touches all Ginkgo modules. mod:core This is related to the core module. reg:build This is related to the build system. reg:testing This is related to testing. type:matrix-format This is related to the Matrix formats type:reordering This is related to the matrix(LinOp) reordering
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve reordering interface
7 participants