Fix reduce_add, prefix_sum, and doc-build #593

yhmtsai · 2020-07-17T14:07:51Z

This PR fixes some issue in reduce_add and prefix_sum.

reduce_add use deallocated memory if size > default_block_size because we declare the array inside the if-block but use data pointer out of the if-block.
prefix_sum does not need to read the last element of the array (initcheck reports it) and compute the total sum of the last block.
Also, use components::prefix_sum not start_prefix_sum in sampleselect_count. otherwise, it gets the wrong result from this fix because the last block_sum will not include the last element.
start_prefix_sum uses size_type as shared_memory for index_type or size_type, which leads wrong result if some values are negative. This fix also makes it correct on floating point type.

minor fix: fix the doc-build

cuda/components/prefix_sum.cu

.gitlab-ci.yml

codecov · 2020-07-18T08:17:15Z

Codecov Report

Merging #593 into develop will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop     #593   +/-   ##
========================================
  Coverage    84.16%   84.16%           
========================================
  Files          296      296           
  Lines        20656    20656           
========================================
  Hits         17385    17385           
  Misses        3271     3271

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f27faf3...c8cdecd. Read the comment docs.

upsj

LGTM!

pratikvn

LGTM!

common/factorization/par_ilut_select_kernels.hpp.inc

1. use components prefix_sum in parilut 2. prefix_sum uses correct shared_memory value_type

pratikvn · 2020-07-24T06:44:26Z

Currently there is a problem with the documentation deployment as the GLOB picks up the build-setup.sh as well. I think if you add build-setup.sh to this line in doc/examples/CMakeLists.txt, removing it, everything should work again.

Or if you really hate GLOB, then you could add the specific examples list to the FILE capturing command. But I think GLOB is suited here as we dont really want to modify the list every time an example is added.

thoasm

LGTM, just a small documentation update would be nice to document the change.

common/components/prefix_sum.hpp.inc

doc/examples/CMakeLists.txt

Co-authored-by: Tobias Ribizel <[email protected]> Co-authored-by: Thomas Grützmacher <[email protected]>

Co-authored-by: Pratik Nayak <[email protected]>

sonarcloud · 2020-07-24T15:48:33Z

SonarCloud Quality Gate failed.

0 Bugs
0 Vulnerabilities (and 0 Security Hotspots to review)
2 Code Smells

No Coverage information
32.1% Duplication

The version of Java (1.8.0_121) you have used to run this analysis is deprecated and we will stop accepting it from October 2020. Please update to at least Java 11.
Read more here

Release 1.3.0 of Ginkgo. The Ginkgo team is proud to announce the new minor release of Ginkgo version 1.3.0. This release brings CUDA 11 support, changes the default C++ standard to be C++14 instead of C++11, adds a new Diagonal matrix format and capacity for diagonal extraction, significantly improves the CMake configuration output format, adds the Ginkgo paper which got accepted into the Journal of Open Source Software (JOSS), and fixes multiple issues. Supported systems and requirements: + For all platforms, cmake 3.9+ + Linux and MacOS + gcc: 5.3+, 6.3+, 7.3+, all versions after 8.1+ + clang: 3.9+ + Intel compiler: 2017+ + Apple LLVM: 8.0+ + CUDA module: CUDA 9.0+ + HIP module: ROCm 2.8+ + Windows + MinGW and Cygwin: gcc 5.3+, 6.3+, 7.3+, all versions after 8.1+ + Microsoft Visual Studio: VS 2017 15.7+ + CUDA module: CUDA 9.0+, Microsoft Visual Studio + OpenMP module: MinGW or Cygwin. The current known issues can be found in the [known issues page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues). Additions: + Add paper for Journal of Open Source Software (JOSS). [#479](#479) + Add a DiagonalExtractable interface. [#563](#563) + Add a new diagonal Matrix Format. [#580](#580) + Add Cuda11 support. [#603](#603) + Add information output after CMake configuration. [#610](#610) + Add a new preconditioner export example. [#595](#595) + Add a new cuda-memcheck CI job. [#592](#592) Changes: + Use unified memory in CUDA debug builds. [#621](#621) + Improve `BENCHMARKING.md` with more detailed info. [#619](#619) + Use C++14 standard instead of C++11. [#611](#611) + Update the Ampere sm information and CudaArchitectureSelector. [#588](#588) Fixes: + Fix documentation warnings and errors. [#624](#624) + Fix warnings for diagonal matrix format. [#622](#622) + Fix criterion factory parameters in CUDA. [#586](#586) + Fix the norm-type in the examples. [#612](#612) + Fix the WAW race in OpenMP is_sorted_by_column_index. [#617](#617) + Fix the example's exec_map by creating the executor only if requested. [#602](#602) + Fix some CMake warnings. [#614](#614) + Fix Windows building documentation. [#601](#601) + Warn when CXX and CUDA host compiler do not match. [#607](#607) + Fix reduce_add, prefix_sum, and doc-build. [#593](#593) + Fix find_library(cublas) issue on machines installing multiple cuda. [#591](#591) + Fix allocator in sellp read. [#589](#589) + Fix the CAS with HIP and NVIDIA backends. [#585](#585) Deletions: + Remove unused preconditioner parameter in LowerTrs. [#587](#587) Related PR: #625

The Ginkgo team is proud to announce the new minor release of Ginkgo version 1.3.0. This release brings CUDA 11 support, changes the default C++ standard to be C++14 instead of C++11, adds a new Diagonal matrix format and capacity for diagonal extraction, significantly improves the CMake configuration output format, adds the Ginkgo paper which got accepted into the Journal of Open Source Software (JOSS), and fixes multiple issues. Supported systems and requirements: + For all platforms, cmake 3.9+ + Linux and MacOS + gcc: 5.3+, 6.3+, 7.3+, all versions after 8.1+ + clang: 3.9+ + Intel compiler: 2017+ + Apple LLVM: 8.0+ + CUDA module: CUDA 9.0+ + HIP module: ROCm 2.8+ + Windows + MinGW and Cygwin: gcc 5.3+, 6.3+, 7.3+, all versions after 8.1+ + Microsoft Visual Studio: VS 2017 15.7+ + CUDA module: CUDA 9.0+, Microsoft Visual Studio + OpenMP module: MinGW or Cygwin. The current known issues can be found in the [known issues page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues). Additions: + Add paper for Journal of Open Source Software (JOSS). [#479](#479) + Add a DiagonalExtractable interface. [#563](#563) + Add a new diagonal Matrix Format. [#580](#580) + Add Cuda11 support. [#603](#603) + Add information output after CMake configuration. [#610](#610) + Add a new preconditioner export example. [#595](#595) + Add a new cuda-memcheck CI job. [#592](#592) Changes: + Use unified memory in CUDA debug builds. [#621](#621) + Improve `BENCHMARKING.md` with more detailed info. [#619](#619) + Use C++14 standard instead of C++11. [#611](#611) + Update the Ampere sm information and CudaArchitectureSelector. [#588](#588) Fixes: + Fix documentation warnings and errors. [#624](#624) + Fix warnings for diagonal matrix format. [#622](#622) + Fix criterion factory parameters in CUDA. [#586](#586) + Fix the norm-type in the examples. [#612](#612) + Fix the WAW race in OpenMP is_sorted_by_column_index. [#617](#617) + Fix the example's exec_map by creating the executor only if requested. [#602](#602) + Fix some CMake warnings. [#614](#614) + Fix Windows building documentation. [#601](#601) + Warn when CXX and CUDA host compiler do not match. [#607](#607) + Fix reduce_add, prefix_sum, and doc-build. [#593](#593) + Fix find_library(cublas) issue on machines installing multiple cuda. [#591](#591) + Fix allocator in sellp read. [#589](#589) + Fix the CAS with HIP and NVIDIA backends. [#585](#585) Deletions: + Remove unused preconditioner parameter in LowerTrs. [#587](#587) Related PR: #627

yhmtsai added is:bug Something looks wrong. mod:cuda This is related to the CUDA module. 1:ST:ready-for-review This PR is ready for review mod:hip This is related to the HIP module. labels Jul 17, 2020

yhmtsai requested review from upsj, pratikvn, hartwiganzt and fritzgoebel July 17, 2020 14:07

yhmtsai self-assigned this Jul 17, 2020

upsj reviewed Jul 17, 2020

View reviewed changes

cuda/components/prefix_sum.cu Outdated Show resolved Hide resolved

upsj reviewed Jul 17, 2020

View reviewed changes

.gitlab-ci.yml Outdated Show resolved Hide resolved

upsj reviewed Jul 17, 2020

View reviewed changes

.gitlab-ci.yml Outdated Show resolved Hide resolved

yhmtsai force-pushed the fix_reduceadd_and_prefixsum branch 2 times, most recently from 1b3b710 to de10d84 Compare July 21, 2020 08:39

upsj approved these changes Jul 22, 2020

View reviewed changes

pratikvn approved these changes Jul 22, 2020

View reviewed changes

common/factorization/par_ilut_select_kernels.hpp.inc Show resolved Hide resolved

yhmtsai added 2 commits July 23, 2020 17:15

fix prefix_sum and reduce_add issue

efc6796

fix parilut and prefix_sum issue

0c3815c

1. use components prefix_sum in parilut 2. prefix_sum uses correct shared_memory value_type

yhmtsai force-pushed the fix_reduceadd_and_prefixsum branch from de10d84 to fceb694 Compare July 23, 2020 15:15

upsj added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Jul 23, 2020

thoasm approved these changes Jul 24, 2020

View reviewed changes

common/components/prefix_sum.hpp.inc Show resolved Hide resolved

upsj reviewed Jul 24, 2020

View reviewed changes

doc/examples/CMakeLists.txt Outdated Show resolved Hide resolved

yhmtsai force-pushed the fix_reduceadd_and_prefixsum branch from 9c4a23f to 8bf3c70 Compare July 24, 2020 09:15

yhmtsai and others added 2 commits July 24, 2020 12:01

review update

27420f2

Co-authored-by: Tobias Ribizel <[email protected]> Co-authored-by: Thomas Grützmacher <[email protected]>

fix doc-build

c8cdecd

Co-authored-by: Pratik Nayak <[email protected]>

yhmtsai force-pushed the fix_reduceadd_and_prefixsum branch from 8bf3c70 to c8cdecd Compare July 24, 2020 10:01

yhmtsai changed the title ~~Fix reduce_add and prefix_sum~~ Fix reduce_add, prefix_sum, and doc-build Jul 24, 2020

yhmtsai merged commit 95e0ff5 into develop Jul 24, 2020

yhmtsai deleted the fix_reduceadd_and_prefixsum branch July 24, 2020 19:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix reduce_add, prefix_sum, and doc-build #593

Fix reduce_add, prefix_sum, and doc-build #593

yhmtsai commented Jul 17, 2020 •

edited

Loading

codecov bot commented Jul 18, 2020 •

edited

Loading

upsj left a comment

pratikvn left a comment

pratikvn commented Jul 24, 2020 •

edited

Loading

thoasm left a comment

sonarcloud bot commented Jul 24, 2020

Fix reduce_add, prefix_sum, and doc-build #593

Fix reduce_add, prefix_sum, and doc-build #593

Conversation

yhmtsai commented Jul 17, 2020 • edited Loading

codecov bot commented Jul 18, 2020 • edited Loading

Codecov Report

upsj left a comment

Choose a reason for hiding this comment

pratikvn left a comment

Choose a reason for hiding this comment

pratikvn commented Jul 24, 2020 • edited Loading

thoasm left a comment

Choose a reason for hiding this comment

sonarcloud bot commented Jul 24, 2020

yhmtsai commented Jul 17, 2020 •

edited

Loading

codecov bot commented Jul 18, 2020 •

edited

Loading

pratikvn commented Jul 24, 2020 •

edited

Loading