Commit
* Revert "Revert "Merge branch '364-add-large-indices-support-to-device-select' into 'develop_stream'" (#334)"

  This reverts commit 1dc3b58.

* [autotune] Add configurable size based fallback
* Fixes from codereview
* Use more extendible format suggested in review
* Use benchmark_manger to store the paths
* [autotune][skip ci] Don't use global state
* [autotune][skip ci] Renamed in json datatype_size_fallback --> fallback_cases
* [autotune][skip ci] _create_fallback_case

  Make the _create_fallback_case function easier to read by extracting
  the search for the best entry by datatype into a separate function.

* [autotune][skip ci] Add minor changes from codereview
* [autotune][skip ci] Use TextIOWrapper instead of strings for paths
* [autotune][skip ci] Correct erroneous help message
* Resolve "Implement the block merge kernel using the merge paths"
* Use correct warp size in default scan config for gfx908
* Update .gitlab-ci.yml file
* Remove order only dependency of cmake-latest from cmake minimum
* Update .clang-format [CI skip]
* Add code format check script and git hook [CI skip]

  By default, the script checks diffs against the default branch
  (develop_stream) for code formatting violations. The commit hook
  checks the current commit. It can be installed by running the
  `install` script in .githooks.

* code format check: Use proper git config key when setting color mode [CI skip]
* use regular check-format script instead of special ci script
* Use rules instead of only/if in gitlab-ci.yml
* Set dependencies for remaining build jobs
* Resolve "Fix rocThrust#280"
* remove manual split, add cmake functions
* introduce semi-automatic split of longest test compilation units
* fix build issue when config tuning is enabled
* applied matrix generation to several device algorithm benchmarks

  remove filter in benchmark config tuning

* move parameters as much as possible to cmake script
* Fix check-format output being truncated, increase robustness

  The output of git-clang-format is now saved to a temporary file to
  bypass the maximum shell command length limit, which was truncating
  the output. Additionally, the script is now more robust to unexpected
  exits: a signal handler performs the clean-up (deleting the temporary
  file and resetting the git settings).

* Added custom configs to tests of device segmented radix sort
* Updated configuration
* 3-way partitioning in device segmented radix sort
* Benchmark device segmented sort seg. length 256
* Reverse iterator [skip ci]
* Device segmented radix sort uses reverse iterator
* Updated changelog
* Applied formatting to changes
* Removed dangling semi-colon
* segmented radix sort: fall back to 2-way partitioning if redundant
* Resolve "Small performance improvement device_merge for small datatypes"
* add reverse iterator to changelog
* Fixing device partition / select / unique
* Fix out-of-bounds access in test_reverse_iterator

  The vector has 5 elements, so accessing index 5 is out of bounds,
  including when it is accessed through a reverse iterator.

Co-authored-by: Lőrinc Serfőző <[email protected]>
Co-authored-by: Balint Soproni <[email protected]>
Co-authored-by: Vince van Heertum <[email protected]>
Co-authored-by: Istvan Kiss <[email protected]>
Co-authored-by: Robin Voetter <[email protected]>
Co-authored-by: Gergely Mészáros <[email protected]>