Tags: ddemidov/vexcl
1.4.2

* Two years worth of minor fixes and improvements.
* Added `source_generator::num_groups()` returning the number of workgroups on the compute device.
* Make `push_compile_options`, `push_program_header` behave in a cumulative way.
* Added `profiler::reset()`.
* Added `vector::at()`.
* Support mixed precision in `vex::copy()`.

Version 1.4.0

* Modernize CMake build system. Provide `VexCL::OpenCL`, `VexCL::Compute`, `VexCL::CUDA`, `VexCL::JIT` imported targets, so that users may just
  ```
  add_executable(myprogram myprogram.cpp)
  target_link_libraries(myprogram VexCL::OpenCL)
  ```
  to build a program using the corresponding VexCL backend. Also stop polluting the global CMake namespace with things like `add_definitions()`, `include_directories()`, etc.
* Make `vex::backend::kernel::config()` return a reference to the kernel, so that it is possible to configure and launch the kernel in a single line: `K.config(nblocks, nthreads)(queue, prm1, prm2, prm3);`.
* Implement `vector<T>::reinterpret<U>()` method. It returns a new vector that reinterprets the same data (no copies are made) as the new type.
* Implemented new backend: JIT. The backend generates and compiles C++ kernels with OpenMP support at runtime. The generated code will not be more efficient than hand-written OpenMP code, but it can be easily debugged with a host-side debugger. The backend may also be used to develop and test new code when other backends are not available.
* Allow constants defined with `VEX_CONSTANT` to be cast to their values in host code, so that a constant defined with `VEX_CONSTANT(name, expr)` can be used in host code as `name`. Constants are still usable in vector expressions as `name()`.
* Allow passing generated kernel args for each GPU (#202). Kernel args packed into `std::vector` will be unpacked and passed to the generated kernels on respective devices.
* Reimplemented `vex::SpMat` as `vex::sparse::ell`, `vex::sparse::crs`, `vex::sparse::matrix` (automatically chooses one of the two formats based on the current compute device), and `vex::sparse::distributed<format>` (this one may span several compute devices). The new matrix-vector products are now normal vector expressions, while the old `vex::SpMat` could only be used in additive expressions. The old implementation is still available.
* `vex::sparse::ell` is now converted from host-side CRS format on the compute device, which makes the conversion faster.
* Bug fixes and minor improvements.

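For readers unfamiliar with the two sparse formats named above, here is a minimal host-side sketch of a textbook CRS to ELL conversion, followed by a matrix-vector product over the ELL layout. It is plain C++ and does not use VexCL; `crs_matrix`, `ell_matrix`, `crs_to_ell`, and `ell_spmv` are hypothetical names chosen for illustration, and the storage layout shown is an assumption, not necessarily the one `vex::sparse::ell` uses internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side containers for illustration; these are not VexCL types.
struct crs_matrix {
    std::size_t nrows;
    std::vector<std::size_t> ptr; // row pointers, size nrows + 1
    std::vector<int>         col; // column index of each nonzero
    std::vector<double>      val; // value of each nonzero
};

struct ell_matrix {
    std::size_t nrows;
    std::size_t width;            // maximum number of nonzeros in any row
    std::vector<int>    col;      // column-major, nrows * width; -1 marks padding
    std::vector<double> val;      // column-major, nrows * width; 0 in padding
};

// Pad every row to the same width and store the entries column by column.
ell_matrix crs_to_ell(const crs_matrix &A) {
    assert(A.ptr.size() == A.nrows + 1);

    ell_matrix E;
    E.nrows = A.nrows;
    E.width = 0;
    for (std::size_t i = 0; i < A.nrows; ++i)
        E.width = std::max(E.width, A.ptr[i + 1] - A.ptr[i]);

    E.col.assign(E.nrows * E.width, -1);
    E.val.assign(E.nrows * E.width, 0.0);

    for (std::size_t i = 0; i < A.nrows; ++i) {
        std::size_t k = 0; // position of the entry within row i
        for (std::size_t j = A.ptr[i]; j < A.ptr[i + 1]; ++j, ++k) {
            E.col[k * E.nrows + i] = A.col[j];
            E.val[k * E.nrows + i] = A.val[j];
        }
    }
    return E;
}

// y = E * x, skipping padded entries.
std::vector<double> ell_spmv(const ell_matrix &E, const std::vector<double> &x) {
    std::vector<double> y(E.nrows, 0.0);
    for (std::size_t k = 0; k < E.width; ++k) {
        for (std::size_t i = 0; i < E.nrows; ++i) {
            int c = E.col[k * E.nrows + i];
            if (c >= 0) y[i] += E.val[k * E.nrows + i] * x[c];
        }
    }
    return y;
}
```

The column-major layout in this sketch means that consecutive rows read consecutive memory locations at each step `k`, which is the usual reason ELL is preferred on GPUs, while CRS avoids the padding overhead.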
* Added `vex::tensordot()` operation. Given two tensors (arrays of dimension greater than or equal to one), A and B, and a list of axes pairs (where each pair represents corresponding axes from the two tensors), it sums the products of A's and B's elements over the given axes. Inspired by Python's `numpy.tensordot` operation.
* Expose constant memory space in OpenCL backend.
* Provide shortcut filters `vex::Filter::{CPU,GPU,Accelerator}` for OpenCL backend.
* Added Boost.Compute backend. Core functionality of the Boost.Compute library is used as a replacement for the Khronos C++ API, which seems to be becoming more and more outdated. The Boost.Compute backend is still based on OpenCL, so there are two OpenCL backends now. Define `VEXCL_BACKEND_COMPUTE` to use this backend and make sure Boost.Compute headers are in the include path.

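The contraction described for `vex::tensordot()` can be illustrated with a small host-side reference, restricted to two rank-2 tensors and a single axes pair. This is a plain C++ sketch of the semantics only; it does not call VexCL, and `tensor2` and `tensordot2` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense rank-2 tensor in row-major order; not a VexCL type.
struct tensor2 {
    std::size_t rows, cols;
    std::vector<double> data;

    double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Sum products of A's and B's elements over one axes pair (ax_a, ax_b).
// The result is indexed by the remaining (free) axis of A, then that of B.
tensor2 tensordot2(const tensor2 &A, const tensor2 &B, int ax_a, int ax_b) {
    assert(ax_a == 0 || ax_a == 1);
    assert(ax_b == 0 || ax_b == 1);

    // Lengths of the contracted axes must agree.
    std::size_t na = (ax_a == 0) ? A.rows : A.cols;
    std::size_t nb = (ax_b == 0) ? B.rows : B.cols;
    assert(na == nb);

    // Lengths of the free axes define the shape of the result.
    std::size_t m = (ax_a == 0) ? A.cols : A.rows;
    std::size_t n = (ax_b == 0) ? B.cols : B.rows;

    tensor2 C{m, n, std::vector<double>(m * n, 0.0)};
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < na; ++k) {
                double a = (ax_a == 0) ? A.at(k, i) : A.at(i, k);
                double b = (ax_b == 0) ? B.at(k, j) : B.at(j, k);
                sum += a * b;
            }
            C.data[i * n + j] = sum;
        }
    }
    return C;
}
```

With the axes pair (1, 0) this reduces to the ordinary matrix product, and with (0, 0) it computes the product of the transpose of A with B.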
* API breaking change: `vex::purge_kernel_caches()` family of functions is renamed to `vex::purge_caches()` as the online cache now may hold objects of arbitrary type. The overloads that used to take `vex::backend::kernel_cache_key` now take `const vex::backend::command_queue&`.
* The online cache is now purged whenever `vex::Context` is destroyed. This allows for clean release of OpenCL/CUDA contexts.
* Code for random number generators has been unified between OpenCL and CUDA backends.
* Fast Fourier Transform is now supported for both OpenCL and CUDA backends.
* `vex::backend::kernel` constructor now takes an optional parameter with command line options.
* Performance of CLOGS algorithms has been improved.
* `VEX_BUILTIN_FUNCTION` macro has been made public.
* Minor bug fixes and improvements.

* API breaking change: changed the definition of `VEX_FUNCTION` family of macros. The previous versions are available as `VEX_FUNCTION_V1`.
* Wrapping code for the [clogs](clogs.sourceforge.net) library is added by @bmerry (the author of clogs).
* vector/multivector iterators are now standard-conforming iterators.
* Other minor improvements and bug fixes.