Tags: ddemidov/vexcl

1.4.3

* C++ OpenCL wrappers are now included via CL/opencl.hpp (recommended by
  Khronos) or CL/cl2.hpp (deprecated).
* Minor fixes

1.4.2

* Two years' worth of minor fixes and improvements.
* Added `source_generator::num_groups()` returning the number of
  workgroups on the compute device.
* Make `push_compile_options`, `push_program_header` behave in a cumulative way.
* Added `profiler::reset()`.
* Added `vector::at()`.
* Support mixed precision in `vex::copy()`.

1.4.1

A bug fix release

* Improvements for cmake scripts
* Bug fixes

1.4.0

* Modernize cmake build system.
  Provide `VexCL::OpenCL`, `VexCL::Compute`, `VexCL::CUDA`, `VexCL::JIT`
  imported targets, so that users may simply write
  ```
  add_executable(myprogram myprogram.cpp)
  target_link_libraries(myprogram VexCL::OpenCL)
  ```
  to build a program using the corresponding VexCL backend.
  Also stop polluting the global cmake namespace with calls such as
  `add_definitions()`, `include_directories()`, etc.
* Make `vex::backend::kernel::config()` return a reference to the kernel, so
  that it is possible to configure and launch the kernel in a single line:
  `K.config(nblocks, nthreads)(queue, prm1, prm2, prm3);`.
* Implement `vector<T>::reinterpret<U>()` method.  It returns a new vector that
  reinterprets the same data (no copies are made) as the new type.
* Implemented a new backend: JIT. The backend generates and compiles C++
  kernels with OpenMP support at runtime. The generated code will not be more
  efficient than hand-written OpenMP code, but it is easy to debug with a
  host-side debugger. The backend may also be used to develop and test new code
  when other backends are not available.
* Allow `VEX_CONSTANT`s to be cast to their values in host code, so that a
  constant defined with `VEX_CONSTANT(name, expr)` can be used in host code
  as `name`. Constants are still usable in vector expressions as `name()`.
* Allow passing generated kernel args for each GPU (#202).
  Kernel args packed into std::vector will be unpacked and passed
  to the generated kernels on respective devices.
* Reimplemented `vex::SpMat` as `vex::sparse::ell`, `vex::sparse::crs`,
  `vex::sparse::matrix` (automatically chooses one of the two formats based on
  the current compute device), and `vex::sparse::distributed<format>` (this one
  may span several compute devices). The new matrix-vector products are now
  normal vector expressions, while the old `vex::SpMat` could only be used in
  additive expressions. The old implementation is still available.
  `vex::sparse::ell` is now converted from host-side CRS format on the compute
  device, which makes the conversion faster.
* Bug fixes and minor improvements
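
To make the CRS and ELL formats mentioned above concrete, here is a minimal host-side sketch in plain C++. It is not VexCL code: the `Crs` and `Ell` structs and the `crs_to_ell` and `spmv` functions are hypothetical names, the padding convention (`-1` column index) is an assumption, and the conversion runs sequentially on the host, whereas the release note says VexCL performs it on the compute device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side CRS (compressed row storage) matrix. Hypothetical helper type.
struct Crs {
    std::size_t rows;
    std::vector<std::size_t> ptr; // rows + 1 offsets into col and val
    std::vector<int>         col; // column index of each nonzero
    std::vector<double>      val; // value of each nonzero
};

// Plain ELL matrix: every row is padded to `width` entries, and entry k of
// row i is stored at index k * rows + i (column-major), so that neighbouring
// rows occupy neighbouring addresses. Hypothetical helper type.
struct Ell {
    std::size_t rows, width;
    std::vector<int>    col; // -1 marks a padding slot (assumed convention)
    std::vector<double> val;
};

// Convert CRS to ELL: the ELL width is the longest row of the CRS matrix.
Ell crs_to_ell(const Crs &A) {
    assert(A.ptr.size() == A.rows + 1);

    std::size_t width = 0;
    for (std::size_t i = 0; i < A.rows; ++i)
        width = std::max(width, A.ptr[i + 1] - A.ptr[i]);

    Ell E{A.rows, width,
          std::vector<int>(A.rows * width, -1),
          std::vector<double>(A.rows * width, 0.0)};

    for (std::size_t i = 0; i < A.rows; ++i) {
        std::size_t k = 0;
        for (std::size_t j = A.ptr[i]; j < A.ptr[i + 1]; ++j, ++k) {
            E.col[k * A.rows + i] = A.col[j];
            E.val[k * A.rows + i] = A.val[j];
        }
    }
    return E;
}

// y = E * x, skipping padding slots.
std::vector<double> spmv(const Ell &E, const std::vector<double> &x) {
    std::vector<double> y(E.rows, 0.0);
    for (std::size_t k = 0; k < E.width; ++k) {
        for (std::size_t i = 0; i < E.rows; ++i) {
            int c = E.col[k * E.rows + i];
            if (c >= 0) y[i] += E.val[k * E.rows + i] * x[c];
        }
    }
    return y;
}
```

The fixed row width is what makes ELL attractive on GPUs: each row can be handled by one thread with a regular access pattern, at the cost of padding when row lengths vary a lot.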

1.3.3

* Added vex::tensordot() operation.

  Given two tensors (arrays of dimension greater than or equal to one), A and
  B, and a list of axis pairs (where each pair names the corresponding
  axes of the two tensors), sums the products of A's and B's elements over the
  given axes. Inspired by Python's numpy.tensordot operation.
* Expose constant memory space in OpenCL backend.
* Provide shortcut filters vex::Filter::{CPU,GPU,Accelerator} for OpenCL backend.
* Added Boost.Compute backend. Core functionality of the Boost.Compute library
  is used as a replacement for the Khronos C++ API, which seems to be becoming
  increasingly outdated. The Boost.Compute backend is still based on OpenCL, so
  there are two OpenCL backends now. Define VEXCL_BACKEND_COMPUTE to use this
  backend and make sure the Boost.Compute headers are in the include path.
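
The tensordot contraction described above can be illustrated with a sequential reference in plain C++. This is only a sketch of the semantics (following the numpy convention that the result keeps the free axes of A followed by the free axes of B); it does not use or reproduce the `vex::tensordot()` API, and `Tensor`, `AxisPairs`, and the helper functions are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Dense tensor stored in row-major order. Hypothetical helper type.
struct Tensor {
    std::vector<std::size_t> shape;
    std::vector<double>      data;
};

// Each pair is (axis of A, axis of B) to be summed over.
using AxisPairs = std::vector<std::pair<std::size_t, std::size_t>>;

// Row-major strides for a given shape.
static std::vector<std::size_t> strides(const std::vector<std::size_t> &shape) {
    std::vector<std::size_t> s(shape.size(), 1);
    for (std::size_t d = shape.size(); d-- > 1;)
        s[d - 1] = s[d] * shape[d];
    return s;
}

// Axes that are not contracted; `first` selects A's side of each pair.
static std::vector<std::size_t> free_axes(std::size_t ndim,
                                          const AxisPairs &axes, bool first) {
    std::vector<std::size_t> out;
    for (std::size_t d = 0; d < ndim; ++d) {
        bool contracted = false;
        for (const auto &p : axes)
            if ((first ? p.first : p.second) == d) contracted = true;
        if (!contracted) out.push_back(d);
    }
    return out;
}

Tensor tensordot(const Tensor &A, const Tensor &B, const AxisPairs &axes) {
    std::size_t nK = 1; // number of summed index combinations
    for (const auto &p : axes) {
        assert(A.shape[p.first] == B.shape[p.second]);
        nK *= A.shape[p.first];
    }
    const auto sA = strides(A.shape), sB = strides(B.shape);
    const auto fA = free_axes(A.shape.size(), axes, true);
    const auto fB = free_axes(B.shape.size(), axes, false);

    Tensor C;
    std::size_t nC = 1;
    for (auto d : fA) { C.shape.push_back(A.shape[d]); nC *= A.shape[d]; }
    for (auto d : fB) { C.shape.push_back(B.shape[d]); nC *= B.shape[d]; }
    C.data.assign(nC, 0.0);

    for (std::size_t c = 0; c < nC; ++c) {
        // Decode the flat result index into offsets along the free axes.
        std::size_t rem = c, offA = 0, offB = 0;
        for (std::size_t i = fB.size(); i-- > 0;) {
            offB += (rem % B.shape[fB[i]]) * sB[fB[i]];
            rem /= B.shape[fB[i]];
        }
        for (std::size_t i = fA.size(); i-- > 0;) {
            offA += (rem % A.shape[fA[i]]) * sA[fA[i]];
            rem /= A.shape[fA[i]];
        }
        // Sum the products over all combinations of contracted indices.
        for (std::size_t k = 0; k < nK; ++k) {
            std::size_t r = k, ia = offA, ib = offB;
            for (const auto &p : axes) {
                std::size_t idx = r % A.shape[p.first];
                r /= A.shape[p.first];
                ia += idx * sA[p.first];
                ib += idx * sB[p.second];
            }
            C.data[c] += A.data[ia] * B.data[ib];
        }
    }
    return C;
}
```

With a 2x3 tensor A, a 3x2 tensor B, and the single axis pair (1, 0), this reduces to an ordinary matrix product; contracting both axes yields a one-element result with an empty shape.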

1.3.2

* Improved thread safety
* Implemented any_of and all_of primitives
* Minor bugfixes and improvements

1.3.1

* Adopted scan_by_key algorithm from HSA-Libraries/Bolt
* Minor improvements and bug fixes
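
For readers unfamiliar with scan by key, the following sequential reference in plain C++ shows what an inclusive variant computes: a running sum that restarts whenever the key differs from the previous element's key. It is only an illustration of the semantics, assuming addition as the operation; `inclusive_scan_by_key_ref` is a hypothetical name, and neither the VexCL interface nor the parallel Bolt algorithm is reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference for an inclusive scan by key (hypothetical helper).
// Consecutive elements with equal keys form a segment; the running sum
// restarts at the first element of every segment.
template <class Key, class Val>
std::vector<Val> inclusive_scan_by_key_ref(const std::vector<Key> &keys,
                                           const std::vector<Val> &vals) {
    assert(keys.size() == vals.size());
    std::vector<Val> out(vals.size());
    for (std::size_t i = 0; i < vals.size(); ++i) {
        bool same_segment = i > 0 && keys[i] == keys[i - 1];
        out[i] = same_segment ? out[i - 1] + vals[i] : vals[i];
    }
    return out;
}
```

For keys `0 0 0 1 1 2` and values `1 2 3 4 5 6`, the result is `1 3 6 4 9 6`.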

1.3.0

* API breaking change: `vex::purge_kernel_caches()` family of functions is
  renamed to `vex::purge_caches()` as the online cache now may hold objects of
  arbitrary type. The overloads that used to take
  `vex::backend::kernel_cache_key` now take `const vex::backend::command_queue&`.
* The online cache is now purged whenever `vex::Context` is destroyed. This
  allows for clean release of OpenCL/CUDA contexts.
* Code for random number generators has been unified between OpenCL and CUDA
  backends.
* Fast Fourier Transform is now supported both for OpenCL and CUDA backends.
* `vex::backend::kernel` constructor now takes an optional parameter with
  command line options.
* Performance of CLOGS algorithms has been improved.
* VEX_BUILTIN_FUNCTION macro has been made public.
* Minor bug fixes and improvements.

1.2.0

* API breaking change: changed the definition of VEX_FUNCTION family of macros.
  The previous versions are available as VEX_FUNCTION_V1.
* Wrapping code for the [clogs](http://clogs.sourceforge.net) library was added
  by @bmerry (the author of clogs).
* vector/multivector iterators are now standard-conforming iterators.
* Other minor improvements and bug fixes.

1.1.2

Fixed compilation with Visual Studio