Releases: intel/llvm

oneAPI DPC++ Compiler dependencies

24 Oct 06:39
cbdee7a

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.

oneAPI DPC++ Compiler dependencies

21 Jun 05:25
a55b0b8
Pre-release

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.

oneAPI DPC++ Compiler dependencies

03 Apr 07:04
ec3d9ee
Pre-release

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.

oneAPI DPC++ Compiler dependencies

17 Nov 08:29
08907a3
Pre-release

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.

oneAPI DPC++ Compiler dependencies

07 Jul 07:20
663042b

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.

oneAPI DPC++ Compiler dependencies

29 Mar 07:46
cb91c23

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.

oneAPI DPC++ Compiler 2022-12

08 Feb 07:14
6977f1a

New features

SYCL Compiler

SYCL Library

  • Implemented accessor member functions swap, byte_size, max_size and
    empty. [f1f907a]
  • Implemented SYCL 2020 default accessor constructor. [04928f9]
  • Implemented SYCL 2020 accessor iterators. [5b9fd3c] [c7b1a00]
  • Changed value_type of read-only accessors to const in accordance with
    SYCL 2020. [227614c]
  • Implemented SYCL 2020 multi_ptr and address_space_cast. [8700b76]
    [483984a] [4a9e9a0]
  • Implemented SYCL 2020 has_extension free functions. [7f1a6ef]
  • Implemented SYCL 2020 aspect_selector. [c0a4a56]
  • Implemented new SYCL 2020 style FPGA selectors. [0417651]
  • Implemented SYCL 2020 default async_handler behavior. [cd93d8f]
  • Implemented SYCL 2020 is_compatible free function. [67f6bba]
  • Implemented queue shortcut functions with placeholder accessors. [5ee066e]
  • Added support for creating a kernel bundle with descendent devices of the
    passed context's members. [a782779]
  • Implemented non-blocking destruction and deferred release of memory objects
    without attached host memory. [894ce25]
  • Implemented the sycl_ext_oneapi_queue_priority
    extension. [cdb09dc]
  • Implemented the sycl_ext_oneapi_user_defined_reductions
    extension. [8311d79]
  • Implemented the sycl_ext_oneapi_queue_empty
    extension proposal. [c493295]
  • Implemented the sycl_ext_oneapi_weak_object
    extension. [d948427] [9297f63]
  • Implemented the sycl_ext_intel_cslice
    extension. The old behavior that exposed compute slices as sub-sub-devices is
    now deprecated. For compatibility purposes, it can be brought back via the
    SYCL_PI_LEVEL_ZERO_EXPOSE_CSLICE_IN_AFFINITY_PARTITIONING environment
    variable. [5995c618]
  • Implemented the sycl_ext_intel_queue_index
    extension. [d2ec964] [7179e83]
  • Implemented the sycl_ext_oneapi_memcpy2d
    extension. [516d411]
  • Implemented device ID, memory clock rate and bus width information queries
    from the sycl_ext_intel_device_info
    extension. [1d99344] [4f7787c]
  • Implemented ext::oneapi::experimental::radix_sorter from the
    sycl_ext_oneapi_group_sort
    extension proposal. [86ba180]
  • Implemented a new unified interface for the sycl_ext_oneapi_matrix
    extension for CUDA. [166bbc3]
  • Added support for sorting over sub-groups. [168767c]
  • Added C++ API wrappers for the Intel math functions ceil, floor, rint,
    sqrt, rsqrt and trunc. [1b7582b]
  • Implemented a SYCL device library for bfloat16 Intel math function
    utilities. [fc136d6]
  • Added support for range reductions with any number of reduction variables.
    [572bc50]
  • Added support for reductions with kernels accepting item. [5d5e9f4]
  • Enabled sub-group masks for 64-bit subgroups. [10d50ed]
  • Implemented the new non-experimental API for DPAS. [55bf1a0] [1e7a8ea]
  • Added 8/16-bit type support to lsc_block_load and lsc_block_store ESIMD
    API. [f9d8059]
  • Implemented atomic operation support in the ESIMD emulator. [a6a0dea]
  • Added various trivial utility functions for the half type. [b4ce7c0]
  • Added type cast functions between half and float/integer types to
    libdevice. [599b1b9]
  • Implemented the ONEAPI_DEVICE_SELECTOR environment variable, which
    supports the SYCL_DEVICE_FILTER syntax and additionally allows exposing GPU
    sub-devices as SYCL root devices and specifying negative filters.
    SYCL_DEVICE_FILTER is now deprecated. [28d0cd3] [b21e74e] [77b6f34]
    [6bd5f9c] [6aefd63]
  • Added the SYCL_PI_LEVEL_ZERO_SINGLE_ROOT_DEVICE_BUFFER_MIGRATION
    environment variable. [bd03e0d]

Documentation

Improvements

SYCL Compiler

  • Added the InferAddressSpaces pass to the SPIR/SPIR-V compilation pipeline,
    reducing the size of the generated device code. [a3ae0dd]
  • Redesigned pointer handling so that it no longer decomposes kernel argument
    types containing pointers. [3916d3b] [d55e9c2] [9b02506]
  • Kernel lambda operator is now always inlined in the device code entry point
    unless -O0 is used. [b91b732] [2359d94]
  • Improved entry point handling in the sycl-post-link tool. [53d9c7b]
  • The reqd_work_group_size attribute now works with 1, 2 or 3 operands.
    [4ff42c3]
  • Enabled using the -fcf-protection option with -fsycl. The option is applied
    only to host code compilation, and a warning is produced. [b6f61f6]
  • The Linux-based compiler driver on Windows now pulls in the sycld debug
    library when msvcrtd is specified as a dependent library. [ebf6c59]
  • Added /Zc:__cplusplus as a default option during host compilation with MSVC.
    [e7ed860]
  • Improved the ESIMDOptimizeVecArgCallConv optimization pass to cover more IR
    patterns. [4926454]
  • Added support for more types in ESIMD lsc functions. [d9e40ec]
  • Added error diagnostics for using
    sycl::ext::oneapi::experimental::annotated_arg/ptr as a nested type.
    [321c733]
  • The status of bfloat16 support was changed from experimental to supported.
    [7b47ebb]
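
For context on the bfloat16 type mentioned above: a bfloat16 value keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of an IEEE 754 binary32 float. The sketch below shows one common way to convert between the two formats on the host using round-to-nearest-even. It is an illustration only and not the DPC++ implementation; the names float_to_bf16_bits, bf16_bits_to_float and bf16_demo are hypothetical and do not come from the release notes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit IEEE 754 float");

// Hypothetical helper: returns the 16-bit bfloat16 pattern for a float,
// rounding to nearest, ties to even.
inline std::uint16_t float_to_bf16_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));  // well-defined type punning

    // NaN: truncate and force a mantissa bit so the result stays a NaN.
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }

    // Add 0x7FFF plus the lowest kept bit: this rounds to nearest and
    // resolves exact ties towards the even bfloat16 pattern.
    const std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    bits += rounding_bias;
    return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widens a bfloat16 pattern back to float.
// This direction is exact because the low 16 mantissa bits are zero.
inline float bf16_bits_to_float(std::uint16_t bf16_bits) {
    const std::uint32_t bits = static_cast<std::uint32_t>(bf16_bits) << 16;
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Usage example: call from main() to exercise the sketch.
inline void bf16_demo() {
    // 1.0f is exactly representable, so the round trip is lossless.
    assert(bf16_bits_to_float(float_to_bf16_bits(1.0f)) == 1.0f);
    // Pi loses precision: only about 3 significant decimal digits survive.
    assert(bf16_bits_to_float(float_to_bf16_bits(3.14159265f)) == 3.140625f);
}
```

The memcpy calls avoid undefined behavior from pointer casts, and the NaN branch is handled first so that the rounding addition cannot wrap around.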

SYCL Library

  • Updated online_compiler with Gen12 GPU support. [adfb1c1]
  • get_kernel_bundle and has_kernel_bundle now check that the kernels are
    compatible with the devices. [91b1515]
  • Waiting for an event associated with a kernel that uses a stream now also
    waits for the stream to be flushed. [1db0e81]
  • Added the requested device type to the message of the exception thrown when no
    such devices are found. [6b83ad7]
  • Optimized operator[] of host_accessor. [01e60f7]
  • Improved reduction performance on discrete GPUs. [99bdc82]
  • Added invoke_simd support for functions with void return type. [3fd0850]
  • The Level Zero plugin now creates every event as host-visible by default.
    [f3d245d]
  • Added Level Zero plugin support for global work sizes greater than
    UINT32_MAX as long as they are divisible by some legal work-group size and
    the resulting quotient does not exce...

oneAPI DPC++ Compiler dependencies

09 Dec 07:13
c103a6a
Pre-release

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.

oneAPI DPC++ Compiler 2022-09

21 Oct 06:23
0f579ba

New features

SYCL Compiler

  • Added ability to enforce stateless memory accesses for ESIMD. [1811162]
  • Added support for -fsycl-force-target compiler option. [1d95f2e]
  • Added support for [[intel::max_reinvocation_delay]] loop attribute. [90fa5bb]
  • Added support for -fsycl-huge-device-code compiler option, which allows
    linking object files larger than 2GB. [f963062]
  • Added support for compiling .cu files with SYCL compiler. [e76ad72]
  • Added support for assert on HIP backend. [ade1870]
  • Enabled CXX standard library functions for CUDA backend. [1fe92c5]
  • Implemented group collective built-in functions for more integral types. [d4933b6]

SYCL Library

  • Implemented SYCL 2020 callable device selectors. [64f0db7]
  • Implemented SYCL 2020 standalone device selectors. [bfc7e98]
  • Added SYCL 2020 property interfaces for local_accessor, usm_allocator,
    accessor and host_accessor classes. [1136b40] [da7dcf8]
  • Added support for fpga_simulator_selector. [9bef890]
  • Added support for local_accessor. Deprecated target::local. [e4423ef]
  • Added support for querying free device memory on Level Zero backend. [0eeef2b]
  • Added support for querying free device memory on CUDA and HIP backends. [436f0d8]
  • Implemented bfloat16 conversions from/to float for host. [2a383f1]
  • Added support for ext::oneapi::property::queue::discard_events to
    Level Zero PI plugin. [1372120]
  • Added lsc_atomic support on ESIMD emulator. [0c051a8]
  • Added dpas support on ESIMD emulator. [3d506a3]
  • Added C++ API for imf libdevice built-ins. [830916a]
  • Implemented make_queue for CUDA backend. [89460e8]
  • Implemented has_native_event and make_event for CUDA backend. [74369c8]
  • Added support for CUDA XPTI tracing. [0cd0414]
  • Introduced predicates for ESIMD lsc_block_store/load. [f44edce]
  • Added experimental set_kernel_properties API and use_double_grf property
    for ESIMD. [9a55da5]
  • Added an "eager initialization" mode to the Level Zero PI plugin. It might
    result in unnecessary work being done by the plugin, but ensures the fastest
    possible execution on hot and reportable paths. [c145959]
  • Added full support of element wise operations on joint_matrix on CUDA
    backend including bfloat16 support. [0a1d751]
  • Implemented the group::get_linear_id(int) method. [6e83c12]
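
The callable device selectors listed above follow the SYCL 2020 model in which a selector is any callable that takes a device and returns an integer score: the device with the highest score is chosen, and a device that scores negative is never chosen. The sketch below models that scoring rule in plain C++ so it can be tried without a SYCL toolchain. FakeDevice, pick_device_index, prefer_large_gpu and selector_demo are hypothetical names introduced for illustration; they are not part of SYCL or DPC++, and how a real runtime breaks ties or reports the absence of a suitable device is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for sycl::device, reduced to what the sketch needs.
struct FakeDevice {
    std::string name;
    bool is_gpu;
    unsigned compute_units;
};

// Scores every device with the callable and returns the index of the
// best-scoring one, or -1 when every device was rejected (negative score).
// On a tie the first device seen wins; that choice is an assumption.
template <typename Selector>
int pick_device_index(const std::vector<FakeDevice>& devices, Selector selector) {
    int best_index = -1;
    int best_score = -1;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        const int score = selector(devices[i]);
        if (score > best_score) {  // negative scores can never pass this test
            best_score = score;
            best_index = static_cast<int>(i);
        }
    }
    return best_index;
}

// Example selector: reject non-GPU devices, prefer GPUs with more compute units.
inline int prefer_large_gpu(const FakeDevice& device) {
    if (!device.is_gpu) {
        return -1;
    }
    return static_cast<int>(device.compute_units);
}

// Usage example: call from main() to exercise the sketch.
inline void selector_demo() {
    const std::vector<FakeDevice> devices = {
        {"cpu", false, 64}, {"gpu-small", true, 16}, {"gpu-big", true, 96}};
    assert(pick_device_index(devices, prefer_large_gpu) == 2);

    // Any callable works, including a lambda that only accepts CPUs.
    const int cpu_index = pick_device_index(
        devices, [](const FakeDevice& d) { return d.is_gpu ? -1 : 1; });
    assert(cpu_index == 0);
}
```

With a real SYCL 2020 implementation the same kind of callable would be passed directly to a queue or device constructor instead of to a helper such as pick_device_index.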

Documentation

Improvements

SYCL Library

  • Ensured that the correct errc is thrown for an unassociated placeholder
    accessor. [4f9935a]
  • Removed dependency on OpenCL ICD Loader from the runtime. [90e8b5e]
  • Added support for ZEBIN format to persistent caching mechanism. [34dcf83]
  • Added identification mechanism for binaries in newer ZEBIN format. [f4dee54]
  • Switched to use struct information descriptors in accordance with SYCL 2020.
    Removed some deprecated information queries. [b3cbda5]
  • Updated kernel_device_specific::max_sub_group_size query to match SYCL 2020
    spec. Deprecated the old variant. [7842d05]
  • Deprecated SYCL 1.2.1 device selectors. [c058380]
  • Improved error messages reported for unsupported device partitioning. [1c9ddba]
  • Made device and platform default to default_selector_v. [b32dd41]
  • Deprecated address_space::constant_space. [351b123]
  • Marked sycl::exception::has_context as noexcept. [ad923c9]
  • Improved range reductions performance on CPU. [3323da6]
  • Made sycl::exception nothrow copy constructible. [289e33d]
  • Marked has_property methods as noexcept. [417b5a2]
  • Improved sycl::event::get_profiling_info exception message when event is
    default constructed. [2e86cd4]
  • Added a diagnostic (in the form of a static_assert) about kernel lambda size
    mismatch between host and device. [d278c67] [ec179b7] [f417a88]
  • Updated pipes class to throw exceptions if used on host. [eab2969]
  • Updated ESIMD Emulator PI plugin to report support for cl_khr_fp64
    extension. [398571a]
  • Updated Level Zero plugin to prefer copy engine for memory read/write
    operations. [65c3ea2]
  • Optimized some memory transfers. [92d35cd]
  • Enabled event caching in Level Zero PI plugin. [a41b33c]
  • Optimized some reductions with parallel_for accepting sycl::range
    for discrete GPUs. [c22a5d3]
  • Improved performance of event synchronization on CUDA backend. [c4f326a]
  • Added ability to use descendent devices of context members within that
    context. Not supported with OpenCL backend yet. [a0c8c50] [78a483c]
  • Added support for querying atomic64 device capability with HIP backend. [cb190fc]
  • Enabled FTZ operations for CUDA/PTX backend via
    -fcuda-flush-denormals-to-zero. [e8e7ae8]
  • Improved error message about incorrect kernel argument types with CUDA backend. [2542e6a]
  • Limited allowed argument types for rol/ror ESIMD functions to better
    represent HW capabilities. [b05f256]
  • Implemented mem_advise reset and managed memory checks for CUDA backend. [fe18839]
  • Added concurrent memory check to mem_advise on CUDA backend. [33746d8]
  • Enabled multiple HIP streams per SYCL queue. [e0c40a9]
  • Implemented a lazy mechanism for setting the context of default-constructed events. [ed92c4c]
  • Improved performance for multi-dimensional accessors with multiple accesses
    in a kernel. [7c58b9a]

SYCL Compiler

  • Increased the maximum _BitInt size to 4096 for the FPGA target. [db5f72a] [3f06cad]
  • Removed deprecation message for [[intel::disable_loop_pipelining]] attribute. [07201f5]
  • Allowed __builtin_assume_aligned to be called from device code. [24937ea]
  • Improved link step performance when per_kernel device code split is used. [84de9d6]
  • Added support for SYCL_EXTERNAL on device_global variables. [8b958f6]
  • Updated __builtin_intel_fpga_mem to accept more parameters. [231338d]
  • Updated ivdep attribute to allow safelen = 0. [558b3ba]
  • Improved linking with sycl.lib on Windows. [404d281]
  • Implemented more diagnostics about incorrect device_global usages. [1265721]
  • Improved library resolution for libsycl.so. [4ce19d6]
  • Improved diagnostics when linking with mismatched objects. [0e0202e]
  • Added a warning for floating-point size changes after implicit conversions. [e4f5d55]
  • Made invoke_simd convert its argument to appropriate types. [038764f]

Documentation

  • Removed explicit cl namespace references. [433ea5c]
  • Added a short guideline on using CMake with SYCL compiler. [fa603c3]

Bug fixes

SYCL Library

  • Fixed a compilation issue where it wasn't possible to pass an initializer list
    for dependency events vector in queue shortcuts with offset
    parameter. [f4f83d9]
  • Fixed sycl::get_pointer_device throwing an exception when it is passed a
    descendent device (sub-device) instead of a root device. [26d5d98]
  • Fixed a memory leak that occurred when kernel bundles were linked. [980677d]
  • Fixed USM free throwing an exception when it is passed a context created for
    a descendent device. [c49d494]
  • Fixed accessor's CTAD for g++ host compiler. [57aabe7]
  • Fixed a compilation issue when using multi-dimensional accessor's subscript
    operator. [22e3fc5]
  • Fixed a "definition with the same mangled name" error that occurred when
    multiple buffer reductions were used in a kernel. [a0a4d72]
  • Fixed a compilation issue with SYCL math built-ins when GCC < 11.1 is used as
    a host compiler. [c786894]
  • Fixed a compilation issue with SYCL math built-ins (such as sycl::modf,
    for example) not accepting pointers to half. [e286166]
  • Fixed an issue with reductions when MSVC is used as the host compiler. [94c4b80]
  • Fixed a compilation issue when fully specialized sycl::span is initialized
    from an array. [2b50820]
  • Fixed a crash in...

oneAPI DPC++ Compiler dependencies

12 Aug 01:09
decc8fe
Pre-release

This release contains OpenCL RT for Intel CPU and FPGA emulator, used for oneAPI DPC++ Compiler and runtime validation.
See the runtime installation guide here.