Releases · intel/DML

26 Sep 23:04

abdelrahim-hentabli

v1.2.0

f59ed47

Intel DML v1.2.0

Functionality

Introduced a new internal submission mechanism for platforms based on Linux* OS kernel versions where MMAP is no longer permitted. For more details, refer to the Intel Security Advisory. When MMAP is unavailable, the write system call is used instead. This may add overhead for small data sizes (below 16 KB), resulting in slightly higher latency and lower throughput.
Changed the default behavior of the DML device search mechanism. On platforms with Sub-NUMA Clustering configured such that not all NUMA nodes have an accelerator instance, the library can now use any DSA instance from the same socket for execution. If finer-grained control is needed, the Low-Level API can select devices from a specific NUMA node through the numa_id field in the job structure.
Introduced a new Low-Level API function dml_batch_get_crc() which retrieves the resulting CRC from a CRC operation.
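
The device search behavior described above can be pictured with a small model. This is only a sketch under stated assumptions, not the library's implementation: the Device class, the select_device function, and the device names are hypothetical, and the sketch simply takes the first matching device, which the notes do not specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Device:
    name: str        # hypothetical label, e.g. "dsa0"
    numa_node: int   # NUMA node the device is attached to
    socket: int      # CPU socket that owns that NUMA node


def select_device(devices: Sequence[Device], thread_socket: int,
                  numa_id: Optional[int] = None) -> Optional[Device]:
    """Pick a device for a submitting thread (illustrative model only)."""
    if numa_id is not None:
        # Explicit request: only a device on exactly that NUMA node qualifies.
        return next((d for d in devices if d.numa_node == numa_id), None)
    # Default: any device on the submitting thread's socket is acceptable.
    return next((d for d in devices if d.socket == thread_socket), None)


# Two sockets with Sub-NUMA Clustering: nodes 0-1 on socket 0, nodes 2-3 on socket 1.
# Only nodes 0 and 2 have an accelerator instance.
devices = [Device("dsa0", numa_node=0, socket=0),
           Device("dsa2", numa_node=2, socket=1)]

# A thread on node 1 (socket 0) can still be served by dsa0 under the default.
print(select_device(devices, thread_socket=0))
# Pinning to node 1 finds nothing, because no device lives on that node.
print(select_device(devices, thread_socket=0, numa_id=1))
```

In this model the default trades locality for availability, while an explicit numa_id keeps the stricter per-node behavior.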

Usability and Documentation

Extended the examples to use the new dml_batch_get_crc() function and to clarify the use of the CRC seed for the CRC operation.
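
To show what a CRC seed is for, here is a minimal sketch of seed chaining. It assumes a generic reflected CRC-32C register update with no built-in inversion; the notes do not state which polynomial or inversion conventions DML uses, and crc_update is a hypothetical helper, not a DML function.

```python
def crc_update(seed: int, data: bytes) -> int:
    """Advance a reflected CRC-32C register over data, starting from seed."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the reflected polynomial if it was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


message = b"hello, accelerator"
seed = 0xFFFFFFFF  # a common initial value, assumed here

# One pass over the whole buffer...
whole = crc_update(seed, message)

# ...equals two passes where the first result seeds the second.
first = crc_update(seed, message[:7])
chained = crc_update(first, message[7:])

print(whole == chained)
```

Passing the previous result as the seed is what lets a large buffer be checksummed in several smaller operations.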

Known Limitations

Intel(R) DML can be built from directly downloadable files (.tar, .tgz) only without the tests and benchmark frameworks, using the -DDML_BUILD_TESTS=OFF build option, because they require submodules that GitHub* does not include in the archives during release creation.
Delta Record operations are not currently supported on the hardware_path.
Batch operation is currently not supported on platforms based on Linux* OS kernel versions where MMAP is not permitted.
Known test failures are listed below:
- block_on_fault/apply_delta_page_fault.read/1

04 Apr 21:19

mzhukova

v1.1.2

8224bea

Intel DML v1.1.2

This is a patch release containing the following change to v1.1.1:

Bug Fixes

Fixed a possible "_FORTIFY_SOURCE redefined" build warning/error. Some GCC* builds set _FORTIFY_SOURCE internally, which could result in a DML build error.

Known Issues / Limitations

Intel DML can be built from directly downloadable files (.tar, .tgz) only without the tests and benchmark frameworks, using the -DDML_BUILD_TESTS=OFF build option, because they require submodules that GitHub does not include in the archives during release creation.
Known test failures are listed below. Some tests fail only under certain conditions, which are noted in parentheses.
- (hardware_path, auto_path) block_on_fault/apply_delta_page_fault.read/1.
- (hardware_path on DSA 2.0) dml_drain.ta_default_parameters can hang.
On the auto_path, continuation after a page fault is incorrect if the page fault occurs on a pattern boundary (for fill or compare_pattern operations): part of the pattern is used before the page fault, and the pattern is restarted from the beginning after the page fault.
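
The pattern continuation issue above can be reproduced in a few lines. This is a sketch under assumptions, not library code: fill_range is a hypothetical helper, and the 8-byte pattern, 20-byte buffer, and fault position are chosen purely for illustration.

```python
def fill_range(dst: bytearray, pattern: bytes, start: int, end: int,
               phase: int) -> None:
    """Fill dst[start:end] with pattern, beginning at offset phase in it."""
    for i in range(start, end):
        dst[i] = pattern[(phase + i - start) % len(pattern)]


pattern = b"ABCDEFGH"
size = 20
done = 11  # bytes written before the simulated page fault

# Flawed continuation: the pattern restarts from its first byte.
restarted = bytearray(size)
fill_range(restarted, pattern, 0, done, phase=0)
fill_range(restarted, pattern, done, size, phase=0)

# Continuation that carries the pattern offset across the fault.
resumed = bytearray(size)
fill_range(resumed, pattern, 0, done, phase=0)
fill_range(resumed, pattern, done, size, phase=done % len(pattern))

print(bytes(restarted))
print(bytes(resumed))
```

Only the second buffer matches an uninterrupted fill, because the continuation begins at the pattern offset where the first part stopped.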

23 Oct 17:35

abdelrahim-hentabli

v1.1.1

e0c2d3d

Intel DML v1.1.1

This is a patch release containing the following changes to v1.1.0:

Usability and Documentation

Created a Contributing Guide and Pull Request template.

Bug Fixes

Fixed incorrect Page Fault handling on automatic path.
Fixed warning when building with Clang and -Wstrict-prototypes.
Added missing <stdexcept> header that caused a build failure with Clang compiler.
Fixed incorrect job finalization in Low-Level API multi socket example.
Fixed various issues flagged by the static code analysis tool.
Fixed outdated link in README file.

Known Issues / Limitations

Intel DML can be built from directly downloadable files (.tar, .tgz) only without the tests and benchmark frameworks, using the -DDML_BUILD_TESTS=OFF build option, because they require submodules that GitHub does not include in the archives during release creation.
Known test failures are listed below. Some tests fail only under certain conditions, which are noted in parentheses.
- (hw/auto) block_on_fault/apply_delta_page_fault.read/1
On the auto path, continuation after a page fault is incorrect if the page fault occurred on a pattern boundary (for fill and compare_pattern operations): part of the pattern is used before the page fault, and the pattern is restarted from the beginning after the page fault.

Thanks to the Contributors

Release includes contributions from the project team as well as @haiyuewa.

19 Jul 17:34

abdelrahim-hentabli

v1.1.0

a7c183e

Intel DML v1.1.0

Functionality

Introduced Block on Fault support for High-Level and Low-Level APIs.
Added initial support for Intel(R) Data Streaming Accelerator 2.0.
Added Clang* compiler support for Build and Testing.

Usability and Documentation

Clarified the NUMA* support in the Quick Start section of Documentation.
Updated the Installed package structure to comply with the Linux* OS file-system hierarchy.
Extended the status codes returned on queue submission errors so that issues are easier to report and diagnose.
Updated GoogleTest* and Google* Benchmarks submodules to the latest released version.
Reworked Low-Level API examples and added an option to select the execution path.
Added a warning to the Documentation about handler lifetime and usage.

Deprecations

Deprecated .dont_invalidate_cache() method (High-Level API) and DML_FLAG_DONT_INVALIDATE_CACHE (Low-Level API) for Cache Flush operation.

Issues Fixed

Fixed various build issues with GCC* 12.
Fixed Create Delta Record on the automatic path for the case when page faults occur. Previously, on partial completion, the Create Delta Record operation was not updated before being resubmitted to the software path.
Fixed asynchronous execution using the automatic path to handle partial completion due to page fault correctly.

Known Limitations

Intel(R) DML can be built from directly downloadable files (.tar, .tgz) only without the tests and benchmark frameworks, using the -DDML_BUILD_TESTS=OFF build option, because they require submodules that GitHub* does not include in the archives during release creation.
Known test failures are listed below:
- block_on_fault/apply_delta_page_fault.read/1

30 Mar 16:24

mzhukova

v1.0.0

6086665

Intel DML v1.0.0

Functionality

Added Benchmark Framework with limited support. Refer to the Benchmark Framework Guide in the documentation for details regarding what is supported and how it can be used.
Added no-operation (no-op) support to High-Level API that can be used in Batch operation as Fence.
Added support of umonitor/umwait to Low-Level Job API (refer to dml_wait_mode_t enum).
Added DML_MIN_BATCH_SIZE macro to expose the minimum required batch size.
Added more status codes reporting for Low-Level Job API to allow reporting of all Intel DSA statuses.

Usability and Documentation

Removed the limitation that libaccel-config.so.1 must be placed in /usr/lib64/ to execute an application that uses Intel(R) DML. Users can now specify its location using the LD_LIBRARY_PATH environment variable.
Introduced the -DDML_BUILD_TESTS and -DDML_BUILD_EXAMPLES options (ON by default). -DDML_BUILD_TESTS=OFF enables you to build the library without tests from directly downloadable files (.tar, .tgz).
Improved High-Level API examples by setting the execution path based on a command-line argument instead of hardcoding to use the Software Path.
Restructured documentation and introduced general improvements and updates.

Deprecated Functionality

Removed dml_get_limits(...) service function.
Removed the EFFICIENT_WAIT build option. To enable umonitor/umwait with the Low-Level Job API, users should now set DML_WAIT_MODE_UMWAIT (refer to the dml_wait_mode_t enum).

Breaking Changes

Changed the Low-Level API functions dml_execute_job(...) and dml_wait_job(...) to include a dml_wait_mode_t wait_mode parameter.

Bug Fixes

Fixed GCC* 11 build failures caused by missing headers.
Fixed an incorrect queue submission mechanism that could lead to a segmentation fault in previous DML releases.

Known issues/limitations

Intel DML can be built from directly downloadable files (.tar, .tgz) only without the tests and benchmark frameworks, using the -DDML_BUILD_TESTS=OFF build option, because they require submodules that are not included in the archives on GitHub* during release creation.
Known test failures for Hardware Path are listed below.
- dml_cache_flush.ta_do_not_invalidate
- dmlhl_cache_flush/{2, 3}.dont_invalidate
- transfer_size/cache_flush.success/{1, 3, 5, 7, 9, 11, 13}
- alignment/cache_flush.success/{1, 3, 5, 7}
- create_delta.page_fault_{read_first, read_second, write}

21 Mar 17:29

Smirnov1gor

v0.1.9-beta

55465dd

v0.1.9-beta

Intel® DML v0.1.9-beta

Date: March 2022

Note: Release introduces a test system for the library.

Features:

Added tests for the library under the tests/ folder
Added example for multi-socket utilization of the library in the Code Samples and Examples section

22 Feb 17:57

Smirnov1gor

v0.1.8-beta

5a29563

v0.1.8-beta

Intel® DML v0.1.8-beta

Date: February 2022

Note: Release introduces the auto execution path and manual NUMA selection for C++ API as well as several page fault handling bugfixes.

Features:

Implemented the auto execution path (software fallback) for C++ API. The library tries to use hardware, but in case it is unavailable, there is a software fallback.
Added numa_id parameter for dml::execute and dml::submit functions to specify custom NUMA node id for submission. Setting a number allows the library to do cross-socket submissions.
Removed DML_HW cmake option. The library is built with HW support by default.
Added dynamic optimization dispatcher. The library checks if a necessary instruction set is supported on the system at runtime.

Bug fix:

Fixed erroneous results for Compare operations when a page fault occurred during processing.
Fixed wrong detection for the on-write page faults.

Optimizations:

Optimized reflected CRC operation.

25 Jan 20:47

Smirnov1gor

v0.1.7-beta

2bbcac3

v0.1.7-beta

Intel® DML v0.1.7-beta

Date: January 2022

Note: Release introduces initial implementation for the auto execution path, page fault handling, and manual NUMA node selection API

Features:

Implemented the auto execution path (software fallback) for C API. The library will try to use hardware, but in case it is unavailable there is a software fallback.
Added page fault handling:
- Removed usage of BlockOnFault flag
- If page fault occurred during descriptor processing:
  - For the hardware execution path an erroneous status is returned
  - For the auto execution path there is a software fallback, so the remainder of the workload is processed on CPU.
Added numa_id field for dml_job_t structure to specify custom NUMA node id for submission. Setting a number allows the library to do cross-socket submissions.
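
The page fault behavior listed above differs between the two execution paths. The following is a toy model under assumptions, not the library's code: hardware_copy stands in for an accelerator that stops at a simulated fault, and the function names and returned status strings are hypothetical.

```python
from typing import Optional


def hardware_copy(src: bytes, dst: bytearray, fault_at: Optional[int]) -> int:
    """Copy until a simulated page fault; return the number of bytes done."""
    done = len(src) if fault_at is None else min(fault_at, len(src))
    dst[:done] = src[:done]
    return done


def hardware_path_copy(src: bytes, dst: bytearray,
                       fault_at: Optional[int] = None) -> str:
    # Hardware path: a fault during processing is reported as an error.
    done = hardware_copy(src, dst, fault_at)
    return "ok" if done == len(src) else "error: page fault"


def auto_path_copy(src: bytes, dst: bytearray,
                   fault_at: Optional[int] = None) -> str:
    # Auto path: the CPU finishes whatever the accelerator did not complete.
    done = hardware_copy(src, dst, fault_at)
    if done < len(src):
        dst[done:len(src)] = src[done:]
        return "ok (software fallback)"
    return "ok"


src = bytes(range(16))

hw_dst = bytearray(len(src))
print(hardware_path_copy(src, hw_dst, fault_at=5))

auto_dst = bytearray(len(src))
print(auto_path_copy(src, auto_dst, fault_at=5), bytes(auto_dst) == src)
```

The key point of the model is that the fallback resumes from the completed byte count instead of repeating the whole operation.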

Optimizations:

Optimized CRC operation for short lengths

10 Dec 13:47

Smirnov1gor

v0.1.6-beta

64ce5ff

v0.1.6-beta

Intel® DML v0.1.6-beta

Date: December 2021

Note: Release introduces bug fixes and several minor improvements

Features:

Improved incorrect input checking
Added check for adjacent buffers for the DIF Strip operation. Status: DML_STATUS_DIF_STRIP_ADJACENT_ERROR
Reworked hardware related statuses for C API
Added new status to indicate submission failure:
- DML_STATUS_WORK_QUEUES_NOT_AVAILABLE for C API
- dml::status_code::queue_busy for C++ API
Removed LIBACCEL_3_2 cmake option. The supported version of accel-config is now 3.2 and higher
NUMA node id is now detected before each submission, so threads can safely change nodes at any time

Bug fix:

Fixed an issue where batch operation did not work for a buffer not aligned on a 64-byte boundary
Fixed an issue where the current thread's NUMA node id was deduced incorrectly
Fixed crashes when there are no available devices for the current thread's NUMA node id
Removed dependencies on C++ runtime from C API

Warnings:

As NUMA node id of the current thread is now deduced correctly, ensure that accelerators' configuration is compatible. The library does no cross-socket submissions. If there is no available device for the current NUMA node id, then an error status code is reported.

23 Nov 10:20

Smirnov1gor

v0.1.5-beta

69fbcfb

v0.1.5-beta

Intel® DML v0.1.5-beta

Date: November 2021

Note: Release introduces unification of underlying implementation for both C and C++ APIs

Features:

Added internal device selection logic to C API (the same as for C++ API)
- Selector considers submitting thread's NUMA node id
- Selector switches devices and work queues with each submission
Improved range checking for C and C++ APIs
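
One way to picture the selector behavior above is a per-node round-robin. This is a sketch under assumptions: the notes say only that the selector considers the submitting thread's NUMA node and switches devices and work queues with each submission, so the Selector class, the rotation order, and the queue names are hypothetical.

```python
from typing import Dict, List, Optional


class Selector:
    """Rotate through the work queues available on each NUMA node."""

    def __init__(self, queues_by_node: Dict[int, List[str]]) -> None:
        self._queues = queues_by_node
        # Index of the next queue to hand out, tracked per node.
        self._next = {node: 0 for node in queues_by_node}

    def pick(self, numa_node: int) -> Optional[str]:
        queues = self._queues.get(numa_node)
        if not queues:
            return None  # no device for this node
        index = self._next[numa_node]
        self._next[numa_node] = (index + 1) % len(queues)
        return queues[index]


selector = Selector({0: ["dsa0/wq0.0", "dsa0/wq0.1"], 1: ["dsa2/wq2.0"]})

# Consecutive submissions from node 0 alternate between its two queues.
picks = [selector.pick(0) for _ in range(3)]
print(picks)
# A node without any configured queue yields nothing.
print(selector.pick(3))
```

Rotating on every submission spreads load across queues without requiring the caller to track any state.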

Bug fix:

Lowered memory size requirements for job structure by ~100x.
