Changed the dataset and transform times reported by datamonitor to the total time for a batch, so that they are measured on the same basis as collator time and ipc time.
MegEngine Lite
Bug Fixes
Documentation
Fixed the inconsistency between the documentation and the implementation of the get_elem_size method in lite.
MegEngine
Highlights
Added support for training and inference on Cambricon MLU series AI chips.
Known Issues
When dumping with CD4 + FP16 enabled, the clip-related graph optimization behaves abnormally. Bugs related to the MIN op cause dump errors. A fix is expected in the next release (MegBrain v8.20.4).
Bug Fixes
Third-party hardware
Fixed the ROCm compilation failure.
Fixed an issue where the checksum_kernel_union4 kernel could not be found on Cambricon 590.
Common components
Fixed the bug that the reshape operator does not support int64 shape input in trace mode.
Fixed the incorrect workspace calculation of the tile operator.
Fixed the issue where the seg transformer model cannot be dumped due to NHWCD4 optimization pass processing errors.
Fixed an issue with the pinned megfile version dependency.
Fixed an error raised when the module_stats function computes the parameter count and computation amount of a traced_module model.
Improved the error messages for asynchronous execution errors so that they point users to ways of locating the issue further.
Provide more error information before throwing an exception when an error occurs during graph execution.
Fix the compilation error caused by the missing header file "limits".
Release process
Fixed the megbrain CUDA backend failing when compiled without the MGE_WITH_CUSTOM_OP compilation option.
XLA
Fixed unstable CUDA memory usage in XLA.
Fixed indexing issues in XLA.
Fixed XLA being unable to trace GradManager callbacks.
Fixed XLA being unable to trace modules that use property decorators.
CUDA
Temporarily disabled two algorithms that call cudnn-v8 (AlgoCUDNNConvV8, AlgoCUDNNConvBiasActivationV8) to fix mismatched calculation results.
Added official support for CUDA 11.8.
Documentation
Fixed the missing mgb.so in megengine.
New Features
Python API
Added the einsum operator.
Added the exponential operator.
Added support for multinomial distribution sampling.
Add Remap module.
Add GaussianBlur module.
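The note above does not give the call signature of the new einsum operator, so the sketch below does not use the MegEngine API. It only illustrates the Einstein summation convention that einsum operators generally follow, using the expression "ij,jk->ik" and plain Python lists. The function name matmul_einsum is hypothetical and exists only for this illustration.

```python
def matmul_einsum(a, b):
    """Evaluate the einsum expression "ij,jk->ik" with explicit loops.

    Hypothetical helper for illustration only; it is not part of MegEngine.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(cols):
            # Index j appears in both inputs but not in the output,
            # so the products are summed over j.
            out[i][k] = sum(a[i][j] * b[j][k] for j in range(inner))
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(matmul_einsum(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

Indices that appear on the left of "->" but not on the right are summed over, while the remaining indices define the output shape. Whether the MegEngine operator accepts the same subscript strings should be checked against its API reference.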
Third-party hardware
The Cambricon platform now supports Neuware version 1.13.0.
Support Cambricon training and inference.
Common components
Add the dilate operator.
Fix memory leak issues in OHOS thread local storage.
XLA
Add fake quant and tqt operators to the xla backend.
Changed the dataset and transform times reported by datamonitor to the total time for a batch, so that they are measured on the same basis as collator time and ipc time.
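As a rough illustration of what reporting "the total time of a batch" means, the sketch below sums per-sample timings into per-batch totals. It assumes per-sample dataset and transform timings are available as dictionaries. The function name batch_totals and the keys are hypothetical and are not taken from the datamonitor implementation.

```python
def batch_totals(per_sample_times):
    """Sum per-sample dataset and transform times (seconds) over one batch.

    Hypothetical helper; the record layout is an assumption for illustration.
    """
    return {
        "dataset_time": sum(t["dataset"] for t in per_sample_times),
        "transform_time": sum(t["transform"] for t in per_sample_times),
    }


# A batch of four samples with made-up timings.
batch = [{"dataset": 0.25, "transform": 0.5} for _ in range(4)]
print(batch_totals(batch))  # {'dataset_time': 1.0, 'transform_time': 2.0}
```

Because collator time and ipc time are incurred once per batch, per-batch totals for dataset and transform can be compared with them directly.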
MegEngine Lite
Bug Fixes
Documentation
Fix the inconsistency between the documentation and implementation of the get_elem_size method in lite.