Releases: devitocodes/devito
v4.8.8
Changes
API
Compiler
- compiler: Revamp code generation for asynchronous operations @FabioLuporini (#2376)
- compiler: Tweaks to enable decoupling on GPUs @FabioLuporini (#2385)
🐛 Bug Fixes
- API: fix printer dtype processing @mloubout (#2388)
- mpi: Avoid UnboundLocalVariable error @FabioLuporini (#2386)
Continuous Integration
Installation
- deps: support numpy 2.0 @mloubout (#2391)
- pip prod(deps): update psutil requirement from <6.0,>=5.1.0 to >=5.1.0,<7.0 @dependabot (#2389)
Full Changelog: v4.8.7...v4.8.8
v4.8.7
Changes
API
- mpi: Fix data_gather for sparse functions @mloubout (#2379)
- API: Revamp sparse subfunction @mloubout (#2374)
- api: fix corner case staggered fd for centered x0 @mloubout (#2373)
- api: cleanup FD tools and support zeroth order derivative @mloubout (#2368)
Examples
- examples: Add an example notebook for ADER-FD schemes @EdCaunt (#2338)
- examples: fix tuto numbering for doc rendering @mloubout (#2367)
Documentation
Compiler
- compiler: Misc compiler improvements @FabioLuporini (#2380)
- compiler: Support for C-level MPI_Allreduce @FabioLuporini (#2344)
- compiler: sequentialise halo touch @mloubout (#2372)
- compiler: Fix placement of ConditionalDimension in subdomain @georgebisbas (#2050)
MPI
- mpi: Fix data_gather for sparse functions @mloubout (#2379)
- ci: tweak mpi setup to allow -s and hide output @mloubout (#2350)
Architectures and JIT
🐛 Bug Fixes
- tests: Fixup conftest's set_run_reset() @FabioLuporini (#2381)
- mpi: Fix data_gather for sparse functions @mloubout (#2379)
- dsl: Correct retention of wrong branch in filter_ordered @EdCaunt (#2377)
- api: Always expand time derivatives @mloubout (#2369)
- dsl: Patch edge-case derivative specifications @EdCaunt (#2366)
- compiler: Fix placement of ConditionalDimension in subdomain @georgebisbas (#2050)
Testing
Continuous Integration
- CI: fix decoupler config to use correct python @mloubout (#2382)
- arch: Ensure compiler check catches permission errors @gbruer15 (#2340)
Installation
- reqs: Update cached_property to functools version @EdCaunt (#2359)
- docker: Drop unused mpi4 and fix nvhpc mpi4py setup @mloubout (#2365)
New Contributors
Full Changelog: v4.8.6...v4.8.7
v4.8.6
v4.8.5
Fix Manifest.in for missing pypi files
Full Changelog: v4.8.4...v4.8.5
v4.8.4
Changes
- compiler: Patch double buffering @FabioLuporini (#2247)
- compiler: Fix unexpansion w custom coeffs @FabioLuporini (#2242)
API
- api: Cleanup sparse setup for precomputed @mloubout (#2353)
- api: Add Hicks (sinc) interpolation api @mloubout (#2342)
- api: add priority to fd coefficients for mixed derivatives @mloubout (#2331)
- api: add support for 45 degree FD approx @mloubout (#2326)
- api: Fix custom fd for staggered @mloubout (#2323)
- api: Fix gpu-fit for TensorFunction @mloubout (#2285)
- misc: Switch off develop-mode @FabioLuporini (#2280)
- api: Minor fixes to arithmetic operations with scalar and tensors @mloubout (#2276)
- misc: Process args for subdimensions @mloubout (#2266)
- api: Add shift and fd order option to all FD operators @mloubout (#2258)
- api: Always make subsampling factor symbolic @mloubout (#2259)
- api: prevent derivative shortcut with incompatible fd order @mloubout (#2254)
Examples
- examples: Interpolation tutorial notebook @mloubout (#2252)
- examples: Update MPI notebook to reference inner and outer halo terminology @EdCaunt (#2319)
- Correct the Poisson equation in the cavity flow example @rafael-fuente (#2308)
- example: small cleanup of tti for easier reuse @mloubout (#2294)
Documentation
- misc: Add MPI0 logging level @georgebisbas (#2130)
- examples: Fix typo in tutorial numbering @EdCaunt (#2356)
- misc: Docstring updates @ZoeLeibowitz (#2223)
Compiler
- compiler: Tweak check_stability to ensure cleanup is performed @FabioLuporini (#2335)
- compiler: Patch Guards.simplify_and @FabioLuporini (#2334)
- compiler: Enable generation of templated function declarations @FabioLuporini (#2333)
- compiler: Add optional pass for runtime stability check @FabioLuporini (#2327)
- compiler: Tweak Weights.value @FabioLuporini (#2328)
- compiler: Add Weights.value utility @FabioLuporini (#2322)
- compiler: Revamp lowering of IndexDerivatives @FabioLuporini (#2310)
- compiler: Revamp linearization @FabioLuporini (#2317)
- compiler: Adjust names used for cire-rotate dimensions @EdCaunt (#2305)
- compiler: Optimize normalize_reductions_dense @FabioLuporini (#2311)
- compiler: Generate less integer arithmetic @FabioLuporini (#2301)
- compiler: Misc codegen enhancements @FabioLuporini (#2300)
- compiler: Revamp data alignment @FabioLuporini (#2296)
- compiler: Improve IndexDerivative lowering @FabioLuporini (#2288)
- compiler: Misc code generation improvements @FabioLuporini (#2282)
- compiler: Fix handling of redundant derivatives @FabioLuporini (#2284)
- compiler: Introduce cluster-level Temp @georgebisbas (#2281)
- compiler: Add pass to abridge SubDimension names where possible @EdCaunt (#2269)
- compiler: Improve quality of generated code @FabioLuporini (#2263)
- compiler: Add missing numpy dtypes @mloubout (#2271)
- compiler: Machinery to generate vector types @FabioLuporini (#2253)
- compiler: Introduce symbolic fencing @FabioLuporini (#2244)
- compiler: Improve robustness of dspace derivation @FabioLuporini (#2238)
MPI
- misc: Add MPI0 logging level @georgebisbas (#2130)
- CI: revamp parallel marker @mloubout (#2347)
- mpi: Generate deterministic code for overlap mode @georgebisbas (#2303)
- MPI: Fix sparse subfunction handling when used without parent @mloubout (#2278)
- mpi: Fix haloupdate with inner dim [v2] @FabioLuporini (#2272)
- mpi: Add utility to get number of ranks on a single node @mloubout (#2265)
- dsl: Patch domain decomposition bug with SubDomains @EdCaunt (#2246)
Architectures and JIT
- Use get_nvidia_cc to get Nvidia gpu architecture @ggorman (#2343)
- arch: Add denormal flag for clang @mloubout (#2304)
- arch: patch compiler version @mloubout (#2297)
- example: small cleanup of tti for easier reuse @mloubout (#2294)
- arch: support rocm for gpu info @mloubout (#2261)
- compiler: add extra platforms and language to the custom compiler @mloubout (#2255)
- arch: Intel PVC mapping @FabioLuporini (#2215)
🐛 Bug Fixes
- compiler: Make code gen of elementary funcs dtype-aware @FabioLuporini (#2349)
- compiler: Tweak device-aware blocking @FabioLuporini (#2348)
- compiler: Hotfix unevaluation.Pow(1, ...) @FabioLuporini (#2321)
- compiler: Fix min/max reductions to be backend-portable @FabioLuporini (#2315)
- misc: Use str for generalization @mloubout (#2313)
- compiler: Block reductions irrespective of par-tile @FabioLuporini (#2309)
- compiler: Fix space conditions with loop blocking @FabioLuporini (#2302)
- data: Prevent allocator info to be lost at finalize @mloubout (#2295)
- misc: Fix gpu-fit for multiple tensors @mloubout (#2286)
- compiler: Fix minor codegen issues after pickling @FabioLuporini (#2283)
- misc: Replace dimension check in pull_dims @EdCaunt (#2275)
- misc: fix short/ushort codegen @mloubout (#2274)
- mpi: Fix haloupdate with inner dim [v2] @FabioLuporini (#2272)
- misc: fix UnboundTuple for None partile @mloubout (#2256)
- compiler: Hotfix compare-ops @FabioLuporini (#2251)
- compiler: Patch compare_ops for IndexDerivatives @FabioLuporini (#2250)
- dsl: Patch domain decomposition bug with SubDomains @EdCaunt (#2246)
- compiler: Patch symbolic coefficients over cross derivatives @FabioLuporini (#2248)
- compiler: Patch custom coefficients @FabioLuporini (#2243)
Testing
Continuous Integration
- docker: fix oneapi setup @mloubout (#2351)
- ci: Update actions for nodejs version deprecation @georgebisbas (#2312)
- deps: Update rocm version @mloubout (#2291)
- compiler: Check DeviceFunctions for SubDimensions @EdCaunt (#2279)
Installation
- pip prod(deps): update ipyparallel requirement from <8.8 to <8.9 @dependabot (#2346)
- pip prod(deps): bump pyrevolve from 2.2.3 to 2.2.4 @dependabot (#2337)
- pip prod(deps): update ipyparallel requirement from <8.7 to <8.8 @dependabot (#2324)
- deps: prevent codecov error on local docker @mloubout (#2318)
- pip prod(deps): update pytest requirement from <8.0,>=7.2 to >=7.2,<9.0 @dependabot (#2299)
- deps: Update rocm version @mloubout (#2291)
- deps: support python 3.12 @mloubout (#2270)
- pip prod(deps): update anytree requirement from <=2.12.0,>=2.4.3 to >=2.4.3,<=2.12.1 @dependabot (#2268)
- pip prod(deps): update anytree requirement from <=2.11.1,>=2.4.3 to >=2.4.3,<=2.12.0 @dependabot (#2249)
- pip prod(deps): update anytree requirement from <=2.10.0,>=2.4.3 to >=2.4.3,<=2.11.1 @dependabot (#2241)
New Contributors
- @rafael-fuente made their first contribution in #2308
- @ZoeLeibowitz made their first contribution in #2223
Full Changelog: v4.8.3...v4.8.4
v4.8.3
Changes
- Andrew add tests to test operator 2194 @AndrewCheng827 (#2207)
API
- api: enforce interpolation radius to be smaller than any input space … @mloubout (#2234)
- api: cleanup SubDimension and SubDomain @mloubout (#2219)
- misc: various Dimension internal fixes @mloubout (#2205)
- api: Cleanup and improve SubFunction @mloubout (#2198)
- builtins: Support batched initialize_function @FabioLuporini (#2176)
- api: Revamp interpolation/injection @FabioLuporini (#2128)
Documentation
- Update FAQ.md @FabioLuporini (#2195)
- documentation: update ci for new website setup @mloubout (#2221)
Compiler
- compiler: prevent multisubdimension expressions duplicates @mloubout (#2230)
- compiler: prevent reduction clause for perfect-enough outer loops @mloubout (#2226)
- compiler: Rework multi-level buffering @FabioLuporini (#2225)
- compiler: Fix issue 2194 @AndrewCheng827 (#2212)
- compiler: prevent radius dependent temps for sparse operations @mloubout (#2216)
- api: Cleanup and improve SubFunction @mloubout (#2198)
- api: Revamp interpolation/injection @FabioLuporini (#2128)
🐛 Bug Fixes
- compiler: Patch cluster.is_sparse @FabioLuporini (#2232)
- compiler: Fix reduction over sparse only @mloubout (#2220)
- compiler: prevent temporary for local reductions @mloubout (#2218)
- compiler: prevent radius dependent temps for sparse operations @mloubout (#2216)
- compiler: fix arg processing for empty arg update @mloubout (#2213)
- misc: various Dimension internal fixes @mloubout (#2205)
Continuous Integration
Installation
- pip prod(deps): update anytree requirement from <=2.9.0,>=2.4.3 to >=2.4.3,<=2.10.0 @dependabot (#2233)
- deps: fix intel drivers @mloubout (#2228)
- ci: add intel missing gpu drivers @mloubout (#2227)
Full Changelog: v4.8.2...v4.8.3
v4.8.2
API
- dsl: Removed dynamic classes for AbstractFunctions (fixes memory leaks seen by some users) @FabioLuporini (#2190)
- api: Fix symbolic coefficients for cross derivatives @mloubout (#2185)
- api: Allow parametric par-tile as input @FabioLuporini (#2168)
- api: Use subs for origin in the case where index is a function @mloubout (#2120)
- compiler: Add utility function to normalize sympy arguments @FabioLuporini (#2125)
Examples
Compiler
- compiler: Keep -qopenmp by default after icx 2023.2 @georgebisbas (#2164)
- compiler: Add opkwargs property to ArgumentsMap @ccuetom (#2142)
- compiler: Misc compiler fixes and improvements -- part II @FabioLuporini (#2138)
- compiler: Pass operator arguments to downstream operators @ccuetom (#2139)
- compiler: Improve lowering of IndexDerivatives @FabioLuporini (#2112)
- compiler: Misc compiler tweaks and improvements @FabioLuporini (#2136)
- compiler: Avoid generating collapse(1) @FabioLuporini (#2129)
- compiler: Patch pickling of GuardFactor and reconstruction @FabioLuporini (#2126)
- compiler: Introduce gpu-create parameter for buffers initialized on device @ccuetom (#2107)
- compiler: Change tile use in DeviceAccizer to allow multiple tile sizes @gpc1064 (#2095)
- compiler: Add host-*-pin handles; more volatile with pthreads @FabioLuporini (#2116)
- compiler: Support template parameters @FabioLuporini (#2105)
MPI
- mpi: Instrument compute0 core after specialising as ComputeCall @georgebisbas (#2143)
- mpi: Enhance flexibility for custom topologies @georgebisbas (#2134)
- mpi: Packed gathers and scatters @FabioLuporini (#2109)
GPU
Architectures and JIT
- compiler: Enable AVX512 compiler support when available. @ggorman (#2184)
- arch: Correct march to mcpu for ppc @raminammour (#2174)
- misc: Add deviceid to configuration and enhance switchconfig @ccuetom (#2175)
- arch: Add ICX support @georgebisbas (#2051)
🐛 Bug Fixes
- dsl: Removed dynamic classes for AbstractFunctions @FabioLuporini (#2190)
- dsl: Prevent aggregation for symbolic coefficients @mloubout (#2182)
- api: Prevent factorization for symbolic coefficients @mloubout (#2179)
- compiler: Prevent Eq dims to be lost if only implicit @mloubout (#2169)
- compiler: Fix non-arithmetic distances @mloubout (#2165)
- compiler: Prevent adding breaking guard to nokey @mloubout (#2160)
- compiler: Add guards to prevent OOB when streaming buffers with ConditionalDimension @ccuetom (#2150)
- compiler: Fix CondDim's factor auto-override @FabioLuporini (#2154)
- compiler: Fix pickling of aliasing SparseFunction @FabioLuporini (#2148)
- compiler: Revert "compiler: Relax WaitLock regions in a ScheduleTree" @FabioLuporini (#2141)
- compiler: Patch pickling of GuardFactor and reconstruction @FabioLuporini (#2126)
- compiler: Fix OpenMP reductions in tandem with linearize=True @FabioLuporini (#2117)
Testing
- misc: fix openmp= deprecation @mloubout (#2186)
- Removing AWS ondemand gh-runners from CI @ggorman (#2155)
Continuous Integration
- CI: pytest setup fix @mloubout (#2177)
- docker: add some tweaks to Nvidia docker @mloubout (#2171)
- CI: Fix asv setup @mloubout (#2167)
- CI: Fix asv devito install @mloubout (#2166)
- ci: Add python 3.11 and minor CI fixing @georgebisbas (#2158)
- docker: revamp base deployment @mloubout (#2162)
- ci: switch to concurrency settings rather than extra action @mloubout (#2119)
Installation
- docker: Add intel advisor to icx image @mloubout (#2180)
- docker: Add some tweaks to nvidia docker @mloubout (#2171)
- docker: Switch to intelpython for icc/icx build @mloubout (#2172)
- docker: Revamp base deployment @mloubout (#2162)
- pip prod(deps): update distributed requirement from <2023.7 to <2023.8 @dependabot (#2161)
- pip prod(deps): update anytree requirement from <=2.8,>=2.4.3 to >=2.4.3,<=2.9.0 @dependabot (#2152)
- pip prod(deps): update distributed requirement from <2023.6 to <2023.7 @dependabot (#2145)
- pip prod(deps): update distributed requirement from <2023.5 to <2023.6 @dependabot (#2127)
- deps: sympy 1.12 compat @mloubout (#2123)
- pip prod(deps): update distributed requirement from <2023.4 to <2023.5 @dependabot (#2118)
- reqs: Move pyrevolve to optionals and introduce testing-only reqs @georgebisbas (#2096)
- reqs: Fix for matplotlib >=3.6.3 @georgebisbas (#2047)
- install: Make mpi4py portable across Intel and AMD @FabioLuporini (#2115)
- install: Overhaul Dockerfile.amd for MPI support @FabioLuporini (#2104)
- pip prod(deps): update distributed requirement from <2023.4 to <2023.5 @dependabot (#2110)
- pip prod(deps): update ipyparallel requirement from <8.6 to <8.7 @dependabot (#2106)
New Contributors
- @gpc1064 made their first contribution in #2095
- @raminammour made their first contribution in #2174
Full Changelog: v4.8.1...v4.8.2
v4.8.1
Changes
Examples
Compiler
- compiler: Revamp compilation of halo exchanges @FabioLuporini (#2089)
- compiler: Do block with partial data reuse @FabioLuporini (#2094)
- compiler: Relax _mark_overlappable @FabioLuporini (#2069)
Architectures and JIT
- compiler: only use offloading flags for amd gpu not cpu @mloubout (#2087)
- arch: Support aws graviton @mloubout (#2080)
🐛 Bug Fixes
- compiler: Patch loop nests a-la PrecomputedInterpolation @FabioLuporini (#2102)
- compiler: Prevent topofusion of homogeneous sync Clusters @FabioLuporini (#2099)
- compiler: Patch local-SubDim heuristic for blocking @FabioLuporini (#2093)
- compiler: Fix and improve MIN/MAX codegen @FabioLuporini (#2076)
- symbolics: evaluate transpose on elements of tensors in case of deriv… @mloubout (#2072)
- compiler: Handle IndexedBase DDA across Jumps/Barriers @FabioLuporini (#2070)
Testing
- tests: Fix warnings @georgebisbas (#2079)
Continuous Integration
- ci: make aws ci parallel @mloubout (#2085)
- tests: Fix warnings @georgebisbas (#2079)
- arch: Support aws graviton @mloubout (#2080)
Installation
- pip prod(deps): update ipyparallel requirement from <8.5 to <8.6 @dependabot (#2090)
- docker: Updating AMD dockerfile to use Ubuntu 22.04. @ggorman (#2084)
- pip prod(deps): update distributed requirement from <2023.3 to <2023.4 @dependabot (#2074)
v4.8.0
Changes
misc
API
- symbolics: use devito floor instead of Undefined Function @mloubout (#2052)
- api: Add dimension-wise summing builtin and tests @mloubout (#1989)
Examples
- Examples: Add Darcy flow example @sashaowen (#1998)
- examples: Add shallow water equations notebook @AtilaSaraiva (#1867)
- examples: invoke tti example with --constant argument @ofmla (#1914)
Documentation
- docs: Update compiler summary image @georgebisbas (#2037)
- misc: Update docker/README.md @FabioLuporini (#1972)
- Update FAQ.md @FabioLuporini (#2010)
- misc: Add FAQ page (lifted from the wiki) @FabioLuporini (#1976)
Compiler
- compiler: Patch data dependencies across Jumps @FabioLuporini (#2065)
- compiler: Implement graceful lowering of derivatives (aka "unexpansion") @FabioLuporini (#2060)
- compiler: Switch from aomp to clang for amd @mloubout (#2058)
- compiler: Extensions for parlang backends @FabioLuporini (#2042)
- compiler: Introduce int32 mode @FabioLuporini (#2041)
- compiler: Support shared memory in parlang backends @FabioLuporini (#2025)
- compiler: support for HPE Cray Clang compiler @georgebisbas (#2029)
- compiler: Better blocking heuristics and revamped linearization @FabioLuporini (#2020)
- compiler: Further misc improvements @FabioLuporini (#2012)
- compiler: Misc refactorings towards serialization support @FabioLuporini (#2009)
- Misc code generation improvements @FabioLuporini (#2001)
- compiler: Misc code generation fixes @FabioLuporini (#1994)
- compiler: Misc tweaks for backend-portable code generation @FabioLuporini (#1984)
MPI
- return slice(0,-1) for glb_slices if glb_numb empty on an mpi rank @deckerla (#2004)
- mpi: Fix data distribution bugs [part 2] @rhodrin (#1949)
- compiler: Fix MPI mode diag2 does not need a MPIRegion @mmohrhard (#1992)
Architectures and JIT
- arch: Add gcc 12 into legal configurations @ziyiyin97 (#2027)
🐛 Bug Fixes
- compiler: Add cluster guard to AliasKey for safety @mloubout (#2045)
- dependencies: sympy 1.11 compatibility @mloubout (#2005)
- ci: Add Arm skip option to tests @mloubout (#2035)
- compiler: Ensure order invariance of candidates in ReducerMap.unique @ccuetom (#2033)
- compiler: Fix subdim argument mismatch @mloubout (#2019)
- symbolics: Fix absolute value warning for integer input @mloubout (#2018)
- types: Minor fixes to sparse function indices and implicit dims @mloubout (#2011)
- compiler: Patch placement of ConditionalDimension with multi-Dimension conditions @FabioLuporini (#2008)
- compiler: Prevent reordering of existing temps in CSE @mloubout (#2002)
- return slice(0,-1) for glb_slices if glb_numb empty on an mpi rank @deckerla (#2004)
- mpi: Fix data distribution bugs [part 2] @rhodrin (#1949)
- dsl: Patched cross-derivative fd_order bug @EdCaunt (#1988)
- compiler: Check jit_dir existence when saving @GlassOfWhiskey (#1983)
Testing
- ci: Remove docker pruning from pytest-gpu @FabioLuporini (#2013)
Continuous Integration
- docker: add AMD HIP build to base docker @mloubout (#2055)
- CI: switch macos runner to latest gcc version @mloubout (#2046)
- ci: improve accuracy of codecov. @ggorman (#2040)
- ci: Streamlining @ggorman (#2028)
- ci: Remove dangling docker layers @FabioLuporini (#2017)
Installation
- pip prod(deps): update distributed requirement from <2023.2 to <2023.3 @dependabot (#2063)
- compiler: Switch from aomp to clang for amd @mloubout (#2058)
- docker: Switch to rocm 4.5.2 @mloubout (#2057)
- docker: add AMD HIP build to base docker @mloubout (#2055)
- pip prod(deps): update distributed requirement from <2022.13 to <2023.2 @dependabot (#2048)
- dependencies: sympy 1.11 compatibility @mloubout (#2005)
- pip prod(deps): update distributed requirement from <2022.12 to <2022.13 @dependabot (#2039)
- pip prod(deps): update distributed requirement from <2022.11 to <2022.12 @dependabot (#2031)
- pip prod(deps): update py-cpuinfo requirement from <=8 to <10 @dependabot (#2026)
- misc: Add packages to Docker base images @FabioLuporini (#2030)
- pip prod(deps): update distributed requirement from <2022.10 to <2022.11 @dependabot (#2021)
- pip prod(deps): update distributed requirement from <2022.9 to <2022.10 @dependabot (#1996)
- pip prod(deps): update distributed requirement from <2022.8 to <2022.9 @dependabot (#1987)