Add bzip2, lz4, and zstd independent compression filter packages #880

Merged
merged 30 commits into from
Dec 17, 2021
Commits
d7bcada
Add bzip2, lz4, and zstd filters from HDF5Plugins.jl
mkitti Nov 1, 2021
3561115
Create Filters interface def, add tests
mkitti Nov 2, 2021
b7b7f24
Filters return 0 rather than rethrowing error
mkitti Nov 15, 2021
a5100e5
Move blosc.jl to H5Zblosc.jl
mkitti Nov 15, 2021
424de91
Optimize method inference, init, precompile
mkitti Nov 15, 2021
cbfb653
Register filters as when dep packages are loaded
mkitti Nov 20, 2021
58f8a7c
Fix tests for lazy loaded filters
mkitti Nov 20, 2021
55bb1c5
Move plugin licenses to licenses folder
mkitti Nov 20, 2021
9a8f3e9
Add HHMI copyright to LICENSE.txt
mkitti Nov 20, 2021
b215ded
Remove generic register_filter method
mkitti Dec 2, 2021
26dbde0
Apply suggestions from code review for license notices
mkitti Dec 2, 2021
0e011dd
Implement generic register_filter on filter type
mkitti Dec 2, 2021
b253de1
Update test/filter.jl
mkitti Dec 2, 2021
84085b4
Define filters API on the type alone
mkitti Dec 8, 2021
d6d7fd4
H5Zblosc, H5Zbzip2, H5Zlz4, H5Zzstd as subdir pkgs
mkitti Dec 8, 2021
b46117a
Fix typos in H5Zlz4.jl
mkitti Dec 8, 2021
bc9af91
Use PackageSpec to make dev_embedded_filters Julia 1.3 compatible
mkitti Dec 8, 2021
12a613b
Add Licenses to subpackages
mkitti Dec 10, 2021
4aca896
H5Zbzip2: Add 32-bit Windows support
mkitti Dec 10, 2021
0bee6b9
Reorganized licenses, create root THIRDPARTY.md
mkitti Dec 11, 2021
fd1eccd
Debug tests for Julia 1.8
mkitti Dec 11, 2021
73ceacc
Revert GC.@preserve due to https://github.com/JuliaLang/julia/pull/43408
mkitti Dec 14, 2021
d4cf627
Remove debug for not deleted file in test/external.jl
mkitti Dec 14, 2021
e344882
Start H5Z* pkgs at 0.1.0
mkitti Dec 14, 2021
e23d739
Update Project.toml
mkitti Dec 15, 2021
ed1b342
Update Project.toml
musm Dec 15, 2021
55afa4f
Add filters as test targets
musm Dec 15, 2021
393711b
Update Project.toml
musm Dec 15, 2021
b2e16d3
Cosmetic changes and instantiate filters packages directly in test file
musm Dec 16, 2021
82fba8e
Update docs remove external exports
musm Dec 16, 2021
6 changes: 4 additions & 2 deletions Project.toml
@@ -3,7 +3,6 @@ uuid = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
version = "0.16.0"

[deps]
Blosc = "a74b3585-a348-5f62-a45c-50e91977d574"
Compat = "34da2185-b29b-5c13-b0c7-acf172513d20"
HDF5_jll = "0234f1f7-429e-5d53-9886-15a909be8d59"
Libdl = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
@@ -12,7 +11,6 @@ Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Requires = "ae029012-a4dd-5104-9daa-d747884805df"

[compat]
Blosc = "0.7.1"
Compat = "3.1.0"
HDF5_jll = "~1.10.5, ~1.12.0"
Requires = "1.0"
@@ -22,6 +20,10 @@ julia = "1.3"
CRC32c = "8bf52ea8-c179-5cab-976a-9e18b702a9bc"
Distributed = "8ba89e20-285c-5b6f-9357-94700520ee1b"
FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
H5Zblosc = "c8ec2601-a99c-407f-b158-e79c03c2f5f7"
H5Zbzip2 = "094576f2-1e46-4c84-8e32-c46c042eaaa2"
H5Zlz4 = "eb20ec05-5464-47b5-ba41-098e3c1068a3"
H5Zzstd = "f6f2d980-1ec6-471c-a70d-0270e22f1103"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
14 changes: 14 additions & 0 deletions THIRDPARTY.md
@@ -0,0 +1,14 @@
# Third Party Licenses

HDF5.jl contains several derivative works of open source software.

In particular, the following submodules are licensed as derivative works from third-parties.
Original and derivative code in HDF5.jl is licensed according to [LICENSE.txt](LICENSE.txt)
as permitted by licenses for the original software from which they may be derived.
See the files indicated below for the copyright notices and the licenses of the original
software from which individual submodules are derived.

## Filter Plugins
* [H5Zbzip2](src/filters/H5Zbzip2/src/H5Zbzip2.jl): See [src/filters/H5Zbzip2/THIRDPARTY.txt](src/filters/H5Zbzip2/THIRDPARTY.txt)
* [H5Zlz4](src/filters/H5Zlz4/src/H5Zlz4.jl): See [src/filters/H5Zlz4/THIRDPARTY.txt](src/filters/H5Zlz4/THIRDPARTY.txt)
* [H5Zzstd](src/filters/H5Zzstd/src/H5Zzstd.jl): See [src/filters/H5Zzstd/THIRDPARTY.txt](src/filters/H5Zzstd/THIRDPARTY.txt)
1 change: 1 addition & 0 deletions docs/src/index.md
@@ -202,6 +202,7 @@ contiguously.
A = rand(100,100)
g1["A", chunk=(5,5), compress=3] = A
g2["A", chunk=(5,5), shuffle=(), deflate=3] = A
using H5Zblosc # load in Blosc
g3["A", chunk=(5,5), blosc=3] = A
```

6 changes: 4 additions & 2 deletions src/HDF5.jl
@@ -1600,8 +1600,6 @@ function __init__()
ENV["HDF5_USE_FILE_LOCKING"] = "FALSE"
end

Filters.register_blosc()

# use our own error handling machinery (i.e. turn off automatic error printing)
API.h5e_set_auto(API.H5E_DEFAULT, C_NULL, C_NULL)

@@ -1614,6 +1612,10 @@ function __init__()
UTF8_ATTRIBUTE_PROPERTIES.char_encoding = :utf8

@require FileIO="5789e2e9-d7fb-5bc7-8068-2c6fae9b9549" @eval include("fileio.jl")
@require H5Zblosc="c8ec2601-a99c-407f-b158-e79c03c2f5f7" @eval begin
set_blosc!(p::Properties, val::Bool) = val && push!(Filters.FilterPipeline(p), H5Zblosc.BloscFilter())
set_blosc!(p::Properties, level::Integer) = push!(Filters.FilterPipeline(p), H5Zblosc.BloscFilter(level=level))
end

return nothing
end
17 changes: 17 additions & 0 deletions src/filters/H5Zblosc/LICENSE.txt
@@ -0,0 +1,17 @@
The MIT License (MIT)
Copyright (c) 2012-2021: Timothy E. Holy, Simon Kornblith, and contributors: https://github.com/JuliaIO/HDF5.jl/contributors

Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
12 changes: 12 additions & 0 deletions src/filters/H5Zblosc/Project.toml
@@ -0,0 +1,12 @@
name = "H5Zblosc"
uuid = "c8ec2601-a99c-407f-b158-e79c03c2f5f7"
version = "0.1.0"

[deps]
Blosc = "a74b3585-a348-5f62-a45c-50e91977d574"
HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"

[compat]
HDF5 = "0.16"
Blosc = "0.7.1"
julia = "1.3"
51 changes: 30 additions & 21 deletions src/filters/blosc.jl → src/filters/H5Zblosc/src/H5Zblosc.jl
@@ -1,8 +1,15 @@
module H5Zblosc
# port of https://github.com/Blosc/c-blosc/blob/3a668dcc9f61ad22b5c0a0ab45fe8dad387277fd/hdf5/blosc_filter.c (copyright 2010 Francesc Alted, license: MIT/expat)

import Blosc
using HDF5.API
import HDF5.Filters: Filter, FilterPipeline
import HDF5.Filters: filterid, register_filter, filtername, filter_func, filter_cfunc, set_local_func, set_local_cfunc

const FILTER_BLOSC = API.H5Z_filter_t(32001) # Filter ID registered with the HDF Group for Blosc
export H5Z_FILTER_BLOSC, blosc_filter, BloscFilter


const H5Z_FILTER_BLOSC = API.H5Z_filter_t(32001) # Filter ID registered with the HDF Group for Blosc
const FILTER_BLOSC_VERSION = 2
const blosc_name = "blosc"

@@ -12,7 +19,7 @@ function blosc_set_local(dcpl::API.hid_t, htype::API.hid_t, space::API.hid_t)
blosc_nelements = Ref{Csize_t}(length(blosc_values))
blosc_chunkdims = Vector{API.hsize_t}(undef,32)

API.h5p_get_filter_by_id(dcpl, FILTER_BLOSC, blosc_flags, blosc_nelements, blosc_values, 0, C_NULL, C_NULL)
API.h5p_get_filter_by_id(dcpl, H5Z_FILTER_BLOSC, blosc_flags, blosc_nelements, blosc_values, 0, C_NULL, C_NULL)
flags = blosc_flags[]

nelements = max(blosc_nelements[], 4) # First 4 slots reserved
@@ -45,7 +52,7 @@ function blosc_set_local(dcpl::API.hid_t, htype::API.hid_t, space::API.hid_t)
blosc_values[3] = basetypesize
blosc_values[4] = chunksize * htypesize # size of the chunk

API.h5p_modify_filter(dcpl, FILTER_BLOSC, flags, nelements, blosc_values)
API.h5p_modify_filter(dcpl, H5Z_FILTER_BLOSC, flags, nelements, blosc_values)

return API.herr_t(1)
end
@@ -85,10 +92,13 @@ function blosc_filter(flags::Cuint, cd_nelmts::Csize_t,
# uncompressed chunk size but it should not be used in a general
# cases since other filters in the pipeline can modify the buffer
# size.
outbuf_size, cbytes, blocksize = Blosc.cbuffer_sizes(unsafe_load(buf))
in = unsafe_load(buf)
# See https://github.com/JuliaLang/julia/issues/43402
# Resolved in https://github.com/JuliaLang/julia/pull/43408
outbuf_size, cbytes, blocksize = Blosc.cbuffer_sizes(in)
outbuf = Libc.malloc(outbuf_size)
outbuf == C_NULL && return Csize_t(0)
status = Blosc.blosc_decompress(unsafe_load(buf), outbuf, outbuf_size)
status = Blosc.blosc_decompress(in, outbuf, outbuf_size)
status <= 0 && (Libc.free(outbuf); return Csize_t(0))
end

@@ -102,19 +112,6 @@ function blosc_filter(flags::Cuint, cd_nelmts::Csize_t,
return Csize_t(0)
end


# register the Blosc filter function with HDF5
function register_blosc()
c_blosc_set_local = @cfunction(blosc_set_local, API.herr_t, (API.hid_t,API.hid_t,API.hid_t))
c_blosc_filter = @cfunction(blosc_filter, Csize_t,
(Cuint, Csize_t, Ptr{Cuint}, Csize_t,
Ptr{Csize_t}, Ptr{Ptr{Cvoid}}))
API.h5z_register(API.H5Z_class_t(API.H5Z_CLASS_T_VERS, FILTER_BLOSC, 1, 1, pointer(blosc_name), C_NULL, c_blosc_set_local, c_blosc_filter))

return nothing
end


"""
BloscFilter(;level=5, shuffle=true, compressor="blosclz")

@@ -142,6 +139,15 @@ function BloscFilter(;level=5, shuffle=true, compressor="blosclz")
BloscFilter(0,0,0,0,level,shuffle,compcode)
end

filterid(::Type{BloscFilter}) = H5Z_FILTER_BLOSC
filtername(::Type{BloscFilter}) = blosc_name
set_local_func(::Type{BloscFilter}) = blosc_set_local
set_local_cfunc(::Type{BloscFilter}) = @cfunction(blosc_set_local, API.herr_t, (API.hid_t,API.hid_t,API.hid_t))
filter_func(::Type{BloscFilter}) = blosc_filter
filter_cfunc(::Type{BloscFilter}) = @cfunction(blosc_filter, Csize_t,
(Cuint, Csize_t, Ptr{Cuint}, Csize_t,
Ptr{Csize_t}, Ptr{Ptr{Cvoid}}))

function Base.show(io::IO, blosc::BloscFilter)
print(io, BloscFilter,
"(level=", Int(blosc.level),
@@ -150,9 +156,6 @@ function Base.show(io::IO, blosc::BloscFilter)
")")
end

filterid(::Type{BloscFilter}) = FILTER_BLOSC
FILTERS[FILTER_BLOSC] = BloscFilter

function Base.push!(f::FilterPipeline, blosc::BloscFilter)
0 <= blosc.level <= 9 || throw(ArgumentError("blosc compression $(blosc.level) not in [0,9]"))
ref = Ref(blosc)
@@ -161,3 +164,9 @@ function Base.push!(f::FilterPipeline, blosc::BloscFilter)
end
return f
end

function __init__()
register_filter(BloscFilter)
end

end # module H5Zblosc
17 changes: 17 additions & 0 deletions src/filters/H5Zbzip2/LICENSE.txt
@@ -0,0 +1,17 @@
The MIT License (MIT)
Copyright (c) 2012-2021: Timothy E. Holy, Simon Kornblith, and contributors: https://github.com/JuliaIO/HDF5.jl/contributors

Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
12 changes: 12 additions & 0 deletions src/filters/H5Zbzip2/Project.toml
@@ -0,0 +1,12 @@
name = "H5Zbzip2"
uuid = "094576f2-1e46-4c84-8e32-c46c042eaaa2"
version = "0.1.0"

[deps]
CodecBzip2 = "523fee87-0ab8-5b00-afb7-3ecf72e48cfd"
HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"

[compat]
HDF5 = "0.16"
CodecBzip2 = "0.7"
julia = "1.3"
31 changes: 31 additions & 0 deletions src/filters/H5Zbzip2/THIRDPARTY.txt
@@ -0,0 +1,31 @@
H5Z_filter_bzip2 in H5Zbzip2.jl was derived from H5Zbzip2.c from PyTables:

Copyright Notice and Statement for PyTables Software Library and Utilities:
Copyright (c) 2002-2004 by Francesc Alted
Copyright (c) 2005-2007 by Carabos Coop. V.
Copyright (c) 2008-2010 by Francesc Alted
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
a. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
b. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the
distribution.
c. Neither the name of Francesc Alted nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
