NetCDF3 files (64-bit offset NetCDF,...) and unlimited dimensions (issue #231) #232
Conversation
src/variable.jl
```diff
@@ -401,7 +401,7 @@ end
 getchunksize(v::Variable) = getchunksize(haschunks(v),v)
 getchunksize(::DiskArrays.Chunked, v::Variable) = chunking(v)[2]
 # getchunksize(::DiskArrays.Unchunked, v::Variable) = DiskArrays.estimate_chunksize(v)
-getchunksize(::DiskArrays.Unchunked, v::Variable) = size(v)
+getchunksize(::DiskArrays.Unchunked, v::Variable) = max.(1,size(v))
```
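A minimal sketch of what the broadcasted `max` is guarding against (the rationale is my reading of the diff, not stated in the PR itself): an unlimited dimension with no records written yet reports extent 0, and a zero chunk length would be invalid, so every extent is clamped to at least 1.

```julia
# size(v) for a variable with an empty unlimited (e.g. time) dimension:
sz = (0, 360, 180)

# Broadcasting max with 1 clamps the zero extent to 1 while leaving the
# other extents untouched:
chunk = max.(1, sz)
# chunk == (1, 360, 180)
```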
It would be good to comment the reason for this; it's a bit puzzling otherwise.
Yes, we should add a comment, but while we are at this line: @tcarion, is there any particular reason why the `DiskArrays.estimate_chunksize` call is commented out?
I think that's because I had this error:

```
NCDatasets: Error During Test at /home/tcarion/.julia/dev/NCDatasets/test/runtests.jl:10
Got exception outside of a @test
LoadError: MethodError: no method matching DiskArrays.RegularChunks(::Tuple{UnitRange{Int64}, UnitRange{Int64}}, ::Int64, ::Int64)
Closest candidates are:
  DiskArrays.RegularChunks(::Integer, ::Integer, ::Integer)
   @ DiskArrays ~/.julia/packages/DiskArrays/QlfRF/src/chunks.jl:25
Stacktrace:
  [1] (::DiskArrays.var"#24#27")(s::Int64, cs::Tuple{UnitRange{Int64}, UnitRange{Int64}}, of::Int64)
    @ DiskArrays ~/.julia/packages/DiskArrays/QlfRF/src/chunks.jl:133
  [2] map
    @ ./tuple.jl:318 [inlined]
  [3] DiskArrays.GridChunks(a::Tuple{Int64, Int64}, chunksize::Tuple{Tuple{UnitRange{Int64}, UnitRange{Int64}}}; offset::Tuple{Int64, Int64})
    @ DiskArrays ~/.julia/packages/DiskArrays/QlfRF/src/chunks.jl:132
  [4] DiskArrays.GridChunks(a::NCDatasets.Variable{Float64, 2, NCDataset{Nothing}}, chunksize::Tuple{Tuple{UnitRange{Int64}, UnitRange{Int64}}}; offset::Tuple{Int64, Int64})
    @ DiskArrays ~/.julia/packages/DiskArrays/QlfRF/src/chunks.jl:130
  [5] DiskArrays.GridChunks(a::NCDatasets.Variable{Float64, 2, NCDataset{Nothing}}, chunksize::Tuple{Tuple{UnitRange{Int64}, UnitRange{Int64}}})
    @ DiskArrays ~/.julia/packages/DiskArrays/QlfRF/src/chunks.jl:130
  [6] eachchunk(v::NCDatasets.Variable{Float64, 2, NCDataset{Nothing}})
    @ NCDatasets ~/.julia/dev/NCDatasets/src/variable.jl:408
  [7] _readblock!(::NCDatasets.Variable{Float64, 2, NCDataset{Nothing}}, ::Matrix{Float64}, ::UnitRange{Int64}, ::Vararg{AbstractVector})
    @ DiskArrays ~/.julia/packages/DiskArrays/QlfRF/src/batchgetindex.jl:168
  [8] readblock!
    @ ~/.julia/packages/DiskArrays/QlfRF/src/batchgetindex.jl:209 [inlined]
  [9] getindex_disk(::NCDatasets.Variable{Float64, 2, NCDataset{Nothing}}, ::UnitRange{Int64}, ::Vararg{Any})
    @ DiskArrays ~/.julia/packages/DiskArrays/QlfRF/src/diskarray.jl:40
 [10] getindex
    @ ~/.julia/packages/DiskArrays/QlfRF/src/diskarray.jl:211 [inlined]
 [11] getindex(::CommonDataModel.CFVariable{Float64, 2, NCDatasets.Variable{Float64, 2, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Nothing, Tuple{}, Vararg{Nothing, 5}}}}, ::UnitRange{Int64}, ::StepRange{Int64, Int64})
    @ CommonDataModel ~/.julia/packages/CommonDataModel/pO4st/src/cfvariable.jl:390
 [12] top-level scope
    @ ~/.julia/dev/NCDatasets/test/test_simple.jl:28
[...]
in expression starting at /home/tcarion/.julia/dev/NCDatasets/test/test_simple.jl:28
```
I found an easy fix by using `size` instead, but I didn't think it through and kind of forgot about it, that's my bad... I must admit I don't fully know how the chunking should work. But I think the issue comes from the fact that `getchunksize(::DiskArrays.Unchunked, v::Variable)` should return a `Tuple`, while `DiskArrays.estimate_chunksize(v)` returns a `GridChunks`.
~~`RegularChunks` is for a single axis, and it accepts three `Integer`s.~~

`GridChunks` wraps a `Tuple` of `RegularChunks` and `IrregularChunks`, one for each axis, selected depending on the underlying chunking of the disk array.
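The per-axis structure described above can be sketched with plain ranges, without depending on DiskArrays (the concrete range values are illustrative; `RegularChunks(5, 0, 20)` enumerating these ranges is my reading of the constructor signature shown in the stack trace):

```julia
# What RegularChunks conceptually enumerates for one axis: a vector of
# index ranges covering that axis.
axis1 = [1:5, 6:10, 11:15, 16:20]    # e.g. RegularChunks(5, 0, 20)
axis2 = [1:10, 11:20, 21:30]         # e.g. RegularChunks(10, 0, 30)

# GridChunks combines one such pattern per axis; iterating it yields the
# Cartesian product of the per-axis ranges as N-dimensional chunks.
grid = [(r1, r2) for r1 in axis1, r2 in axis2]
length(grid)   # 12 chunks for a 20×30 array
```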
Ah sorry, I misunderstood the confusion here. Normally `estimate_chunksize` is called from `eachchunk`. You could refactor `getchunks` to return `GridChunks` instead of the size.

You may want to call `estimate_chunksize` when there are no unlimited dimensions, and otherwise put a `1` in the unlimited dimension and use the size for all the other dimensions.
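The suggestion above could be sketched roughly as follows (hypothetical helper, not NCDatasets code; the size tuple and the set of unlimited dimension indices are hard-coded stand-ins for what would come from the file metadata):

```julia
# Chunk length per dimension: 1 along unlimited dimensions, the (clamped)
# size along all fixed dimensions.
function default_chunksize(sz::NTuple{N,Int}, unlim) where {N}
    ntuple(i -> i in unlim ? 1 : max(1, sz[i]), N)
end

# (time=unlimited, lon, lat) with dimension 1 unlimited:
default_chunksize((0, 360, 180), (1,))   # (1, 360, 180)
```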
```diff
@@ -401,7 +401,7 @@ end
 getchunksize(v::Variable) = getchunksize(haschunks(v),v)
 getchunksize(::DiskArrays.Chunked, v::Variable) = chunking(v)[2]
 # getchunksize(::DiskArrays.Unchunked, v::Variable) = DiskArrays.estimate_chunksize(v)
-getchunksize(::DiskArrays.Unchunked, v::Variable) = size(v)
+getchunksize(::DiskArrays.Unchunked, v::Variable) = max.(1,size(v))
 eachchunk(v::CFVariable) = eachchunk(v.var)
 haschunks(v::CFVariable) = haschunks(v.var)
```
Just to clarify: this declaration on `CFVariable` doesn't really do much, as it doesn't define the necessary broadcast or iteration methods to actually call this in real use.
I changed the PR quite a bit. It now tests explicitly for NetCDF3 files to determine the chunksize (Unidata/netcdf-c#2224); also we use [...]
Tries to address issue #231.