field metadata of a loaded netcdf variable is incomplete. #22

Closed
Datseris opened this issue Jan 21, 2020 · 5 comments · Fixed by #29


Datseris commented Jan 21, 2020

It seems to me that at the moment NCDstack loads as metadata the global attributes of the .nc file. However this metadata is the only thing propagated to a loaded variable from the file. For example, I have:

TOA = NCDstack(datadir("CERES", "CERES_EBAF_TOA.nc"))
TOA2 = NCDataset(datadir("CERES", "CERES_EBAF_TOA.nc"))

julia> TOA.metadata
NCDmetadata{String,String} with 6 entries:
  "DOI"         => "10.5067/TERRA-AQUA/CERES/EBAF_L3B004.1"
  "title"       => "CERES EBAF TOA and Surface Fluxes. Monthly Averages and 07/2005 to 06/201…
  "institution" => "NASA Langley Research Center"
  "version"     => "Edition 4.1; Release Date May 28, 2019"
  "comment"     => "Climatology from 07/2005 to 06/2015"
  "Conventions" => "CF-1.4"

julia> TOA2.attrib
  title                = CERES EBAF TOA and Surface Fluxes. Monthly Averages and 07/2005 to 06/2015 Climatology.
  institution          = NASA Langley Research Center
  Conventions          = CF-1.4
  comment              = Climatology from 07/2005 to 06/2015
  version              = Edition 4.1; Release Date May 28, 2019
  DOI                  = 10.5067/TERRA-AQUA/CERES/EBAF_L3B004.1

Both packages correctly list the global attributes as "metadata". However,

julia> TOA["toa_sw_all_mon"].metadata
NCDmetadata{String,String} with 6 entries:
  "DOI"         => "10.5067/TERRA-AQUA/CERES/EBAF_L3B004.1"
  "title"       => "CERES EBAF TOA and Surface Fluxes. Monthly Averages and 07/2005 to 06/201…
  "institution" => "NASA Langley Research Center"
  "version"     => "Edition 4.1; Release Date May 28, 2019"
  "comment"     => "Climatology from 07/2005 to 06/2015"
  "Conventions" => "CF-1.4"

julia> TOA2["toa_sw_all_mon"]
toa_sw_all_mon (360 × 180 × 231)
  Datatype:    Float32
  Dimensions:  lon × lat × time
  Attributes:
   long_name            = Top of The Atmosphere Shortwave Flux, All-Sky conditions, Monthly Means
   standard_name        = TOA Shortwave Flux - All-Sky
   CF_name              = toa_outgoing_shortwave_flux
   comment              = none
   units                = W m-2
   valid_min            =       0.00000
   valid_max            =       600.000
   _FillValue           = -999.0

You can clearly see that the NCDatasets.jl version has different "metadata", the most important being by far the long_name, which should be listed in the metadata of the loaded field from GeoData.jl as well (as it is quite important for NCstacks with 10s of variables).


Balinus commented Jan 21, 2020

> the most important being by far the long_name, which should be listed in the metadata of the loaded field from GeoData.jl as well (as it is quite important for NCstacks with 10s of variables).

I'd say that the most important is standard_name, since it is standardized. See https://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#long-name for instance.

Propagating variable attributes is indeed of great importance.

Cheers!

edit - See here also https://cfconventions.org/Data/cf-standard-names/28/build/cf-standard-name-table.html

Owner

rafaqz commented Feb 13, 2020

Sorry somehow I bumped off notifications for this package.

Metadata needs some work; it's a pretty rudimentary first pass at this stage. It looks like the dataset metadata is being used instead of the var metadata, which isn't the right thing to do. It should get the var metadata, like the dimension does.

https://github.com/rafaqz/GeoData.jl/blob/master/src/sources/ncdatasets.jl#L217

But we will also need more metadata wrapper types - one for stacks and one for arrays, as they hold different things.
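
The difference between dataset-level and variable-level attributes described above can be seen in a small, self-contained sketch that uses only NCDatasets.jl. The file, the "demo" title and the shortened attribute values are made up for illustration; this is not the GeoData.jl code path, just one way to show which attributes a loaded variable would be expected to carry.

```julia
using NCDatasets

# Write a tiny NetCDF file to a temporary path (illustrative data only).
path = tempname() * ".nc"
NCDataset(path, "c"; attrib = ["title" => "demo"]) do ds
    defDim(ds, "lon", 3)
    defVar(ds, "toa_sw_all_mon", Float32, ("lon",);
           attrib = ["long_name" => "Shortwave flux (demo)", "units" => "W m-2"])
end

# Read it back and copy both attribute sets into plain Dicts.
globalattrib, varattrib = NCDataset(path) do ds
    v = ds["toa_sw_all_mon"]
    g = Dict(k => ds.attrib[k] for k in keys(ds.attrib))  # dataset (global) attributes
    a = Dict(k => v.attrib[k] for k in keys(v.attrib))    # variable attributes
    (g, a)
end
```

Here `globalattrib` holds what a stack would report as metadata, while `varattrib` holds the `long_name` and `units` that belong to the individual variable.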

Owner

rafaqz commented Feb 14, 2020

So this might be a little more complicated.

To be able to save an NCDarray back to a file we need to have both the var and dataset attributes. Maybe the new NCDarrayMetadata can have an additional field that holds a copy of the NCDstackMetadata for the stack/dataset it came from, which we can use again when we save the array.

All this will be even more fun to handle when we try to save a GDALarray etc as a netcdf. But at least it will be fully formalised what is required to do that.
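
A minimal sketch of the idea above, in plain Julia with no package dependencies. The type names `StackMetadataSketch` and `ArrayMetadataSketch`, the `parent` field and the helper `attributes_for_saving` are hypothetical stand-ins chosen for illustration; they are not the GeoData.jl types and the eventual design may differ.

```julia
# Hypothetical stand-in for a stack/dataset metadata wrapper.
struct StackMetadataSketch
    attrib::Dict{String,Any}      # global (dataset-level) attributes
end

# Hypothetical stand-in for an array/variable metadata wrapper.
struct ArrayMetadataSketch
    attrib::Dict{String,Any}      # variable-level attributes
    parent::StackMetadataSketch   # copy of the metadata of the originating dataset
end

# Gather both attribute sets that would be needed to write the array to a file.
attributes_for_saving(m::ArrayMetadataSketch) =
    (dataset = m.parent.attrib, variable = m.attrib)

stackmeta = StackMetadataSketch(Dict{String,Any}("Conventions" => "CF-1.4"))

arraymeta = ArrayMetadataSketch(
    Dict{String,Any}("CF_name" => "toa_outgoing_shortwave_flux", "units" => "W m-2"),
    StackMetadataSketch(copy(stackmeta.attrib)),  # copy, so the array owns its own version
)

out = attributes_for_saving(arraymeta)
```

Storing a copy rather than a reference means later changes to the stack's metadata do not silently alter an array that was already extracted; whether that is the desired behaviour is a design choice.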

rafaqz mentioned this issue Feb 14, 2020

Balinus commented Feb 14, 2020

You will also need the time attributes to save the data to disk.

Owner

rafaqz commented Feb 14, 2020

Yep, added to the PR.
