This repository has been archived by the owner on Sep 1, 2022. It is now read-only.

NetCDF-Java applies netCDF formula for packed data (scale/offset) to HDF-EOS data #269

Open
ethanrd opened this issue Oct 30, 2015 · 2 comments

ethanrd commented Oct 30, 2015

While HDF-EOS and netCDF both use scale_factor and add_offset attributes to describe how data has been packed, they do not use the same (un)packing formula. (Actually, it is not clear that all HDF and HDF-EOS files use the same (un)packing formulas; see Note 2 below.)

The formula to unpack netCDF packed data (as described in the "NetCDF Best Practices" document, https://www.unidata.ucar.edu/software/netcdf/docs/BestPractices.html#Packed%20Data%20Values) is

unpacked = scale_factor*packed + add_offset

Whereas the formula to unpack HDF-EOS packed data (see Note 2 below) is

unpacked = scale_factor*(packed - add_offset)

An example dataset that illustrates this problem is here

ftp:https://ladsweb.nascom.nasa.gov/allData/51/MOD08_D3/2001/002/MOD08_D3.A2001002.051.2010286150655.hdf

The problem can easily be seen by looking at the variable Cloud_Top_Temperature_Day_Maximum. Using the ToolsUI Grid Viewer one can quickly see values around -14900 degrees Kelvin.

short Cloud_Top_Temperature_Day_Maximum(YDim=180, XDim=360);
:units = "Degrees Kelvin";
:scale_factor = 0.01; // double
:add_offset = -15000.0; // double
:valid_range = 0S, 20000S; // short
:_FillValue = -9999S; // short

Using 11225 as the packed value (from index [30,0]) and the scale/offset values above:

  • the netCDF formula gives -14887.75
  • the HDF-EOS formula gives 262.25
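The two results above can be reproduced with a minimal Python sketch. The function names are made up for illustration and are not part of NetCDF-Java or any HDF library; the packed value and attribute values are the ones quoted above.

```python
def unpack_netcdf(packed, scale_factor, add_offset):
    """netCDF convention: unpacked = scale_factor*packed + add_offset."""
    return scale_factor * packed + add_offset


def unpack_hdf_eos(packed, scale_factor, add_offset):
    """HDF-EOS convention: unpacked = scale_factor*(packed - add_offset)."""
    return scale_factor * (packed - add_offset)


# Values from Cloud_Top_Temperature_Day_Maximum at index [30,0].
packed = 11225
scale_factor = 0.01
add_offset = -15000.0

netcdf_value = unpack_netcdf(packed, scale_factor, add_offset)
hdf_eos_value = unpack_hdf_eos(packed, scale_factor, add_offset)

print(f"netCDF formula:  {netcdf_value:.2f}")   # -14887.75
print(f"HDF-EOS formula: {hdf_eos_value:.2f}")  # 262.25
```

Only the HDF-EOS result falls in a physically plausible range for a cloud-top temperature in Kelvin.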

Interestingly, when accessed via OPeNDAP from a Hyrax server the scale/offset values are adjusted for the netCDF formula:

https://ladsweb.nascom.nasa.gov/opendap/allData/51/MOD08_D3/2001/002/MOD08_D3.A2001002.051.2010286150655.hdf

Cloud_Top_Temperature_Day_Maximum {
Int16 valid_range 0, 20000;
Int16 _FillValue -9999;
String units "Degrees Kelvin";
Float64 scale_factor 0.010000000000000000;
Float64 add_offset 150.00000000000000;

With these values and the packed value used above:

  • the netCDF formula gives 262.25
  • the HDF-EOS formula gives 110.75
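The adjusted offset of 150 is what you get by expanding the HDF-EOS formula: scale_factor*(packed - add_offset) = scale_factor*packed - scale_factor*add_offset, so the equivalent netCDF offset is -scale_factor*add_offset. The sketch below shows that arithmetic. Whether Hyrax actually computes the value this way is an assumption, and the function name is hypothetical.

```python
def hdf_eos_to_netcdf_attrs(scale_factor, add_offset):
    """Convert HDF-EOS style (scale_factor, add_offset) into the pair that
    yields the same unpacked values under the netCDF formula.

    HDF-EOS: unpacked = scale_factor * (packed - add_offset)
           = scale_factor * packed + (-scale_factor * add_offset)
    """
    return scale_factor, -scale_factor * add_offset


# Attribute values as stored in the HDF-EOS file.
nc_scale, nc_offset = hdf_eos_to_netcdf_attrs(0.01, -15000.0)

# Applying the netCDF formula with the converted attributes.
packed = 11225
value = nc_scale * packed + nc_offset

print(f"add_offset for netCDF formula: {nc_offset:.1f}")  # 150.0
print(f"unpacked value: {value:.2f}")                     # 262.25
```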

Note 1

Thanks to Chris Lynnes, who pointed out this problem.

Note 2

I have not found a definitive statement describing the HDF-EOS formula for packed data. The NCAR NCL page on HDF (https://www.ncl.ucar.edu/Applications/HDF.shtml) mentions both packing formulas (see the "NCL General Comments" section). The NCO documentation (https://nco.sourceforge.net/nco.html#hdf_upk) mentions that "[m]ost files originally written in HDF format use the HDF packing/unpacking algorithm" (and references some HDF5 documentation on packed data, https://www.hdfgroup.org/HDF5/doc/UG/UG_frame10Datasets.html), but NCO defaults to netCDF (un)packing.

@JohnLCaron
Collaborator

we can correct this in the HDF-EOS code if we are sure what to do.


ethanrd commented Nov 20, 2015

That's what I figured. It's the "being sure what to do" that may be the problem.

I would think that if we can tell a file is HDF (not nc4), and especially HDF-EOS, we should use the HDF-EOS formula. However, there are at least three problems. The first comes with remote access: Hyrax transforms the scale/offset metadata while TDS does not. The second is that there are efforts to "harmonize" these two standards; that one's probably not as big of a deal. The third is that it doesn't sound like this is necessarily handled consistently in HDF-land.

So, for now, maybe we should default to netCDF scale/offset handling but allow the user to specify that they want HDF scale/offset handling.
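A rough sketch of what that option could look like, written in Python for brevity. The enum and function names are hypothetical and do not correspond to any existing NetCDF-Java API; the point is only that the netCDF formula stays the default and the HDF-EOS formula is an explicit opt-in.

```python
from enum import Enum


class PackingConvention(Enum):
    """Hypothetical switch between the two unpacking formulas."""
    NETCDF = "netcdf"    # unpacked = scale_factor*packed + add_offset
    HDF_EOS = "hdf_eos"  # unpacked = scale_factor*(packed - add_offset)


def unpack(packed, scale_factor, add_offset,
           convention=PackingConvention.NETCDF):
    """Unpack one value, defaulting to the netCDF convention."""
    if convention is PackingConvention.HDF_EOS:
        return scale_factor * (packed - add_offset)
    return scale_factor * packed + add_offset


# Default behavior: netCDF formula, as the library does today.
default_value = unpack(11225, 0.01, -15000.0)

# Explicit opt-in for files known to use the HDF-EOS formula.
hdf_value = unpack(11225, 0.01, -15000.0, PackingConvention.HDF_EOS)

print(f"{default_value:.2f}")  # -14887.75
print(f"{hdf_value:.2f}")      # 262.25
```

A user reading the same data through Hyrax would leave the default in place, since the attributes served there are already adjusted for the netCDF formula.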
