This repository has been archived by the owner on Sep 1, 2022. It is now read-only.

NCML Time axis aggregation with missing / time gaps #1361

Closed
Akshay-Hegde opened this issue May 15, 2021 · 6 comments

@Akshay-Hegde

Akshay-Hegde commented May 15, 2021

Hi, I have a few data files containing hourly data, but there are gaps in time (across multiple files).

Time min and max: 16-NOV-2010 06:00 to 22-OCT-2019 10:00

Depth min and max: -104 to 1044.5

  1. Is there any way to generate the time and depth axes dynamically, based on the min and max values of the files' time and depth axes, and re-grid the variables onto them?
  2. How can missing values be filled in for these gaps?

I have read a few of the available online resources, such as those on logical views, but have had no success so far.

Contents of my NcML file:

$ cat test.ncml 
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="https://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="Time" type="joinExisting">
          <scan location="/path/to/test-ncml" regExp="f*\.nc$" subdirs="true" timeUnitsChange="true"/>
  </aggregation>
</netcdf>

Sample

$ ncdump -h sample.nc 
netcdf sample {
dimensions:
	Lon = 1 ;
	Lat = 1 ;
	Depth = 62 ;
	Time = 8214 ;
variables:
	float Lon(Lon) ;
		Lon:long_name = "Longitude" ;
		Lon:units = "Degree_East" ;
	float Lat(Lat) ;
		Lat:long_name = "Latitude" ;
		Lat:units = "Degree_Nort" ;
	float Depth(Depth) ;
		Depth:long_name = "Depth (m)" ;
		Depth:units = "meters" ;
		Depth:bin_size = 8. ;
		Depth:Center_first_bin = 17.7600002288818 ;
		Depth:blanking_distance = 7.03999996185303 ;
	float Time(Time) ;
		Time:long_name = "Time" ;
		Time:units = "hours" ;
		Time:time_origin = "16-NOV-2010 06:00:00" ;
	float u_1205(Time, Depth, Lat, Lon) ;
		u_1205:name = "u" ;
		u_1205:long_name = "Eastward Velocity" ;
		u_1205:missing_value = 99999.f ;
		u_1205:_FillValue = 1.e+35f ;
		u_1205:units = "cm/s" ;
	float v_1206(Time, Depth, Lat, Lon) ;
		v_1206:name = "v" ;
		v_1206:long_name = "Northward Velocity" ;
		v_1206:missing_value = 99999.f ;
		v_1206:_FillValue = 1.e+35f ;
		v_1206:units = "cm/s" ;
}
@lesserwhirls
Collaborator

Unfortunately, this scenario is far too complex for NcML and will require writing some custom code to join these together using the attributes in the way you describe.

@lesserwhirls
Collaborator

It might be worth reaching out to the general netCDF users email list to see if anyone has done a similar task in the past: [email protected]

@Akshay-Hegde
Author

Akshay-Hegde commented Jun 3, 2021

> Unfortunately, this scenario is far too complex for NcML and will require writing some custom code to join these together using the attributes in the way you describe.

Thank you. As you said, it seems impossible to do this in NcML; however, I can do it using Ferret, cdo, MATLAB, etc.
But I wanted to explore the possibilities, especially doing it dynamically, since the variables are the same across files and the min and max values of the time and depth axes are known.

Is there any documentation on NcML? The docs on the official website do not seem to be up to date.

@cofinoa
Contributor

cofinoa commented Jun 3, 2021

Hi, @Akshay-Hegde

I have not understood your scenario. Do you have a set of sample input nc files? What is the expected result for that sample?

Regards

@cofinoa
Contributor

cofinoa commented Jun 3, 2021

... what does re-gridding mean? Interpolation?

@Akshay-Hegde
Author

Akshay-Hegde commented Jun 4, 2021

> Hi, @Akshay-Hegde
>
> I have not understood your scenario. Do you have a set of sample input nc files? What is the expected result for that sample?
>
> Regards

@cofinoa

Hi, as you can see below, I have 5 files in total, f1.nc to f5.nc, spanning 2012-10-13 10:00:00 to 2019-10-22 08:40:00. In between there are some data gaps, because no observations were taken during those periods. All these files belong to a single location, so the latitude and longitude dimensions each have length 1.

datetime-start, datetime-end, file
2012-10-13 10:00:00, 2013-11-24 08:00:00, f1.nc  
            --- here data missing -- between 2013-11-24 08:01:00 to 2013-11-24 00:59:00
2013-11-24 13:00:00, 2014-11-15 15:00:00, f2.nc  
            --- here data missing -- between 2014-11-15 15:01:00 to 2016-11-16 00:59:00
2016-11-16 13:00:00, 2017-10-08 08:00:00, f3.nc  
            --- again here
2017-10-08 12:20:00, 2018-10-10 08:20:00, f4.nc  
            --- again here
2018-10-10 14:40:00, 2019-10-22 08:40:00, f5.nc  
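
For readers following the thread, the size of each gap can be worked out directly from the listing above. The following is a minimal Python sketch that uses only the start and end times quoted in that listing; the names `FILES` and `find_gaps` are illustrative and not part of any netCDF tool.

```python
from datetime import datetime

# Start and end time of each file, copied from the listing above.
FILES = [
    ("f1.nc", "2012-10-13 10:00:00", "2013-11-24 08:00:00"),
    ("f2.nc", "2013-11-24 13:00:00", "2014-11-15 15:00:00"),
    ("f3.nc", "2016-11-16 13:00:00", "2017-10-08 08:00:00"),
    ("f4.nc", "2017-10-08 12:20:00", "2018-10-10 08:20:00"),
    ("f5.nc", "2018-10-10 14:40:00", "2019-10-22 08:40:00"),
]
FMT = "%Y-%m-%d %H:%M:%S"


def find_gaps(files):
    """Return (previous_file, next_file, gap_hours) for each consecutive pair.

    gap_hours is the time between the last sample of one file and the
    first sample of the next one.
    """
    gaps = []
    for (prev_name, _, prev_end), (next_name, next_start, _) in zip(files, files[1:]):
        delta = datetime.strptime(next_start, FMT) - datetime.strptime(prev_end, FMT)
        gaps.append((prev_name, next_name, delta.total_seconds() / 3600.0))
    return gaps


for prev_name, next_name, hours in find_gaps(FILES):
    print(f"{prev_name} -> {next_name}: {hours:.2f} h without data")
```

With these inputs the gap between f2.nc and f3.nc is about two years, while the other gaps are only a few hours long. The start times also fall on different minute offsets (:00, :20, :40), so a single hourly axis will not coincide with every sample unless the data are interpolated.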

> ... what does re-gridding mean? Interpolation?

Yes
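
One way to read "re-gridding by interpolation" for a single time series is sketched below in plain Python. It is an illustration under stated assumptions, not something NcML provides: the function name `regrid_linear` and the `max_gap` threshold are hypothetical, times are plain numbers in hours on a shared reference, and source values are assumed to be valid already (samples equal to `missing_value` or `_FillValue` would need to be masked out first).

```python
from bisect import bisect_left

FILL = 1.0e35  # same value as the _FillValue attribute in the ncdump header


def regrid_linear(src_t, src_v, dst_t, max_gap):
    """Linearly interpolate the series (src_t, src_v) onto the times dst_t.

    src_t must be sorted in ascending order. A target time gets FILL when
    it lies outside the observed range, or between two source samples
    that are more than max_gap apart (that is, inside a data gap).
    """
    out = []
    for t in dst_t:
        i = bisect_left(src_t, t)
        if i < len(src_t) and src_t[i] == t:
            out.append(src_v[i])                      # exact hit on a sample
        elif i == 0 or i == len(src_t):
            out.append(FILL)                          # outside observed range
        elif src_t[i] - src_t[i - 1] > max_gap:
            out.append(FILL)                          # do not bridge a gap
        else:
            w = (t - src_t[i - 1]) / (src_t[i] - src_t[i - 1])
            out.append(src_v[i - 1] + w * (src_v[i] - src_v[i - 1]))
    return out


# Hourly samples with a five-hour hole between t=2 and t=7.
times = [0.0, 1.0, 2.0, 7.0, 8.0]
values = [10.0, 20.0, 30.0, 80.0, 90.0]
print(regrid_linear(times, values, [0.5, 1.0, 1.5, 4.0, 7.5, 9.0], max_gap=1.5))
```

Refusing to interpolate across intervals longer than `max_gap` keeps the long outages visible as fill values instead of inventing data for them.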

Also, the time axes of these files look like this:

# for f1.nc  
float Time(Time) ;
		Time:long_name = "Time" ;
		Time:units = "hours" ;
		Time:time_origin = "13-OCT-2012 10:00:00" ;

# for f2.nc  
float Time(Time) ;
		Time:long_name = "Time" ;
		Time:units = "hours" ;
		Time:time_origin = "24-NOV-2013 13:00:00" ;

  • Is it possible to combine all the files into a single file with a time axis from 2012-10-13 10:00:00 to 2019-10-22 08:40:00, and then merge all the variables within it?
  • Is it possible to convert the existing Time variable values so that they are relative to hours since 01-JAN-1970 00:00:00?
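
For the second question, the arithmetic itself is simple once the `time_origin` attribute is parsed. The sketch below assumes that the Time values are hours elapsed since `time_origin`, that all origins are in the same time zone as the 1970 reference, and that month abbreviations are English; the helper names `origin_offset_hours` and `to_epoch_hours` are made up for illustration.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)
# Layout of the time_origin attribute, e.g. "13-OCT-2012 10:00:00".
# %b relies on English month abbreviations (default C locale).
ORIGIN_FMT = "%d-%b-%Y %H:%M:%S"


def origin_offset_hours(time_origin):
    """Hours from 1970-01-01 00:00:00 to the given time_origin string."""
    origin = datetime.strptime(time_origin, ORIGIN_FMT)
    return (origin - EPOCH).total_seconds() / 3600.0


def to_epoch_hours(values, time_origin):
    """Shift hours-since-time_origin values to hours since 1970-01-01."""
    offset = origin_offset_hours(time_origin)
    return [offset + v for v in values]


# First three hourly steps of a file whose origin is that of f1.nc.
print(to_epoch_hours([0.0, 1.0, 2.0], "13-OCT-2012 10:00:00"))
```

Once every file's time values are expressed against the same reference, they can be compared and concatenated without relying on the per-file `time_origin` attribute.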
$ ncdump -h f1.nc | grep 'Time = '
	Time = 9767 ;
$ ncdump -h f2.nc | grep 'Time = '
	Time = 8547 ;
$ ncdump -h f3.nc | grep 'Time = '
	Time = 7820 ;


netcdf file:test.ncml {
  dimensions:
    Lon = 1;
    Lat = 1;
    Depth = 63;
    Time = 26134;           /*  Here I need available plus missing hours */


 float Time(Time=26134);
      :long_name = "Time";
      :units = "hours";
      :time_origin = "13-OCT-2012 10:00:00"; /* Here I need  hours since 01-JAN-1970 00:00:00 */

<netcdf xmlns="https://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
   <aggregation dimName="Time" type="joinExisting">
              
            <netcdf location="f1.nc" />
            <!-- Missing Here
                 How to define virtual dataset and fill with missing values 
                 between 2013-11-24 09:00:00 to 2013-11-24 00:00:00
            -->
            <netcdf location="f2.nc" />

            <!--- missing Here
                between 2014-11-15 16:00:00 to 2016-11-16 00:00:00
            -->

      
            <!-- can we convert these files time axis values relative to hours since 1970-JAN-01 00:00:00 -->
            <netcdf location="f3.nc" />

   </aggregation>
</netcdf>
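
The "virtual dataset filled with missing values" asked about in the comments above can be imitated outside NcML with a little custom code. The sketch below is plain Python and works on one-dimensional series only; `join_with_gaps` is a hypothetical helper, segment start times are assumed to be hours on a shared reference, and every segment is assumed to be sampled at the same fixed `step`. For the 4-D variables in these files the same placement would be done along the Time dimension.

```python
FILL = 1.0e35  # same value as the _FillValue attribute in the ncdump header


def join_with_gaps(segments, step=1.0):
    """Join (start, values) segments onto one regular time axis.

    segments: list of (start, values) pairs, where start is the time of
    the first sample and values are spaced `step` apart. Returns
    (times, data); slots not covered by any segment hold FILL.
    Segment starts that are not a multiple of `step` away from the
    earliest start are snapped to the nearest slot.
    """
    first = min(start for start, _ in segments)
    last = max(start + step * (len(vals) - 1) for start, vals in segments)
    n = int(round((last - first) / step)) + 1
    times = [first + i * step for i in range(n)]
    data = [FILL] * n
    for start, vals in segments:
        offset = int(round((start - first) / step))
        data[offset:offset + len(vals)] = vals  # same-length slice assignment
    return times, data


# Two short segments with a three-hour hole between them.
times, data = join_with_gaps([(100.0, [1.0, 2.0, 3.0]), (106.0, [4.0, 5.0])])
print(times)
print(data)
```

The length of the returned axis is the number of available plus missing steps, which is the quantity asked for in the Time dimension comment of the desired header.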

@cofinoa Please find the sample data here: Google Drive
