This repository has been archived by the owner on Sep 1, 2022. It is now read-only.

add non time dependent variable to FMRC #1060

Closed
marceloandrioni opened this issue Mar 16, 2018 · 1 comment
marceloandrioni commented Mar 16, 2018

Hello, I have a FMRC aggregation of several yearly files with the usual variables (ssh,t,s,u,v), e.g.

<featureCollection name="mercator" featureType="FMRC" harvest="true" path="mercator">
<collection spec="/home/opendap/datasets/mercator/**/mercator_#yyyy#\.nc"/>
<update startup="true"/>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
</netcdf>
<fmrcConfig regularize="false" datasetTypes="Best Files" />
</featureCollection>

I have a separate file with non-time-dependent variables (bathymetry, sea land mask, etc.) and would like these variables to appear in the FMRC as well. I tried renaming the file to mercator_1970.nc, hoping that <collection spec="/home/opendap/datasets/mercator/**/mercator_#yyyy#\.nc"/> would catch it, but it didn't work. My guess is that it doesn't work because this file has no time dimension, and time is necessary for FMRC aggregation.

Besides appending the non time dependent variables to every single file with
ncks -A bathymetry_mask.nc mercator_YYYY.nc
is there a way using ncml to join all the files (with and without time dimensions) in the FMRC?
Also, I am set on using FMRC because my tests showed that it retrieves data a lot faster than a joinNew or joinExisting aggregation.

Sorry if this is not the correct place to ask ncml related questions, but I couldn't find a specific repository for ncml.

Thank you.

@marceloandrioni
Author

I got an answer for this problem from @cofinoa in a different forum, so I am including his answer here (in case someone faces a similar problem) and closing the Issue.

Hi,

Hello, I have a FMRC aggregation of several yearly files with the usual variables (ssh,t,s,u,v), e.g.

<featureCollection name="mercator" featureType="FMRC" harvest="true" path="mercator">
<collection spec="/home/opendap/datasets/mercator/**/mercator_#yyyy#\.nc"/>
<update startup="true"/>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
</netcdf>
<fmrcConfig regularize="false" datasetTypes="Best Files" />
</featureCollection>

I have a separate file with non-time-dependent variables (bathymetry, sea land mask, etc.) and would like these variables to appear in the FMRC as well. I tried renaming the file to mercator_1970.nc, hoping that <collection spec="/home/opendap/datasets/mercator/**/mercator_#yyyy#\.nc"/> would catch it, but it didn't work. My guess is that it doesn't work because this file has no time dimension, and time is necessary for FMRC aggregation.

No. It's because the prototype dataset chosen by the featureCollection doesn't contain those variables. By default, the featureCollection uses the penultimate dataset/file as the prototype when it builds the resulting dataset from all the datasets/files in the collection.

Besides appending the non time dependent variables to every single file with
ncks -A bathymetry_mask.nc mercator_YYYY.nc
is there a way using ncml to join all the files (with and without time dimensions) in the FMRC?

You can change the default prototype dataset with the protoDataset element:
<featureCollection name="mercator" featureType="FMRC" harvest="true" path="mercator">
<collection spec="/home/opendap/datasets/mercator/**/mercator_#yyyy#\.nc"/>
<update startup="true"/>
<protoDataset choice="First"/>
<fmrcConfig regularize="false" datasetTypes="Best Files" />
</featureCollection>

and append the missing variables to the first dataset/file in the collection, i.e.
ncks -A bathymetry_mask.nc mercator_1970.nc

Then they should appear in the resulting dataset. The documentation [1] says:
....
The choice of the protoDataset matters when the datasets are not homogenous:

Global and variable attributes are taken from the prototype dataset.
If a variable appears in the prototype dataset, it will appear in the feature collection dataset. If it doesn't appear in other datasets, it will have missing data for those times.
If a variable does not appear in the prototype dataset, it will not appear in the feature collection dataset, even if it appears in other datasets.

....

Also, I am set on using FMRC because my tests showed that it retrieves data a lot faster than a joinNew or joinExisting aggregation.

In my case, I'm reluctant to use FMRC because you have less control over how the resulting dataset is built. With pure NcML and aggregations you have better control over the resulting dataset, and good performance if you control the cache and the definition of the coordinate values.
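
As an illustration of the pure NcML route mentioned above, here is a minimal sketch (not taken from the original reply) that nests a joinExisting aggregation of the yearly files inside a union with the static file. The location of bathymetry_mask.nc, the dimension name "time", and the regular expression are assumptions; adjust them to the actual files. In a union, shared dimensions and coordinate variables come from the first dataset that defines them, so the horizontal grid is assumed to be identical in both sources.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation type="union">
    <!-- Static fields (bathymetry, sea land mask); this path is an assumption -->
    <netcdf location="/home/opendap/datasets/mercator/bathymetry_mask.nc"/>
    <!-- Yearly files joined along their existing time dimension;
         the dimension name "time" is an assumption -->
    <netcdf>
      <aggregation type="joinExisting" dimName="time">
        <!-- regExp is matched against the file path, so the static
             file above is not picked up by this scan -->
        <scan location="/home/opendap/datasets/mercator/"
              regExp=".*mercator_[0-9]{4}\.nc$"
              subdirs="true"/>
      </aggregation>
    </netcdf>
  </aggregation>
</netcdf>
```

With this layout the static variables live in a single file and nothing has to be appended to the yearly files, at the cost of giving up the FMRC-specific dataset views.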

Regards

Antonio

[1] https://www.unidata.ucar.edu/software/thredds/v4.6/tds/reference/collections/FeatureCollections.html#elements
