Expand attributes in metrics output json file: creation date, tracking id and climfilename.nc #50

Closed
gleckler1 opened this issue Aug 4, 2014 · 16 comments · Fixed by #119

Comments

@gleckler1
Contributor

For each variable/json file ...

Add the following keys to both models (under "SimulationDescription") and observations (these can be obtained from the netCDF files):

  1. tracking_id
  2. creation_date
  3. the name of the climatology .nc file
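
A minimal sketch of how these three keys could be collected, assuming the global attributes of the climatology file have already been read into a plain dict by whichever netCDF library is in use. The function name `provenance_keys` and the output key `climatology_filename` are hypothetical; `tracking_id` and `creation_date` are the names listed above, and the "N/A" fallback is an assumption.

```python
import os


def provenance_keys(global_attrs, nc_path):
    """Collect the requested provenance keys for one climatology file.

    global_attrs: dict of netCDF global attributes (assumed already read).
    nc_path: path to the climatology .nc file.
    """
    return {
        # Assumption: a missing attribute is recorded as "N/A".
        "tracking_id": global_attrs.get("tracking_id", "N/A"),
        "creation_date": global_attrs.get("creation_date", "N/A"),
        # Hypothetical key name for the climatology file name.
        "climatology_filename": os.path.basename(nc_path),
    }
```

The returned dict could then be merged into "SimulationDescription" for a model, or into the matching entry for an observation.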
@doutriaux1
Contributor

@gleckler1 you mean the name of the climatology file used to compute the metrics?

@gleckler1
Contributor Author

Yes, the name of the climatology used to compute the metrics!

@durack1
Collaborator

durack1 commented Aug 13, 2014

so NOAA-OISST-v2 (tos) or GPCP (pr) etc etc

@durack1
Collaborator

durack1 commented Sep 12, 2014

@gleckler1 if you can clearly define what needs to be done here, I can implement the changes and close this up. It's not dependent on @doutriaux1 or UV-CDAT changes.

The md5sums should also be added to an updated observation json dictionary.

@doutriaux1
Contributor

@durack1, look at src/python/pcmdi/scripts: there is a script in there, written by @gleckler1, that generates the obs dict. This is where it needs to sit eventually.

@durack1
Collaborator

durack1 commented Sep 16, 2014

@durack1
Collaborator

durack1 commented Sep 25, 2014

@gleckler1 is this done? Can the issue be closed?

@durack1
Collaborator

durack1 commented Sep 25, 2014

The 24th Sept 2014 email to GFDL listed the following improvement:
Fine tune the metadata in json files and file versions (json example attached). We may seek feedback from the WIP but would welcome your input.

@gleckler1
Contributor Author

Can we make this one a high priority? I've looked at it, but right now this one is too hard for me... the hope is that each metrics json file will include the following for each model:

  1. The filename.nc of the model climatology
  2. md5sum of the model climatology
  3. creation_date
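
As a rough illustration of the md5sum item, the digest of a climatology file can be computed with Python's standard `hashlib` module. This is only a sketch: the function name `md5sum` and the 1 MiB chunk size are assumptions, not something taken from the metrics driver.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for large .nc files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result should match what the `md5sum` command line tool reports for the same file.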

@durack1
Collaborator

durack1 commented Oct 17, 2014

@gleckler1 so to clarify, you're asking that the *.json results in CMIP_results/CMIP5/historical also contain the 3 attributes above for the files that are interrogated to create them? Which file that you run (and that is in the repo) creates these jsons?

@gleckler1
Contributor Author

Don't quite follow you, but I think yes... the metrics json files should include these 3 attributes for each model/climatology, e.g.
{
"ACCESS1-0": {
"SimulationDescription": {
"Center": "N/A",
"Experiment": "historical",
!!!! "ClimatologyFilename": filename.nc,
!!!! "Md5sum of climatology file": MD5results,
!!!! "creation_date": creation_date,
"Login": "N/A",
"MIPTable": "Amon",
"Model": "ACCESS1-0",
"ModelActivity": "CMIP5",
"ModelFreeSpace": "N/A",
"ModellingGroup": "CSIRO-BOM",
"SimName": "r1i1p1",
"SimTrackingDate": "2012-01-15T10:45:49Z"
},
Actually, I think SimTrackingDate == creation_date, so there are only 2 new items here. Both involve operating on the file object on the fly, though, and I don't understand the object in the driver well enough to do this myself.

@durack1
Collaborator

durack1 commented Oct 17, 2014

@gleckler1 are you talking about the files that we provide (so those in CMIP_results), or the output files that are created by the GFDL, NCAR or whoever folks?

It only makes sense to me for us to do this, as we have the CF-compliant CMIP5 data that we're using to create the CMIP3/5 metrics benchmarks stored in CMIP_results.

@durack1
Collaborator

durack1 commented Oct 17, 2014

If it's the first one (files that @gleckler1 creates), where is that script? Is it in the repo?

@doutriaux1
Contributor

creation date is simtrackingdate; there's an alias for that. We can take the alias out. It was originally designed after Jerome's recommendation, as it was deemed ok for a first pass.

@doutriaux1 doutriaux1 self-assigned this Oct 17, 2014
@durack1
Collaborator

durack1 commented Oct 18, 2014

@gleckler1 the md5sum of the climatology file is kinda arbitrary, as we are creating this climatology from the CMIP5 data that we have stored locally. Would it be more useful to include more specific information obtained from the source files from which the climatology has been generated?

We could include both pieces of information: source files (filename, creation_date, tracking_id) and also info about the derived climatology file (filenameandpath, clim_creation_date, which will depend on who created the file and whether they wrote suitable attributes)
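
One possible shape for such a combined record, as a sketch only. The field names inside each entry are the ones listed above; the function name `provenance_record`, the top-level keys `source_files` and `climatology`, and the "N/A" fallback are hypothetical.

```python
def provenance_record(source_files, clim_path, clim_creation_date=None):
    """Combine source-file and derived-climatology information.

    source_files: list of dicts with filename, creation_date, tracking_id.
    clim_creation_date: None when the climatology file has no such attribute.
    """
    return {
        # Hypothetical top-level key for the CMIP5 source files.
        "source_files": [
            {key: entry.get(key, "N/A")
             for key in ("filename", "creation_date", "tracking_id")}
            for entry in source_files
        ],
        # Hypothetical top-level key for the derived climatology file.
        "climatology": {
            "filenameandpath": clim_path,
            "clim_creation_date": clim_creation_date or "N/A",
        },
    }
```

Keeping the two groups separate avoids confusing the creation date of the CMIP5 source data with that of the locally derived climatology.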

@durack1 durack1 added this to the 1.0 - initial release milestone Oct 18, 2014
@gleckler1
Contributor Author

UPDATE
All that is needed now are attributes for the model climatology filename.nc and the md5sum of that climatology:
climatology_md5sum
climatology_filename

I am taking care of all other attributes (CMIP5_creation_date and CMIP5_tracking_id)
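
A hedged sketch of how the two remaining attributes might be attached to a model entry, assuming the metrics json has the model / "SimulationDescription" layout shown in the example earlier in this thread. The function name `add_climatology_attributes` is hypothetical, and the filename and md5sum values are assumed to be computed elsewhere.

```python
import json


def add_climatology_attributes(metrics, model, filename, md5sum):
    """Add the two climatology attributes to one model's SimulationDescription.

    metrics: dict loaded from a metrics json file; modified in place.
    """
    description = metrics.setdefault(model, {}).setdefault(
        "SimulationDescription", {})
    description["climatology_filename"] = filename
    description["climatology_md5sum"] = md5sum
    return metrics


if __name__ == "__main__":
    # Placeholder values only; real values would come from the driver.
    demo = {"ACCESS1-0": {"SimulationDescription": {"Model": "ACCESS1-0"}}}
    add_climatology_attributes(demo, "ACCESS1-0", "filename.nc", "MD5results")
    print(json.dumps(demo, indent=4, sort_keys=True))
```

Existing keys in "SimulationDescription" are left untouched; only the two new keys are written.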
