Open multiple files using xcdat #926

Merged: 6 commits, Apr 27, 2023

lee1043
Contributor

@lee1043 lee1043 commented Apr 26, 2023

  • Update xcdat version
  • Use xcdat's open_mfdataset instead of the xcdat_open function that was defined locally in PMP
    • This update allows wildcards ('*' or '?') in the string that gives the path to the input files.
    • E.g., modpath="/MY/FILES/*.nc"
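
To illustrate the wildcard behaviour described above, here is a minimal sketch that uses only Python's standard glob module. It does not call xcdat. The helper name list_matches and the file names are hypothetical, and the sketch only shows which files a '*' or '?' pattern would select, assuming the pattern is expanded with glob-style matching before the files are opened.

```python
import glob
import os
import tempfile


def list_matches(pattern):
    """Return the sorted file paths that a wildcard pattern expands to."""
    return sorted(glob.glob(pattern))


# Build a throwaway directory with hypothetical file names.
demo_dir = tempfile.mkdtemp()
for name in ("tas_1999.nc", "tas_2000.nc", "tas_2001.nc", "tas_2001.txt"):
    open(os.path.join(demo_dir, name), "w").close()

# '*' matches any run of characters, so every .nc file is selected.
star_matches = [
    os.path.basename(p) for p in list_matches(os.path.join(demo_dir, "*.nc"))
]

# '?' matches exactly one character, so only the 200x files are selected.
qmark_matches = [
    os.path.basename(p)
    for p in list_matches(os.path.join(demo_dir, "tas_200?.nc"))
]

print(star_matches)   # ['tas_1999.nc', 'tas_2000.nc', 'tas_2001.nc']
print(qmark_matches)  # ['tas_2000.nc', 'tas_2001.nc']
```

A pattern that matches nothing expands to an empty list, which is worth checking before the path is passed on to open_mfdataset.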

@lee1043 lee1043 added this to the 3.1 milestone Apr 26, 2023
@lee1043 lee1043 self-assigned this Apr 26, 2023
@lee1043 lee1043 linked an issue Apr 26, 2023 that may be closed by this pull request
@acordonez
Collaborator

I'll give this a test run today

@lee1043
Contributor Author

lee1043 commented Apr 26, 2023

@tomvothecoder As discussed at the xcdat meeting, the CI/CD build test is failing, and the log complains that it cannot find xcdat 0.5.0. Could you please take a look?

Set up Conda Environment

Run conda-incubator/setup-miniconda@v2
Gathering Inputs...
Creating bootstrap condarc file in /home/runner/.condarc...
Ensuring installer...
Setup environment variables...
Parsing environment...
Configuring conda package cache...
Applying initial configuration...
Initializing conda shell integration...
Adding tools to 'base' env...
Ensuring environment...
Updating 'pcmdi_metrics_dev' env from conda env update...
  /usr/share/miniconda/condabin/conda env update --name pcmdi_metrics_dev --file conda-env/dev.yml
  Collecting package metadata (repodata.json): ...working... done
  Solving environment: ...working... failed
  Warning: 
  ResolvePackageNotFound: 
    - xcdat=0.5.0
  
  
  
  ResolvePackageNotFound: 
    - xcdat=0.5.0
  

Error: The process '/usr/share/miniconda/condabin/conda' failed with exit code 1

- Add step to install local build of pcmdi_metrics in GH Actions conda env
@tomvothecoder
Collaborator

Hi @lee1043, I'm looking at this issue right now. Please ignore the many commits I'm pushing while trying to fix it.

I think the root cause might be related to the GH Actions conda environment using a cached version of the packages.

- Update build workflow dependencies
- Update build workflow conda caching mechanism based on date and environment file
@tomvothecoder tomvothecoder force-pushed the 924_lee1043_xcdat_open_multiple_files branch from e5ad015 to b86fc64 Compare April 26, 2023 22:15
@acordonez
Collaborator

@lee1043 This fails for me using the demo dataset, where there is a single input file per model. Here's the error text; I can send the full log if that's useful:

--- prepare mean climate metrics calculation ---
--- start mean climate metrics calculation ---
varname: rlut
level: None
reference_data_set (all):  ['alternate1', 'default']
ref: alternate1
ref_data_full_path: demo_data/pmpobs_v1.0/rlut/CERES-EBAF-4-0/v20210804/rlut_mon_CERES-EBAF-4-0_PCMDI_gn.200301-201812.AC.v20210804.nc
Traceback (most recent call last):
  File "/home/ordonez4/miniconda3/envs/pcmdi_metrics_dev/bin/mean_climate_driver.py", line 183, in <module>
    ds_ref = load_and_regrid(data_path=ref_data_full_path, varname=varname, level=level, t_grid=t_grid, decode_times=False, regrid_tool=regrid_tool, debug=debug)
  File "/home/ordonez4/miniconda3/envs/pcmdi_metrics_dev/lib/python3.9/site-packages/pcmdi_metrics/mean_climate/lib/load_and_regrid.py", line 25, in load_and_regrid
    ds = xc.open_mfdataset(data_path, data_var=varname_in_file, decode_times=decode_times)  # NOTE: decode_times=False will be removed once obs4MIP written using xcdat
  File "/home/ordonez4/miniconda3/envs/pcmdi_metrics_dev/lib/python3.9/site-packages/xcdat/dataset.py", line 205, in open_mfdataset
    ds = xr.open_mfdataset(
  File "/home/ordonez4/miniconda3/envs/pcmdi_metrics_dev/lib/python3.9/site-packages/xarray/backends/api.py", line 947, in open_mfdataset
    raise OSError("no files to open")
OSError: no files to open

Collaborator

@tomvothecoder tomvothecoder left a comment

The GH Actions build is now passing. I updated the CI/CD conda environment to correctly update the cache instead of referencing old package versions.

Summary of Changes:

build_workflow.yml

  • Bumped dependency versions
  • Updated conda caching to cache the entire environment and refresh the cache every 24 hours
  • Updated the conda environment build step to use mamba for speed
  • Added an "Install pcmdi_metrics" step that installs the local branch version of the package to run tests against

dev.yml

  • Bumped dependency versions; these should be updated periodically (recommended before a new software version release)
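
The caching change in the summary above (a cache keyed on the date and the environment file) can be sketched in Python. This is only an illustration of the idea, not the actual workflow configuration: the function name conda_cache_key, the key layout, and the prefix are all assumptions.

```python
import hashlib
from datetime import datetime, timezone


def conda_cache_key(env_file_text, now=None, prefix="conda-env"):
    """Build a cache key that changes daily and whenever the env file changes.

    Hypothetical helper: the key format is an assumption for illustration.
    """
    now = now or datetime.now(timezone.utc)
    # Hash the environment file contents so any dependency edit yields a new key.
    digest = hashlib.sha256(env_file_text.encode("utf-8")).hexdigest()[:12]
    # Include the UTC date so the cache is rebuilt at least once per day.
    return f"{prefix}-{now:%Y-%m-%d}-{digest}"


# Example: the same file on two different days produces two different keys.
day_one = datetime(2023, 4, 26, tzinfo=timezone.utc)
day_two = datetime(2023, 4, 27, tzinfo=timezone.utc)
print(conda_cache_key("xcdat=0.5.0\n", now=day_one))
print(conda_cache_key("xcdat=0.5.0\n", now=day_two))
```

With a key like this, a stale cache cannot keep serving old package versions after the environment file is edited, and it expires on its own after a day even if the file is unchanged.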

@lee1043
Contributor Author

lee1043 commented Apr 27, 2023

@acordonez thank you for checking the demo. Can you please confirm that you have the input file demo_data/pmpobs_v1.0/rlut/CERES-EBAF-4-0/v20210804/rlut_mon_CERES-EBAF-4-0_PCMDI_gn.200301-201812.AC.v20210804.nc? If not, you will need to run demo 0 to obtain the data. May I have the full log as well?
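
The "no files to open" error only says that nothing matched the path. A pre-flight check along the following lines could make a missing input file easier to spot. This is a hedged sketch using the standard library only; check_input_files is a hypothetical helper, not part of PMP or xcdat, and the message wording is an assumption.

```python
import glob


def check_input_files(data_path):
    """Return the files matching data_path, or raise with a hint if none exist.

    Hypothetical helper: works for plain paths and for wildcard patterns.
    """
    matches = sorted(glob.glob(data_path))
    if not matches:
        raise FileNotFoundError(
            f"No input files match {data_path!r}. "
            "Check the path, or download the demo data first."
        )
    return matches
```

Calling it on the reference data path before the dataset is opened would report the exact path that is missing, rather than failing inside the file-opening call.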

@lee1043
Contributor Author

lee1043 commented Apr 27, 2023

@tomvothecoder Thank you very much for getting the build test working! That is a huge help!

@acordonez
Collaborator

@lee1043 you're right: once I updated the obs demo data, this works. I don't have any xml clims to test, but I can also run that if there is some data you can point me to.

@lee1043
Contributor Author

lee1043 commented Apr 27, 2023

@acordonez thank you for checking. I think it is okay to merge. Merging this now.

@lee1043 lee1043 merged commit cb6f773 into main Apr 27, 2023
@lee1043 lee1043 deleted the 924_lee1043_xcdat_open_multiple_files branch April 27, 2023 22:33
Successfully merging this pull request may close these issues.

computing climatologies spanning multiple files