Advanced Topics for developers

Denise Worthen edited this page Jan 19, 2021 · 3 revisions

Debug mode

To compile the coupled model in debug mode, set DEBUG=Y in the compile options. See tests/tests/rt.conf for the command syntax. For example:

CCPP=Y DEBUG=Y SUITES=FV3_GFS_2017_coupled,FV3_GFS_2017_satmedmf_coupled,FV3_GFS_v15p2_coupled S2S=Y

The DEBUG=Y flag selects both the ESMF debug library and the component-level debug flags. The debug versions of the Intel MPI library are included by adding -link_mpi=dbg (or -link_mpi=dbg_mt for multi-threaded applications) to the debug settings for the coupled model. This allows the compiler wrappers to use the debug versions of the MPI library.

When running in debug mode, the wall clock limit needs to be increased because the model runs more slowly. For example, to run the C96mx100 (C96 UFSAtm, 1 deg MOM6-CICE6) 6hr debug case on Hera, the wall clock time should be set to 1 hour.

Changing the number of PEs for FV3

Changes are required in both model_configure:

TASKS: total number of all tasks for all components
quilting: true/false variable to use writer cores for FV3GFS
write_groups: the number of write groups for FV3GFS
write_tasks_per_group: the number of tasks per FV3GFS write group, a multiple of ntiles

and in input.nml:

layout: INPES, JNPES, the layout of PETs on each tile in the x & y directions
ntiles: the number of tiles, typically 6

The number of FV3 tasks is then given by:

(INPES x JNPES x ntiles) + (write_groups x write_tasks_per_group)

The PET layout for each component then needs to be adjusted so that it is consistent with TASKS. For the coupled model, the mediator is given the number of FV3 tasks, not including the write tasks. For example, if INPES x JNPES x ntiles = 3x8x6 = 144, then the mediator is given 144 tasks and FV3 is given 144 plus the number of write tasks.
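The task arithmetic above can be checked with a short script. This is a minimal sketch, not part of the model: the function name `fv3_task_counts` is hypothetical, the write-group values in the usage line are illustrative, and treating `quilting = false` as "no write tasks" is an assumption based on the description of that variable.

```python
def fv3_task_counts(inpes, jnpes, ntiles=6, quilting=True,
                    write_groups=1, write_tasks_per_group=6):
    """Return (compute_tasks, write_tasks, total_fv3_tasks).

    compute_tasks is also the number of tasks given to the mediator
    in the coupled model, since the mediator excludes the write tasks.
    """
    compute = inpes * jnpes * ntiles
    if quilting:
        # write_tasks_per_group is described as a multiple of ntiles
        if write_tasks_per_group % ntiles != 0:
            raise ValueError("write_tasks_per_group must be a multiple of ntiles")
        write = write_groups * write_tasks_per_group
    else:
        # Assumption: without quilting, no writer cores are requested
        write = 0
    return compute, write, compute + write


# The 3x8x6 layout from the text, with one illustrative write group of 6 tasks
compute, write, total = fv3_task_counts(3, 8, write_groups=1,
                                        write_tasks_per_group=6)
print(f"mediator/compute tasks: {compute}, write tasks: {write}, FV3 total: {total}")
```

For the layout in the text this gives 144 compute (and mediator) tasks; with the illustrative write group the FV3 total is 150. TASKS must still account for every other component as well.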

Changing the number of PEs for CICE

In CICE6, the PE tasks can be set at run time. Dave Bailey at NCAR has provided the following useful information.

The main settings in the domain_nml used to set the run-time resources are:

 block_size_x (number of grid cells per block in the x-direction)
 block_size_y (number of grid cells per block in the y-direction)
 max_blocks (number of blocks per processor maximum)
 distribution_type (how the blocks are laid out on the processors)
 processor_shape (what the approximate shape of each block looks like)

While CICE does not necessarily need to have the same number of blocks as processors, this is usually a good rule of thumb. When you get up
into higher processor counts, the 'spacecurve' or 'sectrobin' distribution_type with smaller square shaped blocks can be used. Also, as you
go up in processors it might also be useful to have OpenMP threading. In this case you would have more than one block per processor and
max_blocks would need to be increased.

More information about decomposition choices and performance can be found in the CICE documentation.

For the regression tests, the following settings in domain_nml are used:

processor_shape   = 'slenderX2'
nprocs            = NPROC_ICE
nx_global         = NX_GLB
ny_global         = NY_GLB
block_size_x      = BLCKX
block_size_y      = BLCKY
max_blocks        = -1

where, for NPROC_ICE (the number of TASKS assigned to ice),

BLCKX=NX_GLB/(NPROC_ICE/2)
BLCKY=NY_GLB/2

and NX_GLB and NY_GLB are the domain size in the x and y directions.
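As a quick check of the block-size formulas, the sketch below computes BLCKX and BLCKY for a given ice task count. The function name `cice_block_sizes` is hypothetical, the 360 x 320 grid and 24 ice tasks are illustrative values only, and the insistence on exact division is an assumption made here to keep the example unambiguous.

```python
def cice_block_sizes(nx_glb, ny_glb, nproc_ice):
    """Return (BLCKX, BLCKY) following the regression-test formulas:
    BLCKX = NX_GLB / (NPROC_ICE / 2), BLCKY = NY_GLB / 2.
    """
    if nproc_ice % 2 != 0:
        raise ValueError("NPROC_ICE must be even for a two-row decomposition")
    blocks_in_x = nproc_ice // 2
    # Assumption for this sketch: the grid divides evenly into blocks
    if nx_glb % blocks_in_x != 0 or ny_glb % 2 != 0:
        raise ValueError("grid does not divide evenly into blocks")
    return nx_glb // blocks_in_x, ny_glb // 2


# Illustrative values: a 360 x 320 grid decomposed over 24 ice tasks
blckx, blcky = cice_block_sizes(360, 320, 24)
print(f"block_size_x = {blckx}, block_size_y = {blcky}")
```

With these values each of the 24 tasks owns one 30 x 160 block, arranged as 12 blocks in x by 2 blocks in y.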

Using the CMEPS mediator to understand the coupling fields (under construction)

The CMEPS mediator can write out all the coupling fields between the various components at any given point in the run sequence. Two changes to the standard coupled model run sequence are required to use this feature. First, the MED med_phases_history_write phase should be added at the desired point in the run sequence. Second, the MED_attributes need to have the history writes enabled with the settings:

history_n = 1
history_option = nsteps

When the model is executed, a series of coupler history files will be written to the RESTART directory (the ability to write to the run directory will be implemented in a future update). Within this history file, each coupled field which the mediator has at that exact point in the run sequence is written. For example, the field atmImp_field_name is the mediator field imported from the ATM while atmExp_field_name is the field that the mediator will export to the ATM.
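When scanning the variable names in a coupler history file, the naming convention described above can be decoded mechanically. The following is a minimal sketch: the helper `split_history_name` is hypothetical, and only the `atmImp_` and `atmExp_` prefixes are confirmed by the text; the assumption is that other components follow the same `<comp>Imp_` / `<comp>Exp_` pattern.

```python
import re

# <component><Imp|Exp>_<field name>, e.g. atmImp_field_name
_FIELD_RE = re.compile(r"^(?P<comp>[A-Za-z]+?)(?P<direction>Imp|Exp)_(?P<field>.+)$")


def split_history_name(name):
    """Split a mediator history variable name into
    (component, direction, field), or return None if it does not match."""
    match = _FIELD_RE.match(name)
    if match is None:
        return None
    direction = "imported from" if match["direction"] == "Imp" else "exported to"
    return match["comp"].upper(), direction, match["field"]


for name in ["atmImp_field_name", "atmExp_field_name"]:
    comp, direction, field = split_history_name(name)
    print(f"{name}: '{field}' is {direction} {comp}")
```

Names that do not follow the pattern, such as coordinate or time variables, return None and can be skipped.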

While CMEPS uses a mesh internally, in the history files the fields to and from the OCN and ICE components are written on their native model grids (this is the purpose of the cpl_scalars). However, since for UFSAtm the tiled fields cannot be represented in a single array, they are written in the history files on the mesh (1-D) grid.

Profiling Timing Across Components

To check run times of different components for load balancing, the following two environment variables must be set:

export ESMF_RUNTIME_PROFILE=ON
export ESMF_RUNTIME_PROFILE_OUTPUT=SUMMARY

For the coupled model, the environment variables should be added to the file tests/fv3_conf/fv3_slurm.IN_<platform> where platform is Hera, Orion, etc. This will produce the ESMF_Profile.summary in the run directory which will give you timing information for the run. See the ESMF Reference Manual for more details.

The ESMF_Profile.summary can also include MPI functions to indicate how much time is spent inside communication calls. To use this feature, modify the file tests/fv3_conf/fv3_slurm.IN_<platform> to set the environment variable for the location of the MPI profiling preload script, and include the script in the srun command as shown below.

# set location of mpi profiling preload script
export ESMF_PRELOAD=${ESMFMKFILE/esmf.mk/preload.sh}

# include preload script before forecast executable in srun command
srun --label -n @[TASKS] $ESMF_PRELOAD ./fcst.exe
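The `${ESMFMKFILE/esmf.mk/preload.sh}` expression is a shell pattern substitution: it replaces the first occurrence of `esmf.mk` in the value of ESMFMKFILE with `preload.sh`, so the preload script is looked up in the same directory as esmf.mk. The sketch below reproduces that substitution in Python for illustration only; the function name and the path are hypothetical.

```python
def preload_path(esmfmkfile):
    """Mimic the shell expansion ${ESMFMKFILE/esmf.mk/preload.sh}.

    The single-slash form of the shell substitution replaces only the
    first match, which str.replace reproduces with a count of 1.
    """
    return esmfmkfile.replace("esmf.mk", "preload.sh", 1)


# Hypothetical install location, used only to show the result
print(preload_path("/path/to/esmf/lib/esmf.mk"))
```

If ESMFMKFILE is unset in the job environment, the expansion yields an empty string and srun launches the executable without the preload script, so it is worth confirming the variable is defined by the loaded modules.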

See the ESMF Reference Manual for more details.

Note: the job must run to completion for the summary table to be written, so make sure the wall clock limit or run length allows for this.

Porting to a new Machine

Note: NCEPLIBS and third-party libraries need to be installed on the new platform.

Coupled model

An example can be seen [here] for machine stampede.intel.

The following files need to be added for each machine_name and compiler option:

modulefiles/<machine_name>.<compiler>/fv3
conf/configure.fv3.<machine_name>.<compiler>

To compile, cd into the tests directory, then run:

./compile.sh ../FV3 stampede.intel 'MAKE_OPT'

See this page for details.

TODO: Add information about how this works with CMake