CIME Git Workflow

Jim Edwards edited this page Feb 27, 2024 · 6 revisions

Introduction

This document describes the Git and GitHub workflow for developers of the CESM fork of the Common Infrastructure for Modeling the Earth (CIME). As a basis for common understanding of the CESM CIME workflow, we recommend that all developers start by reading, at a minimum, chapters 1-3 of the online Pro Git book. The first two sections of chapter 6 may also help you get acquainted with GitHub.

Once you are familiar with the basic concepts of Git, these commands provide a quick reference.

$ git help

$ git help <COMMAND>

This workflow assumes that you already have your own personal GitHub account, and that you are using Git 2.0 or later. To check the version of Git:

$ git --version

On some systems, you may need to load a module first. On Cheyenne, the command is

$ module load git

Figure 1 below graphically illustrates some of the transactions that take place when interacting with local and remote repositories. Terms referenced in the figure are also referred to throughout this document as follows:

  • upstream - ESMCI/CIME github or remote repository
  • origin - Personal fork of the upstream repository
  • local - The local repository on a particular machine

Figure 1


General Workflow

Configure Git (one time)

Git needs to be configured once locally on each machine for each developer. The global configuration file is stored in $HOME/.gitconfig.

Required settings:

$ git config --global user.name "Your Name"

$ git config --global user.email <your-email-address>

$ git config --global push.default simple

The “push.default simple” configuration ensures that only the currently checked out local branch will be pushed to the remote repository (i.e. your fork on GitHub). This setting is unnecessary for Git 2.0 or later, but it helps when using a slightly older version. If there is a problem with this setting, it may be because your version of Git is very out-of-date, and you need to find or acquire a new version on your system.

Convenient (but not required) settings:

$ git config --global color.ui true # note - true is the default

$ git config --global core.editor [editor of your choice: emacs, vi, vim, etc]

$ git config --global diff.algorithm histogram

$ git config --global merge.ff false

$ git config --global pull.ff only

Note that if you use typical *nix environment variables to set an editor (e.g. using $EDITOR), Git will pick that up automatically even if you don’t add a setting to .gitconfig.

The diff.algorithm=histogram setting often produces more readable patches than Git's default diff algorithm.

The merge.ff and pull.ff settings are mainly important if you are integrating changes back to master. Setting merge.ff=false is equivalent to specifying --no-ff when you do a merge; this is important when merging to master in order to maintain a clean history using 'git log --first-parent'. Setting pull.ff=only prevents you from using 'git pull' if your local branch has evolved. If you truly want to merge the changes from the remote with those on your local branch, then you will need to do so with git fetch + git merge (which will generate a merge commit). But, in order to maintain a clean history using 'git log --first-parent', we never want merge commits generated when updating your local copy of master. Setting pull.ff=only ensures that this will be true, as long as you only try to update your local copy of master using 'git pull'.
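The effect of --no-ff on 'git log --first-parent' can be seen in a throwaway repository. The following is a minimal sketch, not part of the CIME tooling: the temporary directory, the branch name "topic", the file name, and the commit messages are all made up for illustration, and -m is used only so the demo runs without opening an editor.

```shell
set -eu
# Scratch repository; every name here is hypothetical.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # use the branch name assumed in this document
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit on master"

# Two commits on a topic branch
git checkout -q -b topic
echo one >> file.txt
git commit -q -am "Topic change 1"
echo two >> file.txt
git commit -q -am "Topic change 2"

# Merge with --no-ff (the behavior that merge.ff=false makes the default)
git checkout -q master
git merge -q --no-ff -m "Merge topic into master" topic

# The full history has every commit; the first-parent history
# has only the commits made directly on master plus the merge.
full_count=$(git rev-list --count HEAD)
first_parent_count=$(git rev-list --count --first-parent HEAD)
git log --oneline --first-parent
echo "all commits: $full_count, first-parent commits: $first_parent_count"

cd / && rm -rf "$repo"
```

Without --no-ff, the same merge would fast-forward, and the two topic commits would appear directly in the first-parent history of master.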

Git needs to be set up once per clone to use any CIME-specific commit templates and hooks.

See also: Pro Git, section 1.6

Forking the CIME GitHub repository (usually just one time)

Fork a copy on GitHub from the upstream repo by going to

https://github.com/ESMCI/cime

and clicking on the “Fork” button in the upper right corner of the repository’s main page. Choose your personal GitHub user account when creating a new fork.

See also: Pro Git section 6.2

Creating a local CIME repository

Clone the remote fork to your local machine. The GitHub fork is your “origin” and the local repository is referred to as “local repo-name”.

$ git clone https://github.com/username/cime [local-repo-name]

$ cd [local-repo-name]

If you don’t specify a local-repo-name on the command line then the default directory created is “cime”. For this document, we assume that the “local” directory is called “cime”.

See also: Pro Git, section 2.1

Create a branch in your local CIME repository

First, query your local and remote repo for available branches.

To see what branches are available locally:

$ git branch --list

To see what tags are available:

$ git tag --list

To see all branches locally and remotely:

$ git branch --list --all

* master

remotes/origin/HEAD -> origin/master

remotes/origin/jedwards/mctupdate

remotes/origin/master

To create a branch locally:

All changes should be carried out on a branch. Changes include:

  • the addition of a new feature (subcomponent/feature)
  • fixing a bug (subcomponent/bug_fix)
  • documentation
  • new tests

To create a new branch from the master and check it out:

$ git checkout master

$ git branch new-branch

$ git checkout new-branch

-- or --

$ git checkout -b new-branch master

Tools for viewing the commit logs and project history:

If you have X11 installed and X11 forwarding set up for ssh, then you can launch the built-in Git GUI from the command line:

$ gitk

-- or --

$ git log --oneline --first-parent

To create a new branch starting from an existing tag or branch, and check it out:

$ git branch new_branch some-old-tag-or-branch

$ git checkout new_branch

-- or --

$ git checkout -b new_branch some-old-tag-or-branch

To switch to an existing branch:

$ git checkout new-branch

[Note: It is also possible to check out a tag this way, but this leaves you in a “detached HEAD” state, where changes you make and commit can be lost unless you make a new branch for them. Unless you are sure that you will make no changes in your working directory (e.g. because you just want to look at or archive the code without running it) you should make a branch instead of checking out the tag directly!]
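As an illustration of starting a branch from a tag rather than checking out the tag directly, here is a minimal sketch in a scratch repository. The tag name "demo-tag", the branch name "fix-from-tag", and the file are hypothetical, and -m is used only to keep the demo non-interactive.

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo v1 > notes.txt
git add notes.txt
git commit -q -m "First commit"
git tag demo-tag                      # hypothetical tag marking the first commit

echo v2 > notes.txt
git commit -q -am "Second commit"     # master moves on past the tag

# Create and check out a branch that starts at the tag.
# Argument order is: new branch name first, then the starting point.
git checkout -q -b fix-from-tag demo-tag

current_branch=$(git symbolic-ref --short HEAD)
branch_commit=$(git rev-parse HEAD)
tag_commit=$(git rev-parse 'demo-tag^{commit}')
file_content=$(cat notes.txt)
echo "on $current_branch at $branch_commit (tag is $tag_commit)"

cd / && rm -rf "$repo"
```

Because HEAD points at a branch here, new commits are recorded on "fix-from-tag" instead of being left unreachable as they could be in a detached HEAD state.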

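To delete an existing branch (switch to a different branch first, since Git will not delete the branch that is currently checked out):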

$ git checkout my-old-feature

$ git branch -d my-new-feature

See also: Pro Git, section 2.6 and Pro Git, section 3.1

Make a change and then commit that change locally

Git has a working copy (what is on the file system) and a staging area (what is actually going to be committed). They are not the same.

Here’s a workflow scenario for making changes.

First, you make changes to “filename1” and “filename2”. You check that these files are changed (and are the only changes):

$ git status

You want to stage the changes for the next commit:

$ git add filename1 filename2

$ git status

Actually, you’ve changed your mind, so you don't want the changes in “filename2” committed:

$ git reset filename2

$ git status

You decide to throw out these changes to “filename2” and restore the original version in your working copy:

$ git checkout -- filename2

You want to stage removal of “filename3” for this commit as well:

$ git rm filename3

To stage moving a working-copy file or directory to a new location:

$ git mv path/to/source destination/path

Here’s a prototype of a workflow for converting an SVN CESM external into the local Git repo: https://github.com/bandre-ucar/cime-dev-tools

Finally, you can commit the staging area:

$ git commit

Note: do not use git commit -m. Always allow git to open an editor and use the commit template.
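The staging scenario above can be replayed in a scratch repository to see what each command does to the staging area and the working copy. This is a sketch under stated assumptions: the repository is temporary, the file contents are invented, and -m appears only because the demo is non-interactive (in a real CIME clone, let Git open the editor so the commit template is used).

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo a > filename1
echo b > filename2
git add filename1 filename2
git commit -q -m "Add two files"   # -m only for this non-interactive demo

# Change both files and stage both changes
echo a2 >> filename1
echo b2 >> filename2
git add filename1 filename2

# Change of mind: unstage filename2 (its edit stays in the working copy)
git reset -q -- filename2
staged=$(git diff --staged --name-only)

# Throw away the working-copy edit to filename2 as well
git checkout -- filename2
unstaged=$(git diff --name-only)
file2_content=$(cat filename2)

echo "staged for commit: $staged"
echo "unstaged changes: ${unstaged:-none}"

cd / && rm -rf "$repo"
```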

Optional step - To tag the committed change to your local:

$ git tag [tagname]

Optional step - To create a release tag for CIME (release tags are required to be annotated):

$ git tag -a [tagname]

(Ben is checking on this to be sure that -a brings up what we need.)

To delete an existing tag:

$ git tag -d [tagname]

Local tag naming conventions can be whatever is most helpful to your development workflow. Production release tagging should follow the conventions laid out in CIME tag naming conventions.
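The difference between a plain local tag and an annotated tag can be checked with 'git cat-file -t', which reports the type of object a name points to. The sketch below uses a scratch repository with hypothetical tag names; -m is passed to 'git tag -a' only so that no editor opens during the demo.

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo x > file.txt
git add file.txt
git commit -q -m "Only commit"

git tag local-checkpoint                        # lightweight tag: just a name for a commit
git tag -a demo1.0.0 -m "Demo annotated tag"    # annotated tag: its own object with a message

light_type=$(git cat-file -t local-checkpoint)  # type of object behind the lightweight tag
annot_type=$(git cat-file -t demo1.0.0)         # type of object behind the annotated tag
echo "lightweight -> $light_type, annotated -> $annot_type"

# Delete the local-only tag again
git tag -d local-checkpoint >/dev/null
remaining=$(git tag --list)
echo "remaining tags: $remaining"

cd / && rm -rf "$repo"
```

An annotated tag records a tagger, date, and message, which is why it is the form suited to release tags.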

Determining what has changed using diff

There are 3 different types of “snapshot” in the local repo at any given moment:

  • the working copy
  • the staging area for the next commit
  • a commit such as the HEAD of a branch, a tag, or an arbitrary commit

To see the differences between any 2 of the 3 states, use one of the following commands. To see the difference between the working copy and staging area:

$ git diff

To see the difference between the staging area and the head of the branch:

$ git diff --staged

To see how the working copy differs from the head of the branch:

$ git diff HEAD

To see the difference between the working copy and a tag (or branch):

$ git diff cimex.y.z

To view a summary of the overall state of your local repo:

$ git status

See also: Pro Git, section 2.2
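To make the three snapshots concrete, the following sketch puts a different version of one file in each of them and runs the three diff commands. The repository and the file "data.txt" are made up for the demo.

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "version 1" > data.txt
git add data.txt
git commit -q -m "Commit version 1"   # HEAD now holds version 1

echo "version 2" > data.txt
git add data.txt                      # staging area holds version 2

echo "version 3" > data.txt           # working copy holds version 3

wc_vs_stage=$(git diff)               # working copy vs staging area
stage_vs_head=$(git diff --staged)    # staging area vs HEAD
wc_vs_head=$(git diff HEAD)           # working copy vs HEAD

printf '%s\n' "--- git diff ---" "$wc_vs_stage"
printf '%s\n' "--- git diff --staged ---" "$stage_vs_head"
printf '%s\n' "--- git diff HEAD ---" "$wc_vs_head"

cd / && rm -rf "$repo"
```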

Updating your local repo with the latest changes

To update your local repo with the latest changes from the CESM development repo, you should add a remote on the command line.

$ git remote add upstream https://github.com/ESMCI/cime

This adds “upstream” as an alias for the URL “https://github.com/ESMCI/cime”.

To get the latest data from the GitHub remote:

$ git fetch origin

--or--

$ git fetch upstream

If you clone a repository, the command automatically adds that remote repository under the name “origin”. So “git fetch origin” fetches any new work that has been pushed to that server since you cloned it (or last fetched from it).

It’s important to note that the git fetch command only pulls the data to your local repository – it doesn’t automatically merge it with any of your work, or modify what you’re currently working on. You have to merge it manually into your work when you’re ready.
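The point that a fetch does not touch your own branch can be checked offline. In this sketch a bare repository on disk stands in for the GitHub remote; the directory names "seed" and "mine" and the file "log.txt" are hypothetical, and the final fast-forward-only merge is just one way to bring the fetched commits in.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# A bare repository on disk plays the role of the upstream remote.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master

# Seed it with one commit from a helper repository.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo one > log.txt
git add log.txt
git commit -q -m "First upstream commit"
git remote add upstream "$work/upstream.git"
git push -q upstream master

# "Your" clone, with the remote named upstream.
cd "$work"
git clone -q -o upstream upstream.git mine

# Someone else adds a commit upstream.
cd "$work/seed"
echo two >> log.txt
git commit -q -am "Second upstream commit"
git push -q upstream master

# Fetch updates the remote-tracking branch only.
cd "$work/mine"
head_before=$(git rev-parse HEAD)
git fetch -q upstream
head_after_fetch=$(git rev-parse HEAD)
upstream_tip=$(git rev-parse upstream/master)

# Your branch moves only when you merge.
git merge -q --ff-only upstream/master
head_after_merge=$(git rev-parse HEAD)

echo "before fetch: $head_before"
echo "after fetch:  $head_after_fetch (upstream/master is $upstream_tip)"
echo "after merge:  $head_after_merge"

cd / && rm -rf "$work"
```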

See also: Pro Git section 2.5

Merge upstream changes into your local branch

Now that the local repo is updated with upstream information, you can merge those changes into a local branch.

$ git merge upstream/master

See Pro Git section 3.2 or the “git help merge” command for details regarding managing merging conflicts.

Push your branch back to the remote GitHub fork (“origin”)

When you push a branch back to origin, it updates the branch in your GitHub repo with all the commits you’ve made since you last pushed or pulled that branch. (For a new branch, this means all the commits made locally since you created the branch.)

To push a branch, simply use a command like this:

$ git push origin my-new-feature

You will be prompted for your GitHub username and password (for HTTPS remotes, GitHub requires a personal access token in place of the account password). You can check the GitHub website for your fork to make sure that it reflects the changes.

If you want to share a tag, you can use the same command:

$ git push origin feature-v01

If you are working on or testing the same changes on multiple machines, you may want to push a branch to your GitHub fork, then pull that same branch on a different machine and refine it there. Assuming that you’ve cloned your fork on another machine, you can fetch the branch you pushed to GitHub and make a local version of the branch like this:

$ git fetch origin

$ git branch my-new-feature origin/my-new-feature

See also: Pro Git section 3.5

Submit a pull request to merge your branch back into the upstream remote

From the GitHub web site, directly after a git push origin command, there are numerous options for submitting a pull request for your branch. You can always click on the “Pull Request” links and associated icons to issue a pull request.

See also: Pro Git section 6.2

To accept a pull request, merging changes into master

Who should do this merge?

This merge can be done by either the assignee (reviewer) of the pull request or by the original developer. In the latter case, the developer should wait for an "ok" from the reviewer.

How should this merge be done?

Option 1: From the github interface

You can perform the merge from the github interface simply by clicking on the "Merge pull request" button. This works as long as there aren't any merge conflicts (in which case the button will not be available).

Option 2: From the command line

You may want to perform the merge from the command line for a number of reasons:

  • You need to resolve conflicts
  • The reviewer wants to make some small changes before merging to master
  • You want to test the merged version before pushing it
  • You don't like the log message generated by github (which starts with "Merge pull request...")

There are a number of possible workflows for doing this merge. One that is straightforward and avoids creating extra branches in the history is the following; this assumes that you are merging changes from the branch NEWBRANCH on GITUSER's fork:

# Add developer's fork as a remote if you have not already done so
git remote add GITUSER https://github.com/GITUSER/cime.git

# Fetch all changes from GITUSER's remote
git fetch GITUSER

# Make sure your local version of master is identical to upstream.
# Note: The 'reset --hard' command is potentially destructive.
# However, it should be okay in this case since your local version of master
# should never be ahead of upstream/master.
# If you'd like to be sure of this, you can run 'git status' after checking out
# master, and before running 'git reset' - if this says that your branch is ahead
# of upstream/master, then you have made commits to your local master that you
# never pushed upstream.
# In this case, you need to fix this problem before continuing.
git fetch upstream
git checkout master
git reset --hard upstream/master

# Merge new changes into master
#
# It is VERY important that you use --no-ff here, 
# so that 'git log --first-parent' works as expected.
#
# git will open a commit message in your editor;
# you should fill this in with details as noted below
git merge --no-ff GITUSER/NEWBRANCH

# Push new master to upstream
git push upstream master

Occasionally, you may see a message like this:

error: failed to push some refs to 'git@github.com/CESM-Development/cime'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first merge the remote changes (e.g.,
hint: 'git pull') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details. 

This can happen when you "lose a race" as integrator: Changes have been made on the remote since the time when you updated your local copy of master. Do NOT follow the suggestion of doing a 'git pull' (or a 'git merge') into your local copy of master in this case! Doing a non-fast-forward merge to update your local copy of master results in a messy history that does not summarize nicely with 'git log --first-parent'. Instead, you should discard your changes and redo the merge:

git fetch upstream
git reset --hard upstream/master
git merge --no-ff GITUSER/NEWBRANCH
git push upstream master
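The "lost race" and its recovery can be rehearsed locally. The sketch below is an illustration under assumptions, not the CIME procedure itself: a bare repository on disk stands in for upstream, three scratch clones play the integrator, a second integrator ("rival"), and the contributor whose clone acts as GITUSER's fork, and -m replaces the editor so the script runs unattended.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master

# Helper for this demo only: clone upstream and set an identity.
new_clone() {
  git clone -q -o upstream "$work/upstream.git" "$work/$1"
  git -C "$work/$1" config user.name "Demo $1"
  git -C "$work/$1" config user.email "$1@example.com"
}

# Seed upstream with a base commit.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo seed"
git config user.email "seed@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Base commit"
git push -q "$work/upstream.git" master

new_clone integrator
new_clone rival
new_clone contributor

# The contributor prepares NEWBRANCH.
cd "$work/contributor"
git checkout -q -b NEWBRANCH
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# The integrator merges it locally...
cd "$work/integrator"
git remote add GITUSER "$work/contributor"
git fetch -q GITUSER
git merge -q --no-ff -m "Merge NEWBRANCH from GITUSER" GITUSER/NEWBRANCH

# ...but the rival pushes to upstream first.
cd "$work/rival"
echo other > other.txt
git add other.txt
git commit -q -m "Rival change"
git push -q upstream master

# The integrator's push is now rejected.
cd "$work/integrator"
if git push -q upstream master 2>/dev/null; then
  first_push=accepted
else
  first_push=rejected
fi

# Recovery: discard the local merge and redo it on the new upstream/master.
git fetch -q upstream
git reset -q --hard upstream/master
git merge -q --no-ff -m "Merge NEWBRANCH from GITUSER" GITUSER/NEWBRANCH
git push -q upstream master

first_parent_count=$(git --git-dir="$work/upstream.git" rev-list --count --first-parent master)
top_subject=$(git --git-dir="$work/upstream.git" log -1 --format=%s master)
git --git-dir="$work/upstream.git" log --oneline --first-parent master

cd / && rm -rf "$work"
```

The first-parent history of upstream master ends up as a straight line: the base commit, the rival's change, and a single merge commit for NEWBRANCH.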

What form should the merge commit message take?

Merge commit messages should follow the same template as regular commits (given above). Following this template is even more important for merge commits than for regular commits, because these will be the only commits visible with a summarized log ('git log --first-parent'). Thus, be sure to document what testing was done on this topic branch, along with a summary of changes being made with this merge. Note that, even if you have configured git so that it uses the commit template by default, it will not be used for merge commits (and of course would not be used when you do the merge via the web interface), so you will need to manually copy and paste the commit template into your log message, then edit it.

To create a new tag of master

There are several ways to create a new CIME tag, but the technique outlined below is preferred because it includes the most information in the repository itself (which is also visible on the website):

Make sure your local repository is up-to-date with the latest master from upstream (ESMCI/cime)

In local repo, make a new annotated tag [probably of the form cimeX.Y.Z]

$ git tag -a <tagname>

This opens the editor set as core.editor so you can write a brief tag message.

Push your new tag back to ESMCI/cime

$ git push upstream <tagname>

Update the plans page for the appropriate alpha tag [for consistency, copy and paste the tag message you wrote above into the Note field]

Update the test database plans pages

For CSEG (CESM Software Engineering Group) developers, the Test and Porting database (testdb) should be updated with the planning notes regarding any changes that will be (or have been) added to the CIME upstream master via pull requests, in order for those changes to be included in a CESM alpha, beta or release tag.

We will use the plans page to determine the order in which pull requests are honored. The person who completes and closes the pull request is also responsible for making the tag.

Contributors from outside of the CSEG group should enlist a CSEG sponsor for their contribution; this sponsor will be responsible for adding the planned contribution to the testdb.

Use the widget on the right side of the GitHub web interface to view the Subversion URL for the repository (in this case, “https://github.com/ESMCI/cime”). You have to add “tags/[tag-name]” to this URL to get a reference to this particular tag for use in the database. For example, the Subversion URL for CIME 1.0.0 is:

https://github.com/ESMCI/cime/tags/cime1.0.0

Do not use the URL from your browser's address bar; it is not the correct form for integration with Subversion, where CIME is treated as a GitHub external.


Using git subtrees

Managing Externals in CIME using git subtrees

There are currently a number of external repositories managed in CIME as git subtrees. These include MCT, PIO, and genf90. A subtree is part of the repository whereas a fork is a copy of the repository. When you fork off the CIME repo, you automatically get copies of the subtrees managed under the CIME repo.

This page is for integrators who will be bringing code or changes to code from an external repository into CIME, or contributing changes made in CIME back to the upstream external. There are two workflows related to creating the initial git subtree within CIME: inserting a new external into CIME as a subtree, and replacing original CIME code with an external subtree. There are two workflows for working with an established git subtree: merging in changes from an external into a CIME subtree, and pushing changes from a CIME subtree back to an external. Each workflow is outlined below, followed by sections showing details for various workflow step types. Read the conditions at the top of each workflow to be sure to choose the correct one.

Some Terminology

  • external-subdir - refers to the place in the CIME tree where the external code resides (or will reside).
  • external_name - is a name you give to the external so that you can refer to the external repository. Note that external_name can be a branch name, a commit, or a tag name.
  • external_url - is the URL for the external repository.
  • external_commit - refers to a branch name (<external_url>/<branch_name>), a commit, or a tag name from the external repository.
  • external_branchname - refers to the local branch created with git subtree split in preparation for merging with an upstream repository.
  • external_mergebranch - refers to the branch on the external used for merging changes. Depending on the rules for the external, this may be master or an integration branch.

Merge changes from external into CIME subtree (typical workflow)

This workflow may be used repeatedly anytime new work is to be moved from an external into a CIME subtree. Create and switch to a branch in which to conduct the work as described in General Workflow above.

Add the external git repo as a remote in your local repository copy:

$ git remote add <external_name> <external_url>
$ git fetch --no-tags <external_name>

Merge from external git repo to local CIME:

$ git subtree pull --squash --prefix=src/externals/<external_subdir> <external_name> <external_commit>

Commit the change:

$ git commit

Test and submit a pull request as described in the General Workflow and tag the commit.

Push changes from a CIME subtree upstream to an external repo

This workflow is to be used anytime changes to a CIME subtree should be pushed to the upstream external repository. For example, consider a bug fix made in the CIME MCT code. This workflow would allow contributing that fix upstream to MCT. Note that this workflow should not be used for the case where the CIME subdirectory was not brought into CIME as a subtree. That workflow is beyond the scope of this document.

If necessary, add remote for updated external:

$ git remote

If your remote is not listed, add it with:

$ git remote add <external_name> <external_url>
$ git fetch --no-tags <external_name>

Push to the upstream external repository:

$ git subtree push --prefix src/externals/<external_subdir> <external_name> <external_mergebranch>

Follow other procedures for updating the external. These rules may differ depending on the particular external workflow.


Advanced Topics

Insert new external subtree into CIME (one-time setup)

This workflow should only be used if CIME decides to adopt code from a new external. The resulting subtree will be a new subdirectory within CIME. This will typically be done once, to be followed by the 'Merge changes from external into CIME subtree' workflow above.

Create and switch to a branch in which to conduct the work.

Add the external git repo as a remote in your local repository copy:

$ git remote add <external_name> <external_url>
$ git fetch --no-tags <external_name>

Add code from external git repo to CIME:

$ git subtree add --squash --prefix=src/externals/<external_subdir> <external_name> <external_commit>

Check to make sure the subtree was added. If everything is successful, the subtree command will already have committed the change, so the following commit should report that there is nothing left to commit:

$ git commit

Test using stand-alone CIME driver tests. See the github CIME Development Guide wiki for details.

Submit a pull request and tag the commit.
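The insert workflow can be tried against a small local repository before touching a real external. This is a minimal sketch under several assumptions: 'git subtree' is a contrib command that some Git installations omit (the script skips the demo in that case), and the external repository, the remote name "demo_ext", the host repository "cime-demo", and all file names are made up.

```shell
set -eu
if [ ! -x "$(git --exec-path)/git-subtree" ]; then
  echo "git subtree is not installed here; skipping demo"
  subtree_result=skipped
else
  work=$(mktemp -d)
  cd "$work"

  # Hypothetical external repository with one commit.
  git init -q external
  cd external
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Demo User"
  git config user.email "demo@example.com"
  echo "external code" > lib.txt
  git add lib.txt
  git commit -q -m "External commit"

  # Hypothetical host repository standing in for a CIME clone.
  cd "$work"
  git init -q cime-demo
  cd cime-demo
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Demo User"
  git config user.email "demo@example.com"
  echo host > README
  git add README
  git commit -q -m "Host commit"
  before=$(git rev-parse HEAD)

  # Same sequence as the workflow: add remote, fetch, subtree add.
  git remote add demo_ext "$work/external"
  git fetch -q --no-tags demo_ext
  git subtree add --squash --prefix=src/externals/demo_ext demo_ext master >/dev/null 2>&1

  after=$(git rev-parse HEAD)
  pending=$(git status --porcelain)
  if [ "$before" != "$after" ] && [ -z "$pending" ] && [ -f src/externals/demo_ext/lib.txt ]; then
    subtree_result=committed
  else
    subtree_result=unexpected
  fi
  git log --oneline

  cd / && rm -rf "$work"
fi
echo "subtree add result: $subtree_result"
```

In this sketch HEAD moves and the working tree is clean right after 'git subtree add', which indicates that the command records its own commits.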

Replace original (unmodified) CIME code with external subtree (one-time setup)

This workflow should only be used if CIME code is being replaced with code from an external. This will typically be done once to be followed by the 'Push changes' workflow above although it may also be used to bring in major external changes (e.g., upgrade to PIO 2).

Verify that the external code in the CIME repository has not been modified since the initial commit:

$ cd <external-subdir>

$ git diff --name-only <tag_name>

If there are changes (non-blank output), follow the 'Merge changes from external into CIME subtree' workflow above (unless the changes are deemed unnecessary).

Create and switch to a branch in which to conduct the work.

Add external git repo as a remote in your local repository copy:

$ git remote add <external_name> <external_url>
$ git fetch --no-tags <external_name>

Example:

$ git remote add MCTorigin https://github.com/MCSclimate/MCT
$ git fetch --no-tags MCTorigin

Remove the code from your working copy:

$ git rm -r <external-subdir>
$ git commit -a -m 'Removed <external-subdir> in preparation to replace code as a subtree from <external_url>'

Follow the 'Insert new external subtree into CIME' workflow above.


FAQs

Many questions can be answered by using the following resources.

From the web:

  • Search the Pro Git on-line reference using the Search Entire Site search box.
  • Search the GitHub help site by clicking on the Help link in the top menu.
  • For CIME-specific help, see this document and the GitHub CIME wiki.

From the command-line:

$ git help

$ git help <COMMAND>

Beware: you may need to look in both places, as the command-line help may not be up-to-date with the web documentation and vice versa. For example, if you want to know how to sort the list of tags from the git command line:

$ git help tag

doesn’t include the --sort option but the web documentation does and states:

“--sort=<type> - Sort in a specific order. Supported type is "refname" (lexicographic order), "version:refname" or "v:refname" (tag names are treated as versions). The "version:refname" sort order can also be affected by the "versionsort.prereleaseSuffix" configuration variable. Prepend "-" to reverse sort order. When this option is not given, the sort order defaults to the value configured for the tag.sort variable if it exists, or lexicographic order otherwise. See git-config[1].”
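A short sketch of what the option does in practice, using a scratch repository and made-up tag names (the "demo" prefix is hypothetical, chosen to resemble the cimeX.Y.Z form):

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo x > file.txt
git add file.txt
git commit -q -m "Only commit"

for t in demo1.2.0 demo1.10.0 demo1.9.0; do
  git tag "$t"
done

# Default (lexicographic) order vs. version-aware order
lexicographic=$(git tag --list | paste -sd ' ' -)
by_version=$(git tag --list --sort=version:refname | paste -sd ' ' -)
# A leading "-" reverses the order, so the highest version comes first
newest=$(git tag --list --sort=-version:refname | head -n 1)

echo "lexicographic: $lexicographic"
echo "by version:    $by_version"
echo "newest:        $newest"

cd / && rm -rf "$repo"
```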
