[v.2.4.0] Release Tracker #128436

Closed
atalman opened this issue Jun 11, 2024 · 54 comments
Labels
oncall: releng In support of CI and Release Engineering release tracker Add this label to release tracker issues triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Milestone

Comments

@atalman
Contributor

atalman commented Jun 11, 2024

We cut a release branch for the 2.4.0 release.

Our plan from this point is roughly:

  • Phase 1 (until 7/1/24): work on finalizing the release branch
  • Phase 2 (after 7/1/24): perform extended integration/stability/performance testing based on Release Candidate builds.

This issue is for tracking cherry-picks to the release branch.

Cherry-Pick Criteria

Phase 1 (until 7/1/24):

Only low-risk changes may be cherry-picked from main:

  1. Fixes to regressions against the most recent minor release (e.g. 2.3.x for this release; see module: regression issue list)
  2. Critical fixes for: silent correctness, backwards compatibility, crashes, deadlocks, (large) memory leaks
  3. Critical fixes to new features introduced in the most recent minor release (e.g. 2.3.x for this release)
  4. Test/CI fixes
  5. Documentation improvements
  6. Compilation fixes or ifdefs required for different versions of the compilers or third-party libraries
  7. Release branch specific changes (e.g. change version identifiers)

Any other change requires special dispensation from the release managers (currently @atalman, @PaliC , @huydhn, @malfet). If this applies to your change please write "Special Dispensation" in the "Criteria Category:" template below and explain.

Phase 2 (after 7/1/24):

Note that changes here require us to rebuild a Release Candidate and restart extended testing (likely delaying the release). Therefore, the only accepted changes are Release-blocking critical fixes for: silent correctness, backwards compatibility, crashes, deadlocks, (large) memory leaks

Changes will likely require a discussion with the larger release team over VC or Slack.
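
The two-phase criteria above can be read as a small decision table. The sketch below is only an illustration of that reading: the short category keys and the `triage` helper are hypothetical names invented here, not part of any PyTorch release tooling, and the final decision always rests with the release managers.

```python
from enum import Enum


class Phase(Enum):
    PHASE_1 = 1  # until 7/1/24: finalizing the release branch
    PHASE_2 = 2  # after 7/1/24: Release Candidate testing


# Hypothetical short keys for the seven Phase 1 categories listed above.
PHASE_1_CATEGORIES = {
    "regression-fix",
    "critical-fix",
    "new-feature-fix",
    "test-ci-fix",
    "documentation",
    "compilation-fix",
    "release-branch-only",
}

# Phase 2 accepts only release-blocking critical fixes.
PHASE_2_CATEGORIES = {"critical-fix"}


def triage(category: str, phase: Phase) -> str:
    """Return a rough first-pass status for a cherry-pick request."""
    if phase is Phase.PHASE_1:
        if category in PHASE_1_CATEGORIES:
            return "eligible"
        # Anything else must be requested as "Special Dispensation".
        return "needs special dispensation"
    if category in PHASE_2_CATEGORIES:
        # Still likely requires discussion with the larger release team.
        return "eligible"
    return "not accepted"


print(triage("documentation", Phase.PHASE_1))
print(triage("documentation", Phase.PHASE_2))
```

The same category can therefore be fine in Phase 1 and rejected in Phase 2, which is why the date of the request matters as much as its content.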

Cherry-Pick Process

  1. Ensure your PR has landed in main. This does not apply to release-branch specific changes (see Phase 1 criteria).

  2. Create (but do not land) a PR against the release branch.

    # Find the hash of the commit you want to cherry pick
    # (for example, abcdef12345)
    git log
    
    git fetch origin release/2.4
    git checkout release/2.4
    git cherry-pick -x abcdef12345
    
    # Submit a PR based against 'release/2.4' either:
    # via the GitHub UI
    git push my-fork
    
    # via the GitHub CLI
    gh pr create --base release/2.4
  3. Make a request below with the following format:

Link to landed trunk PR (if applicable):
* 

Link to release branch PR:
* 

Criteria Category:
* 
  4. Someone from the release team will reply with approved / denied or ask for more information.
  5. If approved, someone from the release team will merge your PR once the tests pass. Do not land the release branch PR yourself.
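
As a convenience, the request format from step 3 can be filled in programmatically. This is a minimal sketch, assuming you simply want the comment text to paste into this issue; `format_request` and the placeholder values are hypothetical and not an official tool.

```python
def format_request(release_pr: str, criteria: str, trunk_pr: str = "NA") -> str:
    """Build a cherry-pick request comment in the format shown in step 3.

    trunk_pr defaults to "NA" for release-branch specific changes,
    which have no landed trunk PR.
    """
    return "\n".join(
        [
            "Link to landed trunk PR (if applicable):",
            f"* {trunk_pr}",
            "",
            "Link to release branch PR:",
            f"* {release_pr}",
            "",
            "Criteria Category:",
            f"* {criteria}",
        ]
    )


# Placeholder values only; substitute the real PR links and category.
comment = format_request(
    release_pr="<release branch PR URL>",
    criteria="Test/CI fixes",
    trunk_pr="<trunk PR URL>",
)
print(comment)
```

Keeping the three headings exactly as written makes it easier for the release team to scan the thread.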

NOTE: Our normal tools (ghstack / ghimport, etc.) do not work on the release branch.

Please note: the HUD link with the branch CI status will be provided here.
HUD

Versions

2.4.0

@atalman atalman added this to the 2.4.0 milestone Jun 11, 2024
@atalman
Contributor Author

atalman commented Jun 11, 2024

Link to landed trunk PR (if applicable):

  • NA

Link to release branch PR:

Criteria Category:

  • Release only changes, temp changes to build triton from pin rather than branch

@atalman merged

@atalman atalman pinned this issue Jun 11, 2024
@malfet malfet added the oncall: releng In support of CI and Release Engineering label Jun 11, 2024
@atalman
Contributor Author

atalman commented Jun 11, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@atalman
Contributor Author

atalman commented Jun 12, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@atalman
Contributor Author

atalman commented Jun 12, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Reverted on main

@malfet merged

@atalman
Contributor Author

atalman commented Jun 12, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Reverted on main

@malfet merged

@atalman
Contributor Author

atalman commented Jun 12, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Reverted on main

@malfet merged

@zhuhaozhe
Collaborator

zhuhaozhe commented Jun 13, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@zou3519
Contributor

zou3519 commented Jun 13, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • (1) Fixes to regressions. In 2.4, we started spamming warnings if someone used pybind'ed functions with torch.compile. There were no such warnings in 2.3. This PR adjusts the warnings to be less spammy.

@atalman merged

@soulitzer soulitzer added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jun 13, 2024
@etaf
Collaborator

etaf commented Jun 13, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@zou3519
Contributor

zou3519 commented Jun 13, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • (3) Critical fixes to new features. In 2.4 we are releasing a new torch.library.custom_op API. This PR fixes a critical bug that the API did not compose with FSDP and other distributed APIs.

@atalman merged

@chunyuan-w
Collaborator

chunyuan-w commented Jun 14, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@etaf
Collaborator

etaf commented Jun 14, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@wanchaol
Contributor

wanchaol commented Jun 14, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fixes for compile time regression

@atalman merged

@wanchaol
Contributor

wanchaol commented Jun 14, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fixes for silent correctness

@atalman merged

@drisspg drisspg unpinned this issue Jun 14, 2024
@zou3519 zou3519 pinned this issue Jun 14, 2024
@Xia-Weiwen
Collaborator

Xia-Weiwen commented Jun 15, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@xuhancn
Collaborator

xuhancn commented Jun 17, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fixes for PyTorch Windows cpp_extension. If a user builds an extension with cpp_extension and its code contains VEC, a sleef dependency issue occurs.

Hi @xuhancn, this looks like feature work to enable inductor on PyTorch Windows; however, we don't support this yet for release 2.4.
Do you have any tests for it?

@atalman merged. TODO: run manual validation on the RC with the test provided in the PR.

@zou3519
Contributor

zou3519 commented Jun 17, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • (1) Regression (from 2.3.x) and (2) Critical fixes for: silent incorrectness

@atalman merged

@lw
Contributor

lw commented Jun 17, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Fixes to regressions against the most recent minor release (changes only logging, low risk)

@atalman merged

@clee2000
Contributor

clee2000 commented Jun 17, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Release - needed to fix executorch release CI

@clee2000 merged

@huydhn huydhn added the release tracker Add this label to release tracker issues label Jun 17, 2024
pytorchmergebot pushed a commit that referenced this issue Jun 18, 2024
This extends the capability of the cherry-pick bot to automatically update the tracker issue with the information. For this to work, the tracker issue needs to be an open one with a `release tracker` label, i.e. #128436. The version from the release branch, i.e. `release/2.4`, will be matched against the title of the tracker issue, i.e. `[v.2.4.0] Release Tracker` or `[v.2.4.1] Release Tracker`.

### Testing

`python cherry_pick.py --onto-branch release/2.4 --classification release --fixes "DEBUG DEBUG" --github-actor huydhn 128718`

* On the PR #128718 (comment)
* On the tracker issue #128436 (comment)

Pull Request resolved: #128924
Approved by: https://github.com/atalman
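
The branch-to-title matching described in the commit message above could look roughly like the following. This is a sketch under stated assumptions, not the bot's actual code: `tracker_title_matches` is a hypothetical helper, and the regular expressions are inferred only from the two example titles quoted in the message.

```python
import re


def tracker_title_matches(branch: str, title: str) -> bool:
    """Check whether a tracker issue title belongs to a release branch.

    Assumes branches look like 'release/<major>.<minor>' and titles like
    '[v.<major>.<minor>.<patch>] Release Tracker'.
    """
    branch_match = re.fullmatch(r"release/(\d+)\.(\d+)", branch)
    if branch_match is None:
        return False
    major, minor = branch_match.groups()
    # Any patch number is accepted, so 2.4.0 and 2.4.1 both match release/2.4.
    title_pattern = rf"\[v\.{major}\.{minor}\.\d+\] Release Tracker"
    return re.fullmatch(title_pattern, title.strip()) is not None


print(tracker_title_matches("release/2.4", "[v.2.4.0] Release Tracker"))
print(tracker_title_matches("release/2.4", "[v.2.3.0] Release Tracker"))
```

Matching on major and minor only means a single release branch can serve both the initial release tracker and later patch release trackers.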
@pytorchbot
Collaborator

pytorchbot commented Jun 27, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:
Critical fix inductor on V100


@atalman merged

@pytorchbot
Collaborator

pytorchbot commented Jun 27, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:
Critical fix


@atalman merged

@atalman
Contributor Author

atalman commented Jun 27, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fix MKL dependency

@atalman merged

@pytorchbot
Collaborator

pytorchbot commented Jun 27, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:
Critical fix, distributed


@atalman merged

@oraluben
Contributor

oraluben commented Jun 28, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Breaks building wheels in specific cases

@oraluben please note: the PR must be landed in main in order to be considered for cherry-picking

@malfet merged

@pytorchbot
Collaborator

pytorchbot commented Jun 28, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:
Critical - packaging


@atalman merged

@Skylion007
Collaborator

Wondering if we can get this landed in main so we can backport it. We've encountered hangs on certain hardware with certain firmware versions that only occur on this specific NCCL version (fixed in NCCL versions before and after): #124014

@pytorchbot
Collaborator

pytorchbot commented Jun 28, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:
Fixnewfeature


@atalman merged

@tinglvv
Collaborator

tinglvv commented Jul 1, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Fix for "libgfortran.so.5" missing error for CUDA ARM wheel at runtime

@atalman merged

@wz337
Contributor

wz337 commented Jul 2, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fix. Needed to avoid silent correctness issues in a feature that is not ready.

@atalman merged

@pytorchbot
Collaborator

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:
Release - Update inductor expected results after pinning numpy on torchbench

@atalman
Contributor Author

atalman commented Jul 2, 2024

Please note, we are in:
Phase 2 (after 7/1/24): perform extended integration/stability/performance testing based on Release Candidate builds.
We do not accept new cherry-pick requests.

@rec rec unpinned this issue Jul 11, 2024
@atalman atalman pinned this issue Jul 12, 2024
@ppwwyyxx
Contributor

Can we address #130658 and #130659? Otherwise 2.4 users will be guaranteed to see a lot of warnings.

@fengyuan14 fengyuan14 unpinned this issue Jul 15, 2024
@atalman atalman pinned this issue Jul 15, 2024
@atalman
Contributor Author

atalman commented Jul 15, 2024

Hi @ppwwyyxx, both PRs #130658 and #130659 are targeted for release 2.4.1.

@XuehaiPan
Collaborator

XuehaiPan commented Jul 20, 2024

Link to landed trunk PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fixes for dependency update

@XuehaiPan This PR broke the Linux nightly binaries, please see here: https://hud2.pytorch.org/hud/pytorch/pytorch/nightly/1?per_page=50&name_filter=linux- . Conda builds will require a forward fix. Here is the failure: https://github.com/pytorch/pytorch/actions/runs/10036618056/job/27749407507#step:17:663 . It looks like sympy 1.13 is not available on conda: https://anaconda.org/anaconda/sympy
