v1.22.0 #10211

Merged: j-xiong merged 2 commits into ofiwg:v1.22.x on Jul 26, 2024

Conversation

j-xiong
Contributor

@j-xiong j-xiong commented Jul 23, 2024

For CI only. Do not merge. Will be pushed directly.

@belynam
Contributor

belynam commented Jul 23, 2024

Hi @j-xiong ,

Looking at the release candidate downloads here, specifically libfabric-1.22.0rc2.tar.bz2, I find that prov/opx/provider_FABRIC_1.0.map is not included. This causes an error when running configure and specifying --enable-direct=opx. What would we need to do to make sure that file is included in official release tarballs?
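One way to reproduce this report without running configure is to inspect the tarball directly. The following is a minimal sketch, not part of the libfabric tooling: the helper name `missing_direct_maps` is hypothetical, and the provider list (opx, psm2, sockets) is assumed from the discussion in this thread. It only checks whether `prov/<provider>/provider_FABRIC_1.0.map` appears among the archive members.

```python
import tarfile

# Providers named in this thread; adjust as needed (assumption, not an official list).
PROVIDERS = ("opx", "psm2", "sockets")
MAP_NAME = "provider_FABRIC_1.0.map"


def missing_direct_maps(tarball_path, providers=PROVIDERS):
    """Return the providers whose prov/<name>/provider_FABRIC_1.0.map is absent.

    Hypothetical helper for illustration only.
    """
    # Default mode "r" lets tarfile auto-detect the compression (.bz2, .gz, ...).
    with tarfile.open(tarball_path) as tar:
        names = tar.getnames()

    missing = []
    for prov in providers:
        suffix = f"prov/{prov}/{MAP_NAME}"
        # Release tarballs usually prefix every path with a top-level directory,
        # so match on the path suffix rather than on the full member name.
        if not any(n == suffix or n.endswith("/" + suffix) for n in names):
            missing.append(prov)
    return missing
```

Calling `missing_direct_maps("libfabric-1.22.0rc2.tar.bz2")` on a downloaded release candidate would list the providers for which `--enable-direct` is likely to hit the missing-file error described above.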

@j-xiong
Contributor Author

j-xiong commented Jul 23, 2024

@belynam Have you ever been able to do it before? The same issue exists for opx, psm2 and sockets. I can fix it but want to double check that it's not a regression.

@j-xiong
Contributor Author

j-xiong commented Jul 23, 2024

@belynam I have included a fix here.

@belynam
Contributor

belynam commented Jul 24, 2024

> @belynam Have you ever been able to do it before? The same issue exists for opx, psm2 and sockets. I can fix it but want to double check that it's not a regression.

I don't think this is a regression; I see the same issue on previous release tarballs. Historically, when testing, I've only downloaded the source itself or pulled down the branch to test the build. I'm not aware of any of our users running into this issue, so maybe most don't use that option.

> @belynam I have included a fix here.

Thank you!

@shijin-aws
Contributor

bot:aws:retest

@shijin-aws
Contributor

bot:aws:retest

shijin-aws and others added 2 commits July 25, 2024 09:03
…_msg

It is observed 1M caused some OOM error for some cuda allocation,
9000 should be big enough as it exceeds the shm's inject size.

Signed-off-by: Shi Jin <[email protected]>
(cherry picked from commit afaed0c)
Signed-off-by: Jianxin Xiong <[email protected]>
@j-xiong j-xiong merged commit 1592196 into ofiwg:v1.22.x Jul 26, 2024
10 checks passed