Add functionality to automatically upload logs to Vertex Tensorboard #570

Merged
merged 1 commit into from
Apr 10, 2024


Collaborator

@SurbhiJainUSC SurbhiJainUSC commented Mar 29, 2024

  • Add functionality to upload the data in config.tensorboard_dir to a Tensorboard instance in Vertex AI.
  • XPK users do not have to create a Tensorboard instance in Vertex AI; XPK handles that automatically.
  • For non-XPK cases, users can either create a Tensorboard instance manually in the Vertex AI cloud console or set the configuration so that MaxText creates the instance.

Note: The uploader that sends logs to Vertex Tensorboard is supported only for TensorFlow < 2.15.0. The Vertex AI team is working on a fix to support the latest TF versions.
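
A caller could check the version constraint in the note before starting an upload. The snippet below is a minimal sketch of such a guard, not code from this PR: the names `FIRST_UNSUPPORTED_TF`, `parse_version`, and `uploader_supported` are hypothetical, and the only fact taken from the description is the `< 2.15.0` bound.

```python
from __future__ import annotations

import re

# Taken from the note above: the uploader is supported only for TensorFlow < 2.15.0.
FIRST_UNSUPPORTED_TF = (2, 15, 0)


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from strings such as '2.14.1' or '2.15.0-rc1'.

    Hypothetical helper: any suffix after the numeric part is ignored, and a
    missing patch component is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def uploader_supported(tf_version: str) -> bool:
    """Return True when the given TensorFlow version is below 2.15.0.

    Pre-releases of 2.15.0 (for example '2.15.0-rc1') are conservatively
    reported as unsupported because their suffix is ignored.
    """
    return parse_version(tf_version) < FIRST_UNSUPPORTED_TF
```

In a training script, this could be called as `uploader_supported(tf.__version__)`, skipping the upload and logging a warning when it returns `False`.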

Collaborator

@rwitten rwitten left a comment


Lots of questions! But looks amazing!

MaxText/configs/base.yml Outdated Show resolved Hide resolved
MaxText/configs/base.yml Show resolved Hide resolved
MaxText/configs/base.yml Outdated Show resolved Hide resolved
MaxText/configs/base.yml Outdated Show resolved Hide resolved
MaxText/configs/base.yml Outdated Show resolved Hide resolved
MaxText/configs/base.yml Outdated Show resolved Hide resolved
MaxText/train.py Outdated Show resolved Hide resolved
MaxText/train.py Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@rwitten rwitten removed their assignment Mar 30, 2024
@SurbhiJainUSC
Collaborator Author

> Lots of questions! But looks amazing!

Thank you, Rafi, for your suggestions!
I have simplified the process of creating a Vertex AI Tensorboard instance for testing MaxText on GCE. I have also updated the README to explain the different scenarios for uploading logs to Vertex Tensorboard. Let me know if this looks good.

Collaborator

@rwitten rwitten left a comment


The explicit teardown is still a bit weird but looks good otherwise!

MaxText/configs/base.yml Show resolved Hide resolved
MaxText/train.py Outdated Show resolved Hide resolved
@rwitten rwitten removed their assignment Apr 3, 2024
@SurbhiJainUSC SurbhiJainUSC assigned rwitten and unassigned rwitten Apr 8, 2024
Collaborator

@rwitten rwitten left a comment


Approved (but there are some nits I want you to clean up)

MaxText/configs/base.yml Show resolved Hide resolved
MaxText/train.py Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@copybara-service copybara-service bot merged commit e366da6 into main Apr 10, 2024
7 of 8 checks passed
@copybara-service copybara-service bot deleted the vertex_tb branch April 10, 2024 00:34
2 participants