Various documentation/logging improvements #68

Merged: 4 commits, Jun 15, 2023
doc improvements
gobbleturk committed Jun 15, 2023
commit 5b28b7a7605ae0b59838fab5f58464cf730362cf
5 changes: 3 additions & 2 deletions MaxText/configs/base.yml
@@ -44,7 +44,8 @@ param_scan_axis: 1
record_internal_nn_metrics: 0

# Output directory
-base_output_directory: ""
+# Create a GCS bucket, e.g. my-maxtext-outputs and set this to "gs://my-maxtext-outputs/"
+base_output_directory: ""

# Parallelism
mesh_axes: ['data', 'fsdp', 'tensor']
@@ -76,7 +77,7 @@ ici_tensor_parallelism: 1


# Dataset
-# Replace with your path given as argument in download_dataset.sh
+# Replace with your path given as argument in download_dataset.sh, e.g. "gs://my-maxtext-dataset/"
dataset_path: ""
vocab_size: 32_768 # powers of 2 for sharding
vocab_relative_path: "vocabs" # Assumes we're allowed
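Reviewer note: the two path comments in `base.yml` both expect a value of the form `gs://<bucket>/`. A minimal sketch, not part of this PR, of checking that shape before launching a run. The helper name `is_gcs_uri` is hypothetical, and the bucket names are the example values from the comments; only the standard-library `urllib.parse` is used.

```python
from urllib.parse import urlparse


def is_gcs_uri(path: str) -> bool:
    """Return True if `path` has the shape gs://<bucket>/... (hypothetical helper)."""
    parsed = urlparse(path)
    # A GCS URI has the "gs" scheme and a non-empty bucket name as its netloc.
    return parsed.scheme == "gs" and bool(parsed.netloc)


# Example values taken from the comments in base.yml.
base_output_directory = "gs://my-maxtext-outputs/"
dataset_path = "gs://my-maxtext-dataset/"

for name, value in [("base_output_directory", base_output_directory),
                    ("dataset_path", dataset_path)]:
    print(name, "looks like a GCS URI:", is_gcs_uri(value))
```

The empty default `""` does not pass this check, so a value has to be supplied for both keys.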
2 changes: 2 additions & 0 deletions end_to_end/eval_assert.py
@@ -42,6 +42,8 @@ def test_checkpointing(metrics_file, target):
restored_loss = json.loads(restored.readlines()[0])[target]
# Checks that checkpoint restore was successful by comparing loss of last
# step in saved checkpoint to loss of first step in restored checkpoint
+print("saved loss: ", saved_loss)
+print("restored loss: ", restored_loss)
assert isclose(saved_loss, restored_loss, rel_tol=0.1)

def test_determinism(metrics_file, target):
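Reviewer note: the check in `test_checkpointing` compares two losses with `math.isclose` and `rel_tol=0.1`, i.e. they may differ by up to 10% of the larger value. A minimal standalone sketch of that comparison, including the two new `print` calls, assuming each metrics line is a JSON object. The function name `losses_match` and the metric key `"learning/loss"` are hypothetical and chosen only for illustration.

```python
import json
from math import isclose


def losses_match(saved_line: str, restored_line: str, target: str,
                 rel_tol: float = 0.1) -> bool:
    """Compare one metric across two JSON metrics lines (hypothetical helper)."""
    saved_loss = json.loads(saved_line)[target]
    restored_loss = json.loads(restored_line)[target]
    # Print both values so a failing comparison is easy to diagnose from logs.
    print("saved loss: ", saved_loss)
    print("restored loss: ", restored_loss)
    return isclose(saved_loss, restored_loss, rel_tol=rel_tol)


# Made-up metrics lines; the key name is an assumption, not taken from the PR.
saved = '{"learning/loss": 2.50}'
restored = '{"learning/loss": 2.45}'
print("match:", losses_match(saved, restored, "learning/loss"))
```

Printing before the `assert` means the values appear in the log even when the assertion fails, which is what makes the added lines useful for debugging.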
2 changes: 1 addition & 1 deletion multihost_job.py
@@ -330,7 +330,7 @@ def main() -> None:
f"\n{gcs_bucket_url(args.BUCKET_NAME, bucket_dir, args.PROJECT)}\n")

print("View the status of the created TPUs via: ")
-print(f"gcloud compute tpus tpu-vm list --filter={args.RUN_NAME}\n")
+print(f"gcloud alpha compute tpus queued-resources list --filter={args.RUN_NAME}\n")
return 0

if __name__ == '__main__':