List Go version 1.18 in the docs. #1134

Merged
2 changes: 2 additions & 0 deletions docs/blogs/stabilize_llm_training_cn.md
@@ -205,6 +205,8 @@ The job master shows in its log exactly which process on which node failed,

1. Deploy the DLRover ElasticJob CRD on the Kubernetes cluster.

+GO version: GO 1.18.
+
```python
git clone git@github.com:intelligent-machine-learning/dlrover.git
cd dlrover/dlrover/go/operator/
2 changes: 1 addition & 1 deletion docs/developer_guide.md
@@ -182,7 +182,7 @@ make install
make run
```

-- Deploy the controller.
+- Deploy the controller with GO 1.18.

```bash
make deploy IMG=easydl/elasticjob-controller:master
2 changes: 1 addition & 1 deletion docs/tutorial/check_node_health.md
@@ -49,7 +49,7 @@ ElasticJob CRD on the cluster by the following steps.
```bash
git clone git@github.com:intelligent-machine-learning/dlrover.git
cd dlrover/dlrover/go/operator/
-make deploy IMG=easydl/elasticjob-controller:master
+make deploy IMG=easydl/elasticjob-controller:master # GO 1.18.
# Grant permission for the DLRover master to Access CRDs.
kubectl -n dlrover apply -f config/manifests/bases/default-role.yaml
```
3 changes: 2 additions & 1 deletion docs/tutorial/torch_elasticjob_on_k8s.md
@@ -5,6 +5,7 @@ on a public cloud, namely, Alibaba Cloud Container Service for Kubernetes(ACK).

## Preliminary

+- Install GO 1.18.
- Create a Kubernetes cluster on [ACK](https://help.aliyun.com/document_detail/309552.htm?spm=a2c4g.11186623.0.0.168f6b7aegH7nI#task-2112671).
- Configure cluster credentials on your local computer.
- Create a [NAS](https://help.aliyun.com/document_detail/477380.html?spm=a2c4g.11186623.0.0.10635c83Xn7Tkh)
@@ -25,7 +26,7 @@ git clone git@github.com:intelligent-machine-learning/dlrover.git

```bash
cd dlrover/dlrover/go/operator/
-make deploy IMG=easydl/elasticjob-controller:master
+make deploy IMG=easydl/elasticjob-controller:master # GO 1.18
```

3. Grant permission for the DLRover master to Access CRDs.