[Train] Llama 2 workspace template release tests #37871

Merged
merged 51 commits into from
Jul 28, 2023
3da40e9
[Train] LLM fine-tuning workspace template fix custom resources (#37745)
kouroshHakha Jul 25, 2023
c259150
added llama-2 70b scripts
kouroshHakha Jul 26, 2023
cf999c4
wip
kouroshHakha Jul 26, 2023
4d39318
Merge branch 'master' of github.com:ray-project/ray into llama-70b-ft
kouroshHakha Jul 26, 2023
dc25e03
added release tests for 7 and 13B
kouroshHakha Jul 26, 2023
16ae886
updated README
kouroshHakha Jul 26, 2023
12f4b37
updated readme
kouroshHakha Jul 26, 2023
16be8c1
wip
kouroshHakha Jul 26, 2023
d88f97a
updated scripts
kouroshHakha Jul 26, 2023
7485d3f
wip
kouroshHakha Jul 26, 2023
055b065
wip
kouroshHakha Jul 26, 2023
88cac09
better lamma-7b settings
kouroshHakha Jul 26, 2023
32def4f
1. Fix readme typo, 2. fixed evaluation
kouroshHakha Jul 26, 2023
485168c
fixed typo in release tests
kouroshHakha Jul 26, 2023
9744a13
update readme
kouroshHakha Jul 26, 2023
4055100
updating cluster_end
kouroshHakha Jul 26, 2023
cc709dc
fixing release test
kouroshHakha Jul 27, 2023
159241b
temp changing concurrency group
kouroshHakha Jul 27, 2023
db52ef6
test the shell changes
kouroshHakha Jul 27, 2023
53d5514
added other shells
kouroshHakha Jul 27, 2023
05b0a0f
reverting activating release tests
kouroshHakha Jul 27, 2023
57f402c
reverting concurrency
kouroshHakha Jul 27, 2023
4b46723
lint
kouroshHakha Jul 27, 2023
c6cef39
updated docker
kouroshHakha Jul 27, 2023
6929bb8
reverting the random stuff
kouroshHakha Jul 27, 2023
aa673cc
lint
kouroshHakha Jul 27, 2023
45b59a7
update the shell to one
kouroshHakha Jul 27, 2023
f61b019
code format
kouroshHakha Jul 27, 2023
b30c4e4
format
kouroshHakha Jul 27, 2023
e7d6c0e
Revert "reverting activating release tests"
kouroshHakha Jul 27, 2023
f8330d2
Revert "reverting the random stuff"
kouroshHakha Jul 27, 2023
25c7db9
Revert "reverting concurrency"
kouroshHakha Jul 27, 2023
270facd
moved the testing cluster env
kouroshHakha Jul 27, 2023
04c974d
removed cloud ids from the compute configs
kouroshHakha Jul 27, 2023
e0481e5
added testing compute configs that include cloud_ids
kouroshHakha Jul 27, 2023
49f2477
compute configs repointed
kouroshHakha Jul 27, 2023
1dacda9
Merge branch 'master' into llama-2-release-test
kouroshHakha Jul 27, 2023
35f76ba
white space removal
kouroshHakha Jul 27, 2023
59b927a
testing the path stuff
kouroshHakha Jul 27, 2023
558213e
byod switching
kouroshHakha Jul 27, 2023
ae2d09e
updated the compiled byod stuff
kouroshHakha Jul 27, 2023
a7192bb
wip
kouroshHakha Jul 28, 2023
54c70c9
wip
kouroshHakha Jul 28, 2023
5702c96
wip
kouroshHakha Jul 28, 2023
6100736
lint
kouroshHakha Jul 28, 2023
6f1a5e6
wip
kouroshHakha Jul 28, 2023
b61a352
wip
kouroshHakha Jul 28, 2023
49f0b13
wip
kouroshHakha Jul 28, 2023
2035342
reverting concurrency
kouroshHakha Jul 28, 2023
123b77e
1. cu117->cu118 2. team: train->ml
kouroshHakha Jul 28, 2023
fc680c9
lint
kouroshHakha Jul 28, 2023
[Train] LLM fine-tuning workspace template fix custom resources (#37745)
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
kouroshHakha committed Jul 26, 2023
commit 3da40e9a736b1e0387331214d3f9b5e58f4c0fc2
@@ -3,7 +3,8 @@ head_node_type:
   name: head_node_type
   instance_type: g5.16xlarge
   resources:
-    large_cpu_mem: 1
+    custom_resources:
+      large_cpu_mem: 1

 worker_node_types:
 - name: gpu_worker
@@ -12,4 +13,5 @@ worker_node_types:
   max_workers: 15
   use_spot: false
   resources:
-    medium_cpu_mem: 1
+    custom_resources:
+      medium_cpu_mem: 1
@@ -2,7 +2,8 @@ head_node_type:
   name: head_node_type
   instance_type: n1-highmem-64-nvidia-k80-12gb-1
   resources:
-    large_cpu_mem: 1
+    custom_resources:
+      large_cpu_mem: 1

 worker_node_types:
 - name: gpu_worker
@@ -11,4 +12,5 @@ worker_node_types:
   max_workers: 15
   use_spot: false
   resources:
-    medium_cpu_mem: 1
+    custom_resources:
+      medium_cpu_mem: 1
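
Both hunks make the same change: entries that sat directly under `resources` are moved one level down, under `resources.custom_resources`. A minimal Python sketch of that transformation on an already-parsed node-type entry follows. The helper name `nest_custom_resources` is hypothetical and not part of this PR, and the sketch assumes every key found directly under `resources` is a custom resource, which holds for the two configs shown here but may not hold in general.

```python
from copy import deepcopy


def nest_custom_resources(node_type: dict) -> dict:
    """Return a copy of a node-type entry with its resources nested
    under `custom_resources` (hypothetical helper, for illustration only)."""
    updated = deepcopy(node_type)
    resources = updated.get("resources") or {}
    # Nothing to move, or the entry already uses the new layout.
    if not resources or "custom_resources" in resources:
        return updated
    # Assumption: every key directly under `resources` is a custom resource.
    updated["resources"] = {"custom_resources": resources}
    return updated


# Old layout, as on the removed lines of the first hunk.
old_head = {
    "name": "head_node_type",
    "instance_type": "g5.16xlarge",
    "resources": {"large_cpu_mem": 1},
}

new_head = nest_custom_resources(old_head)
print(new_head["resources"])  # {'custom_resources': {'large_cpu_mem': 1}}
```

The helper copies its input and returns entries that already use the new layout unchanged, so applying it twice gives the same result as applying it once.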