Gb/multi gpu #138

Merged: 13 commits into main from gb/multi_gpu on Jan 10, 2023
Conversation

grantbuster (Member)

No description provided.

@grantbuster (Member, Author)

@bnb32 can you take a look at what might be going on with the data-centric test? I have no idea how my edits would affect that code

https://github.com/NREL/sup3r/actions/runs/3876667898/jobs/6610776941

@bnb32 (Collaborator) reviewed on Jan 9, 2023:

Wish we could restructure to avoid duplicating some code, but overall an elegant solution to the multi-GPU problem.

On the diff lines:

training_weights,
device_name=f'/gpu:{i}',
**calc_loss_kwargs))
for i, future in enumerate(futures):
@bnb32 (Collaborator):

why don't we need as_completed() here?

@bnb32 (Collaborator):

Also, is applying the different gradients like this better than applying the average of the gradients once?

@grantbuster (Member, Author):

We could use as_completed() but the futures will finish at basically the same time, so I was just keeping it simple. And I don't think either is better - averaging with a single weight update might actually be more intuitive, but this is nice and simple.
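
To make the thread concrete, here is a minimal sketch of the pattern being discussed: one thread per GPU computes a gradient for its chunk of the batch, and each gradient is applied as it comes back rather than averaging once. Only `training_weights`, `device_name=f'/gpu:{i}'`, `**calc_loss_kwargs`, and the `enumerate(futures)` loop come from the diff above; the `get_single_grad` helper, the batch splitting, and the optimizer attribute are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import tensorflow as tf


def multi_gpu_step(model, low_res, hi_res_true, training_weights,
                   optimizer, **calc_loss_kwargs):
    """Sketch: split a batch across GPUs, compute per-GPU gradients in
    threads, and apply each gradient as soon as it is retrieved."""
    n_gpu = len(tf.config.list_physical_devices('GPU'))
    lr_chunks = np.array_split(low_res, n_gpu)
    hr_chunks = np.array_split(hi_res_true, n_gpu)

    futures = []
    with ThreadPoolExecutor(max_workers=n_gpu) as exe:
        for i in range(n_gpu):
            futures.append(exe.submit(model.get_single_grad,  # assumed helper
                                      lr_chunks[i], hr_chunks[i],
                                      training_weights,
                                      device_name=f'/gpu:{i}',
                                      **calc_loss_kwargs))
        # enumerate(futures) blocks on each future in submission order;
        # as_completed() would yield them in completion order instead, but
        # the per-GPU jobs finish at roughly the same time either way.
        for i, future in enumerate(futures):
            grad, loss_details = future.result()
            # one optimizer update per GPU gradient, rather than averaging
            # the per-GPU gradients and applying a single update
            optimizer.apply_gradients(zip(grad, training_weights))
    return loss_details
```

The averaging alternative mentioned above would collect the per-GPU `grad` lists first, take their element-wise mean, and call `apply_gradients` once; with similar chunk sizes and a small number of GPUs the two approaches give nearly the same update.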

@@ -960,3 +1113,54 @@ def _tf_generate_wind(self, low_res, hi_res_topo):
raise RuntimeError(msg) from e

return hi_res

@tf.function()
@bnb32 (Collaborator):

having this copied code hurts me. but alas...
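
For reference, a rough sketch of what a per-device, `@tf.function`-decorated gradient helper like the one added here tends to look like; only the decorator and the `return grad, loss_details` line shown below come from the diff, while the method name, the generator call, and the `calc_loss` signature are assumptions.

```python
@tf.function()
def get_single_grad(self, low_res, hi_res_true, training_weights,
                    device_name=None, **calc_loss_kwargs):
    """Sketch: run the generator and loss on one device and return the
    gradient of the loss with respect to the requested weights."""
    with tf.device(device_name):
        with tf.GradientTape(watch_accessed_variables=False) as tape:
            tape.watch(training_weights)
            hi_res_gen = self._tf_generate(low_res)  # assumed generator call
            loss, loss_details = self.calc_loss(hi_res_true, hi_res_gen,
                                                **calc_loss_kwargs)
        grad = tape.gradient(loss, training_weights)
    return grad, loss_details
```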


On the diff lines:

return grad, loss_details

def run_gradient_descent(self, low_res, hi_res_true, training_weights,
@bnb32 (Collaborator):

this is a nice simple solution. well done!
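
Tying it together, a hedged sketch of how the `run_gradient_descent` entry point could dispatch between a single-device path and the thread-pool path; only the signature fragment above comes from the diff, while the `optimizer`/`multi_gpu` keyword arguments, `self.optimizer`, and the reuse of the helpers sketched earlier are assumptions.

```python
def run_gradient_descent(self, low_res, hi_res_true, training_weights,
                         optimizer=None, multi_gpu=False, **calc_loss_kwargs):
    """Sketch: one weight-update step, either on the default device or
    fanned out across all visible GPUs."""
    optimizer = optimizer if optimizer is not None else self.optimizer
    if not multi_gpu:
        # single-device path: one gradient, one optimizer update
        grad, loss_details = self.get_single_grad(low_res, hi_res_true,
                                                  training_weights,
                                                  **calc_loss_kwargs)
        optimizer.apply_gradients(zip(grad, training_weights))
        return loss_details
    # multi-GPU path: one thread and one optimizer update per device,
    # as in the thread-pool sketch earlier in this thread
    return multi_gpu_step(self, low_res, hi_res_true, training_weights,
                          optimizer, **calc_loss_kwargs)
```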

grantbuster merged commit f16f3fe into main on Jan 10, 2023.
grantbuster deleted the gb/multi_gpu branch on January 10, 2023 15:17.
github-actions bot pushed a commit that referenced this pull request on Jan 10, 2023.