Parallel concatenate #5926
base: main
Changes from all commits
```diff
@@ -45,13 +45,15 @@ class Concatenate:
     cube_list: CubeList

     def setup(self):
-        source_cube = realistic_4d_w_everything()
-        second_cube = source_cube.copy()
-        first_dim_coord = second_cube.coord(dimensions=0, dim_coords=True)
-        first_dim_coord.points = (
-            first_dim_coord.points + np.ptp(first_dim_coord.points) + 1
-        )
-        self.cube_list = CubeList([source_cube, second_cube])
+        source_cube = realistic_4d_w_everything(lazy=True)
+        self.cube_list = CubeList([source_cube])
+        for _ in range(24):
+            next_cube = self.cube_list[-1].copy()
+            first_dim_coord = next_cube.coord(dimensions=0, dim_coords=True)
+            first_dim_coord.points = (
+                first_dim_coord.points + np.ptp(first_dim_coord.points) + 1
+            )
+            self.cube_list.append(next_cube)

     def time_concatenate(self):
         _ = self.cube_list.concatenate_cube()
```

Comment on lines +49 to +50:

The concatenate benchmarks now concatenate 25 cubes, to make the benchmark more realistic but not too time-consuming.

sloosvel: I tried this branch with one year of high-resolution data (12 cubes to concatenate) and I get a decrease in time of about 90 seconds with respect to the main branch 👍

Reply: Thanks for testing @sloosvel! Do you know what the total time was as well?

sloosvel: I get around one minute for the parallel concatenation, and two minutes to two and a half minutes for the main branch.
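The setup above builds 25 cubes by repeatedly copying the last cube and shifting its leading dimension coordinate past the previous range, so the blocks are contiguous and non-overlapping along that axis. A minimal NumPy sketch of that shifting trick (the function name here is illustrative, not from the Iris source):

```python
import numpy as np

def shifted_coords(first: np.ndarray, n_blocks: int) -> list:
    """Return n_blocks coordinate arrays, each shifted past the last.

    Mirrors the benchmark's setup loop: adding ``np.ptp`` (max - min)
    plus 1 guarantees each new block starts strictly after the
    previous one ends, so the blocks can be joined along one axis.
    """
    blocks = [first]
    for _ in range(n_blocks - 1):
        prev = blocks[-1]
        blocks.append(prev + np.ptp(prev) + 1)
    return blocks

# 25 blocks of 6 points each, as in the benchmark's 25-cube list.
blocks = shifted_coords(np.arange(6), 25)
full = np.concatenate(blocks)
# The joined coordinate is strictly increasing, which is what lets
# Iris concatenate the cubes along this dimension.
assert np.all(np.diff(full) > 0)
```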
Comment: I enabled `lazy=True` because it seems more realistic, but the timings are very similar for both cases, maybe because the data is small enough to keep in RAM. Is this change wanted, or should I change it back?