Parallelization #1
Labels: enhancement (New feature or request)
Comments
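Proof of concept:

    import concurrent.futures

    import pandas as pd


    def foo(df):
        flag = True
        # Skip tracks shorter than 300 km
        if df.total_dist_km < 300.:
            flag = False
        if flag:
            # Skip tracks where more than 20% of points have a non-zero vortex_type
            if ((df.vortex_type != 0).sum() / df.shape[0] > 0.2):
                flag = False
        if flag:
            df['cat'] = 99
        return df


    # Process each track (one group of TR.gb) in one of 4 worker processes
    with concurrent.futures.ProcessPoolExecutor(4) as pool:
        TR.data = pd.concat(list(pool.map(foo, [group for _, group in TR.gb], chunksize=10)))

So far this gives no speed-up, but it may help for heavier computations.
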
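One way the requested option could look is a small wrapper that runs the same per-track function either serially or in a process pool. This is only a sketch, not the package's API: `map_groups`, its `parallel`, `n_workers` and `chunksize` parameters, and the `double` stand-in function are hypothetical names chosen for illustration, and it uses the standard library only.

```python
import concurrent.futures


def map_groups(func, groups, parallel=False, n_workers=4, chunksize=10):
    """Apply `func` to every item of `groups` and return the results as a list.

    Hypothetical helper: with parallel=False it is a plain loop; with
    parallel=True the items are distributed over `n_workers` processes.
    `func` must be defined at module level so that it can be pickled.
    """
    groups = list(groups)
    if not parallel:
        return [func(group) for group in groups]
    with concurrent.futures.ProcessPoolExecutor(n_workers) as pool:
        # pool.map preserves input order, so both branches return the same list
        return list(pool.map(func, groups, chunksize=chunksize))


def double(values):
    """Stand-in for a per-track function such as `foo`."""
    return [2 * v for v in values]


if __name__ == "__main__":
    data = [[1, 2], [3], [4, 5, 6]]
    serial = map_groups(double, data)
    parallel = map_groups(double, data, parallel=True, n_workers=2, chunksize=1)
    assert serial == parallel
```

With a helper like this, the list it returns could be passed to `pd.concat` exactly as in the proof of concept, and the serial branch would remain the default until the parallel one is shown to pay off.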
For the track density calculation, concurrent execution reduces the total time by a factor of 2. Example:

    with concurrent.futures.ProcessPoolExecutor(4) as pool:
        res = list(pool.map(density, gb_list, chunksize=10))
    # Stack the per-track results and sum them over tracks
    dens = np.array(res).sum(axis=0)

This needs to be investigated further.
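To investigate further, a simple timing comparison between the serial and the parallel path might be a starting point. The sketch below is an assumption-laden illustration rather than a benchmark of the actual code: `busy_work` is a hypothetical CPU-bound stand-in for `density`, `timed`, `run_serial` and `run_parallel` are made-up helper names, and the timings depend on the machine and on how heavy each call is.

```python
import concurrent.futures
import time


def busy_work(n):
    """Hypothetical CPU-bound stand-in for a per-track computation."""
    return sum(i * i for i in range(n))


def timed(func, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call of `func`."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def run_serial(items):
    """Apply busy_work to every item in the current process."""
    return [busy_work(n) for n in items]


def run_parallel(items, n_workers=4, chunksize=10):
    """Apply busy_work to every item using a pool of worker processes."""
    with concurrent.futures.ProcessPoolExecutor(n_workers) as pool:
        return list(pool.map(busy_work, items, chunksize=chunksize))


if __name__ == "__main__":
    items = [20_000] * 200
    serial, t_serial = timed(run_serial, items)
    # Try a few chunk sizes, since chunksize controls how much work is sent per task
    for chunksize in (1, 10, 50):
        parallel, t_parallel = timed(run_parallel, items, chunksize=chunksize)
        assert parallel == serial
        print(f"chunksize={chunksize}: serial {t_serial:.3f}s, parallel {t_parallel:.3f}s")
```

The parallel timing includes starting the worker processes, which is the same overhead the examples above pay on every call.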
Add an option to execute functions in parallel processes