
ashao: HW9 commit #159

Merged — 1 commit merged on Dec 8, 2014
Conversation

@ashao (Contributor) commented on Dec 2, 2014

No description provided.

@jakevdp (Contributor) commented on Dec 8, 2014

This is an interesting approach – using a DirectView and scattering periods to the available clusters. All the other solutions I've seen have used a LoadBalancedView and pushed each individual problem to a single core. It's not obvious to me which one would be faster... but I like this solution!
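To make the contrast concrete, here is a hedged sketch of the two scheduling strategies using only the standard-library `concurrent.futures` module as a stand-in for IPython's `DirectView` and `LoadBalancedView` (the `solve` function and the two-worker setup are hypothetical, not from the homework itself): scattering pre-assigns fixed chunks to each worker, while load balancing hands out one task at a time as workers become free.

```python
from concurrent.futures import ThreadPoolExecutor

def solve(period):
    """Hypothetical placeholder for one period's computation."""
    return period ** 2

periods = list(range(8))
n_workers = 2

# DirectView-style: scatter fixed, equal-sized chunks to each worker up front.
def solve_chunk(chunk):
    return [solve(p) for p in chunk]

chunks = [periods[i::n_workers] for i in range(n_workers)]
with ThreadPoolExecutor(max_workers=n_workers) as ex:
    scattered = [r for chunk_result in ex.map(solve_chunk, chunks)
                 for r in chunk_result]

# LoadBalancedView-style: submit each problem individually; an idle worker
# pulls the next task, so uneven per-task runtimes balance out automatically.
with ThreadPoolExecutor(max_workers=n_workers) as ex:
    balanced = list(ex.map(solve, periods))
```

Both strategies compute the same results; they differ only in scheduling overhead (one round-trip per chunk vs. one per task) and in how well they tolerate tasks of unequal duration.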

jakevdp added a commit that referenced this pull request on Dec 8, 2014

@jakevdp merged commit f20ffdd into uw-python:master on Dec 8, 2014
@ashao (Contributor, Author) commented on Dec 10, 2014

Hi Jake,

I thought about that too. A priori, I think the LoadBalancedView approach is probably faster because there's less communication overhead (though presumably that's negligible on a shared-memory machine). One question I haven't quite been able to answer via Google: is there a threaded version of NumPy/SciPy, or is using this built-in IPython way of parallelizing routines the best way to perform operations on large amounts of data?

Thanks!
Andrew


@jakevdp (Contributor) commented on Dec 10, 2014

Hi – there isn't a general threaded version of NumPy, so with IPython you're mostly stuck with embarrassingly parallel approaches that don't require data sharing between processes. One way you can get around this is to use memory-mapped arrays. There's a bit of info on this in Olivier Grisel's parallel machine learning tutorial: https://github.com/ogrisel/parallel_ml_tutorial
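As a minimal sketch of the memory-mapped idea (assuming NumPy is installed; the file name, shape, and `row_sum` helper are all illustrative, not from the tutorial): the parent writes a large array to a file via `numpy.memmap`, and each worker process re-opens the same file read-only, so the data is never pickled or copied between processes.

```python
import os
import tempfile
import numpy as np
from multiprocessing import Pool

PATH = os.path.join(tempfile.mkdtemp(), "shared.dat")
SHAPE = (4, 1000)

# Parent creates and fills the memory-mapped array once, then flushes to disk.
arr = np.memmap(PATH, dtype="float64", mode="w+", shape=SHAPE)
arr[:] = np.arange(SHAPE[0] * SHAPE[1]).reshape(SHAPE)
arr.flush()

def row_sum(i):
    # Each worker re-opens the same file read-only; the OS pages the data
    # in on demand, so nothing is copied through the multiprocessing pipe.
    view = np.memmap(PATH, dtype="float64", mode="r", shape=SHAPE)
    return float(view[i].sum())

with Pool(2) as pool:
    sums = pool.map(row_sum, range(SHAPE[0]))
```

The same pattern works with IPython engines: push the file path (a tiny string) to the engines and have each one open its own read-only `np.memmap` view.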
