Question: Correct parallelization usage #105
Comments
Current investigation shows that CatLearn's parallelization is handled by NumPy's threaded linear-algebra backend. Setting OMP_NUM_THREADS=ncore will be the useful knob here. Closing this issue.
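A minimal sketch of what the comment above describes: capping the thread count used by NumPy's BLAS backend before launching the run. The exact variable that applies depends on how NumPy was built, so exporting the common ones is a safe pattern; `run_mlneb.py` is a hypothetical driver script, not part of CatLearn.

```shell
# Cap the thread pool used by NumPy's BLAS/LAPACK backend before
# starting the CatLearn run. Which variable is honored depends on
# the BLAS build; exporting all three is harmless.
export OMP_NUM_THREADS=64        # OpenMP-based BLAS builds
export OPENBLAS_NUM_THREADS=64   # pthread builds of OpenBLAS
export MKL_NUM_THREADS=64        # Intel MKL
# then launch the driver script, e.g.:
#   python run_mlneb.py
```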
Hi, I would like to ask how to run the CatLearn code with proper parallelization.
I tested CatLearn and compared it with the traditional neb.x code of Quantum ESPRESSO.
For the same system and number of images, CatLearn (a single node, 64 cores) and neb.x (5 nodes, 64 cores each, image-parallelized) take the same wall time, so CatLearn is more efficient with resources.
I would like to go further by using more nodes for the DFT calculation, say 5 nodes per DFT evaluation.
However, when I run CatLearn across 5 nodes (64 cores each), the calculation becomes rather slow.
The 5-node setup is applied to the DFT calculation via ASE_ESPRESSO_COMMAND.
I think the bottleneck may be in how CatLearn itself is parallelized across the 5 nodes (at least the automatic treatment is not correct).
May I ask how to do this properly?
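One hedged sketch of the intended setup, assuming an MPI-capable cluster allocation: give only the DFT step a multi-node `mpirun` launch line via `ASE_ESPRESSO_COMMAND` (the `PREFIX.pwi`/`PREFIX.pwo` placeholders are ASE's convention for the Espresso calculator), while keeping the surrogate-model part on the launching node's cores. Rank counts and the driver script name are placeholders, not values from this issue.

```shell
# Hypothetical 5-node x 64-core layout: 320 MPI ranks for pw.x,
# while the ML/GP part stays on the launching node's 64 cores.
export ASE_ESPRESSO_COMMAND="mpirun -np 320 pw.x -in PREFIX.pwi > PREFIX.pwo"
export OMP_NUM_THREADS=64   # NumPy threads for the surrogate model
# then launch the CatLearn NEB driver, e.g.:
#   python run_mlneb.py
```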