
[WIP] SamTov measure memory scaling #476

Open
wants to merge 52 commits into base: main
Conversation

@SamTov (Member) commented Jan 25, 2022

Test this PR with 2 GB on Binder


Fix #464 and improve the memory safety of MDSuite, which is currently not robust enough for small-memory machines (< 8 GB or so) working on large data sets (> 10 GB).

Memory scaling measurement methodology

It seems one can use the pytest-monitor plugin to measure the memory usage of specific tests. The measurements are stored in an SQLite database that can then be queried for information about the specific test that was run. The workflow is outlined below:

  1. Turn off memory management in a calculator
  2. Add several experiments to a project, each with a different data size in steps of [1, 10, 50, 100] MB. For species-wise computations, separate experiments are not needed; for flux data, however, we cannot enforce computations over different groups, so several experiments will be required.
  3. Run the calculator for different databases and measure memory scaling w.r.t. the dominant scaling axis
    • In the case of atom-wise calculators this may be the data range or the number of atoms, depending on the operation.
  4. Fit a function to the relationship and try to evaluate the scaling function exactly (see the curve-fitting sketch below).
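
As a rough illustration of step 4, measurements collected as (data size, peak memory) pairs could be fitted with SciPy. The measurement values and model form below are purely illustrative placeholders, not results from this PR:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical measurements: dataset size in MB vs. peak memory usage in MB.
data_size = np.array([1.0, 10.0, 50.0, 100.0])
peak_memory = np.array([12.0, 95.0, 470.0, 940.0])

def linear_model(size, slope, offset):
    """One candidate scaling function; power laws etc. can be tried the same way."""
    return slope * size + offset

params, _ = curve_fit(linear_model, data_size, peak_memory)
print(f"memory ~= {params[0]:.2f} * size + {params[1]:.2f} MB")
```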

This requires the person running the experiment to install pytest-monitor locally. In theory it could be added to the CI on GitHub and run every time, but we would then have to make the SQL reading more robust.
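
For reference, a minimal sketch of that SQL reading, assuming pytest-monitor's default .pymon SQLite file and its TEST_METRICS table (table and column names may differ between plugin versions):

```python
import sqlite3

# Assumes the default .pymon database written by pytest-monitor; the
# TEST_METRICS table and its ITEM / MEM_USAGE columns are plugin details
# that may change between versions.
with sqlite3.connect(".pymon") as connection:
    rows = connection.execute(
        "SELECT ITEM, MEM_USAGE FROM TEST_METRICS WHERE ITEM LIKE ?",
        ("%memory_scaling%",),
    ).fetchall()

for test_name, memory_mb in rows:
    print(f"{test_name}: {memory_mb:.1f} MB")
```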

Tasks

  • Add a general framework for memory testing using pytest-monitor
  • Add functionality for adding an HDF5 database directly into an MDSuite experiment
  • Add functionality for turning off memory management in the case of a test (all batch sizes set to 1 and no minibatching)
  • Add curve fitting to the test to identify the correct scaling relationship
  • Exclude the memory scaling test from the CI
  • Adjust GitHub runners to use a 4 GB memory machine to ensure that memory safety is achieved on a reasonably memory-limited device

Reviewer notes

Any assistance with this is welcome, as it is not the only important PR on MDSuite at the moment; both #475 and this one still need to be fixed before any release.

@SamTov SamTov changed the title Sam tov measure memory scaling SamTov measure memory scaling Jan 25, 2022
@SamTov SamTov changed the title SamTov measure memory scaling [WIP] SamTov measure memory scaling Jan 25, 2022
@PythonFZ (Member)

I have some questions:

Add functionality for adding a hdf5 database directly into an MDSuite experiment

What do you mean by that?

Exclude the memory scaling test from the CI

I did this for MLSuite with @pytest.mark.gap and then ran pytest -m "not gap". I think it makes sense to go with the same approach for MDSuite, but we could also name the files differently, idk.
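
A minimal sketch of that marker-based approach, using a hypothetical memory marker name rather than whatever this PR would actually register:

```python
# pyproject.toml (or pytest.ini) registers the marker so pytest does not warn:
# [tool.pytest.ini_options]
# markers = ["memory: memory scaling tests excluded from the regular CI run"]

import pytest

@pytest.mark.memory
def test_memory_scaling_rdf():
    ...  # run the calculator on the generated databases and record memory use

# CI would then run:  pytest -m "not memory"
# Local scaling runs: pytest -m memory
```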

Adjust GitHub runners to use a 4 GB memory machine to ensure that memory safety is achieved on a reasonably memory-limited device.

They do have 7 GB of memory for Linux / Windows and I don't think we can change that.

Comment on lines +55 to +56
memory_scaling_test: bool = False
memory_fraction: float = 0.5
Member

Maybe we could expand the config to be config.memory.scaling_test = True instead of config.memory_scaling_test = True with additional dataclasses. This way it could be more structured.
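
A minimal sketch of that nested-dataclass idea; the class and field names are illustrative, not the current MDSuite config API:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryConfig:
    scaling_test: bool = False  # would replace config.memory_scaling_test
    fraction: float = 0.5       # would replace config.memory_fraction

@dataclass
class Config:
    memory: MemoryConfig = field(default_factory=MemoryConfig)

config = Config()
config.memory.scaling_test = True
```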

Member Author

Yeah, that's an interesting idea. How configuration is managed in general is a nice thing to discuss, as it can be quite involved. I think having dataclasses for the different settings, like you mention here, would be very nice.

@SamTov (Member Author) commented Jan 25, 2022

I have some questions:

Add functionality for adding a hdf5 database directly into an MDSuite experiment

What do you mean by that?

So what I want to do, and what I have started in the test module, is to generate several HDF5 databases with data of different exact sizes, e.g. 1 MB, 10 MB, and so on. Rather than generating a NumPy array, saving it to a readable file, and then reading it with MDSuite, I want to make an HDF5 database, create an experiment, and then add the database as data to that experiment.

In practice, this would be the equivalent of saving your simulation data into the H5MD database format and then using it in MDSuite.
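
A minimal sketch of generating such fixed-size files with h5py; the dataset layout is illustrative and not the exact group structure MDSuite or H5MD expects:

```python
import h5py
import numpy as np

def make_test_database(path: str, target_mb: float, n_atoms: int = 100) -> None:
    """Write roughly target_mb MB of float64 position data to an HDF5 file."""
    # float64 positions: n_configs x n_atoms x 3 values at 8 bytes each
    n_configs = int(target_mb * 1024 ** 2 / (n_atoms * 3 * 8))
    with h5py.File(path, "w") as database:
        database.create_dataset(
            "positions", data=np.random.rand(n_configs, n_atoms, 3)
        )

for size in (1, 10, 50, 100):
    make_test_database(f"test_{size}mb.h5", target_mb=size)
```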

Exclude the memory scaling test from the CI

I did this for MLSuite with @pytest.mark.gap and then ran pytest -m "not gap". I think it makes sense to go with the same approach for MDSuite, but we could also name the files differently, idk.

So the only factor here is whether we want to have it run each time? I think it is unnecessary to have the scaling tests run every time, but I am open to differing opinions.

Adjust GitHub runners to use a 4 GB memory machine to ensure that memory safety is achieved on a reasonably memory-limited device.

They do have 7 GB of memory for Linux / Windows and I don't think we can change that.

I thought somewhere you could set a memory limit, but maybe I am mistaken. In that case, we can set up local runners and just keep it as a test for, say, releases that we perform locally.

@SamTov SamTov mentioned this pull request Jan 27, 2022
@github-actions (bot) commented Feb 2, 2022

a262669

Memory Scaling

Raw data


Labels: None yet
Projects: None yet
Development

Successfully merging this pull request may close these issues.

Automatic/standardized scale function computation
2 participants