
Add validation skipping to @validate_quantities #1089

Draft
wants to merge 13 commits into base: main

Conversation

rocco8773 (Member)

Unit and value validations (by @validate_quantities) can really slow down the performance of the decorated formulary functionality. This is not a big issue when a formulary item is calculated once, but can be a huge detriment when the calculation needs to be made numerous times. This PR adds the validator setting keyword allow_skipping to validate_quantities (and ValidateQuantities), which then injects the skip_validations keyword into the decorated function's argument list. The decorated function's docstring is appended to reflect this addition.
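The injection mechanics can be sketched roughly as follows. This is a simplified stand-in for illustration only, not the actual ValidateQuantities implementation; the only check implemented here is a toy can_be_negative rule.

```python
import functools
import inspect

def validate_quantities(allow_skipping=False, **validations):
    """Toy stand-in for the real decorator; only checks can_be_negative."""
    def decorator(fn):
        sig = inspect.signature(fn)

        @functools.wraps(fn)
        def wrapper(*args, skip_validations=False, **kwargs):
            # Only honor skip_validations when the decorator opted in.
            if not (allow_skipping and skip_validations):
                bound = sig.bind(*args, **kwargs)
                for name, rules in validations.items():
                    value = bound.arguments.get(name)
                    if (not rules.get("can_be_negative", True)
                            and value is not None and value < 0):
                        raise ValueError(f"argument {name!r} cannot be negative")
            return fn(*args, **kwargs)

        return wrapper
    return decorator

@validate_quantities(allow_skipping=True, T={"can_be_negative": False})
def double(T):
    return 2 * T
```

With this sketch, double(-1) raises ValueError, while double(-1, skip_validations=True) bypasses the check and returns -2.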

As a test, thermal_speed is decorated such that skipping is allowed. This does not change the base behavior and is implemented like so:

@check_relativistic
@validate_quantities(
    T={"can_be_negative": False, "equivalencies": u.temperature_energy()},
    mass={"can_be_negative": False, "can_be_nan": True},
    allow_skipping=True,
)
@particles.particle_input
def thermal_speed(
    T: u.K,
    particle: Particle,
    method="most_probable",
    mass: u.kg = np.nan * u.kg,
    ndim=3,
) -> u.m / u.s:
    ...

The way thermal_speed is calculated is also modified such that the returned value always has units of u.m / u.s, regardless of whether the input arguments have (correct) units or no units. I decided to do this since: (1) when skip_validations==False we are guaranteed the input arguments are in SI, so nothing is affected, and (2) for performance, input arguments may be unitless when skip_validations==True, and without this modification @check_relativistic would be grumpy.
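The normalization described above can be sketched like this. A hedged toy, not the PR's actual code: _si_value and thermal_speed_value are hypothetical names, K_B is the SI Boltzmann constant, and the coefficient defaults to the 3-D most-probable value of 2.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (SI)

def _si_value(x):
    # Accept either a bare float (assumed SI) or anything Quantity-like
    # that exposes .si.value, such as an astropy Quantity.
    return x.si.value if hasattr(x, "si") else x

def thermal_speed_value(T, m, coef=2.0):
    # Compute on plain floats; the caller (or a decorator) re-applies
    # u.m / u.s, so the result is consistent whether or not the inputs
    # carried units.
    return math.sqrt(coef * K_B * _si_value(T) / _si_value(m))
```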

At least for thermal_speed, this appears to reduce execution time to about 40% of the original. Here is an example of how the skip_validations keyword works, along with timing comparisons.

[screenshot: skip_validations usage example and timing comparison]

@rocco8773 rocco8773 self-assigned this Mar 27, 2021
@rocco8773 rocco8773 added the plasmapy.utils Related to the plasmapy.utils subpackage label Mar 27, 2021
@github-actions github-actions bot added the plasmapy.formulary Related to the plasmapy.formulary subpackage label Mar 27, 2021
@rocco8773 rocco8773 added the refactoring ♻️ Improving an implementation without adding new functionality label Mar 27, 2021
@rocco8773 rocco8773 added this to In Progress | Draft in Formulary Development via automation Mar 27, 2021
@codecov

codecov bot commented Mar 27, 2021

Codecov Report

Merging #1089 (eb3f480) into master (5dd2547) will decrease coverage by 0.08%.
The diff coverage is 85.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1089      +/-   ##
==========================================
- Coverage   96.91%   96.82%   -0.09%     
==========================================
  Files          70       70              
  Lines        6928     6961      +33     
==========================================
+ Hits         6714     6740      +26     
- Misses        214      221       +7     
Impacted Files Coverage Δ
plasmapy/formulary/parameters.py 98.23% <83.33%> (-0.85%) ⬇️
plasmapy/utils/decorators/validators.py 95.79% <84.84%> (-4.21%) ⬇️
plasmapy/utils/decorators/converter.py 96.42% <88.88%> (+0.42%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5dd2547...eb3f480. Read the comment docs.

@pheuer
Member

pheuer commented Mar 27, 2021

@rocco8773 Thank you for trying this! It is good to get some concrete numbers on how much this could improve performance. However, I am still concerned that this current implementation doesn't go far enough. I ran my own execution-time tests and replicated your findings: thermal_speed with validations takes ~1 ms, while with the new skip_validations flag that's down to 400 µs. Using np.float64 input and the mass keyword, I got that down to 240 µs.

The problem is that the simplest possible version of this function executes in only 1 µs!

import numpy as np
from astropy.constants import k_B
k_B = k_B.si.value


_coefficients = {
    1: {"most_probable": 0, "rms": 1, "mean_magnitude": 2 / np.pi},
    2: {"most_probable": 1, "rms": 2, "mean_magnitude": np.pi / 2},
    3: {"most_probable": 2, "rms": 3, "mean_magnitude": 8 / np.pi},
}

def fast_thermal_speed(T, m, ndim, method):
    coeff = _coefficients[ndim][method]
    return np.sqrt(coeff * T / m)

The same is true for plasma_frequency and probably many of the other formulary functions. In my view, in order to be viable for actual computational work, the plasmapy.formulary functions (in their "fast mode") need to be within a factor of 2-3x of the execution time of this simplest implementation. Otherwise, how can we possibly produce codes that can compete in the software ecosystem? Poorly documented, sloppy code will always win if it's 200x faster (or 10x faster, for that matter).

Here's some line_profiler output of the new thermal_speed:

Timer unit: 1e-07 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   668      1000      44700.0     44.7      2.3      m = mass if np.isfinite(mass) else particles.particle_mass(particle)
   669                                           
   670                                               # different methods, as per https://en.wikipedia.org/wiki/Thermal_velocity
   671      1000       4999.0      5.0      0.3      try:
   672      1000       5941.0      5.9      0.3          coef = _coefficients[ndim]
   673                                               except KeyError:
   674                                                   raise ValueError(
   675                                                       f"Got unsupported value for argument 'ndim' ({ndim}), expect one of "
   676                                                       f"{list(_coefficients.keys())}."
   677                                                   )
   678      1000       3948.0      3.9      0.2      try:
   679      1000       5550.0      5.5      0.3          coef = coef[method]
   680                                               except KeyError:
   681                                                   raise ValueError(
   682                                                       f"Got unsupported value for argument 'method' ({method}), expected one "
   683                                                       f"of {list(coef.keys())}."
   684                                                   )
   685                                           
   686      1000    1045985.0   1046.0     54.1      speed = (k_B * T / m).value
   687      1000     813817.0    813.8     42.1      speed = np.sqrt(coef * speed) * u.m / u.s
   688                                           
   689      1000       6783.0      6.8      0.4      return speed

In this case I was supplying dimensionless values (np.float64) and using the mass and skip_validations keywords.

The biggest issue is with the math with u.Quantity objects at the end. I would suggest the following improvements to the decorator(s) and formulary functions:

  • No u.Quantity or astropy.units should appear in the formulary functions. If u.Quantity objects are provided, one of the decorators (maybe validate_quantities) should convert them to SI then pass in only dimensionless np.ndarray arguments. Likewise, the units should be applied to the output by the decorator only in the "slow mode". If arguments without units are provided they should go right through with little to no overhead.

  • It's slightly awkward that particle is still a required argument (and therefore @particles.particle_input gets run) when I've supplied a mass so the particle is entirely ignored. Could a clever decorator be used to take particle inputs and then pass a NamedTuple of properties (in SI units) to the underlying function like @namurphy proposed?
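A minimal sketch of the NamedTuple idea, under stated assumptions: ParticleProperties, resolve_particle, and the lookup table are hypothetical names invented here, not an existing plasmapy API, and the numeric values are CODATA SI constants.

```python
import math
from typing import NamedTuple

class ParticleProperties(NamedTuple):
    mass: float    # kg
    charge: float  # C

# Hypothetical lookup standing in for what @particles.particle_input resolves.
_KNOWN_PARTICLES = {
    "e-": ParticleProperties(mass=9.1093837015e-31, charge=-1.602176634e-19),
    "p+": ParticleProperties(mass=1.67262192369e-27, charge=1.602176634e-19),
}

def resolve_particle(particle, mass=None):
    """Resolve a particle symbol once, up front; an explicit mass wins."""
    props = _KNOWN_PARTICLES[particle]
    if mass is not None:
        props = props._replace(mass=mass)
    return props

def thermal_speed_core(T, props):
    # The core function only ever sees plain SI floats.
    k_B = 1.380649e-23
    return math.sqrt(2 * k_B * T / props.mass)
```

With this shape, resolve_particle("e-", mass=2.0e-27) overrides the tabulated electron mass, so the particle resolution step never needs to run in the hot loop once the properties tuple is built.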

Incidentally, while working on this I found a package for neatly running line_profiler in Spyder, and it works great! Paging @StanczakDominik .

@StanczakDominik
Member

StanczakDominik commented Mar 27, 2021

Peter, I agree with your point. The sky (or rather the microsecond range) is definitely the limit here for optimization, but I still think making this PR a small incremental improvement is the best way to go. It's a speed-up we can take; it'd be silly not to do it.

The more I think of your idea (outsourcing all unit IO checks to the skippable decorator), the more I'm starting to like it. So I'll cut your comment on that into a separate issue, if that's ok :)

I think your second point is essentially #918, btw!

And damn, spyder-line-profiler looks great! Makes me wish I wasn't living the terminal life... just a bit :D

@StanczakDominik StanczakDominik left a comment (Member)

Good changes in the right direction! They definitely still need tests, but I like the implementation.

raise ValueError(
fsig = inspect.signature(fn)
if "to_hz" in fsig.parameters:
raise TypeError(
Member

According to codecov, it doesn't look like this case is actually being checked anywhere. Do we know it doesn't actually clash with to_hz?

Member Author

I realized this when I was updating ValidateQuantities, so I'll add the appropriate tests to angular_freq_to_hz as well as to ValidateQuantities.

)

speed = (k_B * T / m).value
speed = np.sqrt(coef * speed) * u.m / u.s
Member

We can squeeze out a tiny bit more performance out of this by preallocating the unit somewhere in the module:

[ins] In [1]: %timeit 1 * u.m / u.s
17.9 µs ± 89 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

[ins] In [2]: unit = u.m / u.s

[ins] In [3]: %timeit 1 * unit
5.69 µs ± 126 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

[ins] In [4]: %timeit u.Quantity(1, unit)
3.99 µs ± 94.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

But with #1090 in the works we can skip that if you'd prefer.

Member Author

I don't think this will work, since you still need to set unit = u.m / u.s within thermal_speed, and that is not being timed in your last two cases. So it looks faster because you're not timing something that still needs to be a part of thermal_speed.

As a workaround, we could create a utility module, e.g. plasmapy.utils.unit_defaults, where we define all of our default units. For example,

>>> import plasmapy.utils.unit_defaults as ud
>>> ud.speed == u.m / u.s
True

Member

I think @StanczakDominik was suggesting that you could define unit outside of the function (the way some of the constants are defined)? Then it would be evaluated once when the module is imported?
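The general pattern being suggested can be illustrated with a toy Unit class rather than astropy (so this shows only the hoisting idea, not actual astropy behavior): build the composite object once at import time instead of on every call.

```python
class Unit:
    """Toy stand-in for an astropy unit; composing units costs something."""
    def __init__(self, name):
        self.name = name
    def __truediv__(self, other):
        return Unit(f"{self.name}/{other.name}")

m, s = Unit("m"), Unit("s")

# Evaluated once when the module is imported, as the review suggests
# doing for u.m / u.s:
_SPEED_UNIT = m / s

def per_call_unit():
    return m / s          # rebuilt on every call

def hoisted_unit():
    return _SPEED_UNIT    # reuses the module-level object
```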

Comment on lines +153 to +155
(Default `False`) When setting `True` NO value or unit validations will
be performed on input arguments or returned objects by
`~plasmapy.utils.decorators.validators.validate_quantities`.
Member

Suggested change
(Default `False`) When setting `True` NO value or unit validations will
be performed on input arguments or returned objects by
`~plasmapy.utils.decorators.validators.validate_quantities`.
(Default `False`) When setting `True` NO value or unit validations will
be performed on the decorated function's input arguments or returned objects by
the `~plasmapy.utils.decorators.validators.validate_quantities` decorator.

Member Author

I'm avoiding the use of "function" because the decorator works for more than functions, i.e. methods too.

plasmapy/utils/decorators/validators.py (review thread resolved)
plasmapy/utils/decorators/validators.py (review thread resolved)
@pheuer
Member

pheuer commented Mar 27, 2021

@StanczakDominik I absolutely agree - this is a good first step in the right direction!

@pheuer pheuer left a comment (Member)

I looked through the addition and it looked good! I agree that the main thing remaining is just tests.

)
bound_args.arguments[arg_name] = arg
if self.allow_skipping and skip_validations:
validations = {}
Member

Obviously needs a test

@rocco8773
Member Author

The same is true for plasma_frequency and probably many of the other formulary functions. In my view, in order to be viable for actual computational work, the plasmapy.formulary functions (in their "fast mode") need to be within a factor of 2-3x times the execution time of this simplest implementation. Otherwise, how can we possibly produce codes that can compete in the software ecosystem? Poorly documented, sloppy code will always win if it's 200x faster (or 10x faster, for that matter).

You have to remember the formulary was not originally written with large-scale computation in mind; it was written to be a formulary. For convenience and explicitness, it was decided to incorporate units and take the time penalty. Right now is really the first time the formulary is being used as a computation tool, which it originally wasn't designed to be. I absolutely agree we should put energy into speeding up the functionality, but there will always be a trade-off between the convenience and explicitness of units and the speed of computation. It looks like the formulary is heading toward being dual purpose, so there will always be a trade-off.

  • No u.Quantity or astropy.units should appear in the formulary functions. If u.Quantity objects are provided, one of the decorators (maybe validate_quantities) should convert them to SI then pass in only dimensionless np.ndarray arguments. Likewise, the units should be applied to the output by the decorator only in the "slow mode". If arguments without units are provided they should go right through with little to no overhead.

I fundamentally like the idea of having @validate_quantities handle all the units, but this won't work all the time. Some functions allow multiple types of units and behave a little differently depending on those units, which couldn't happen if you always strip the units. Some calculations take ratios of input arguments before converting to dimensionless units to reduce computational errors. So, the solution will need to be a little more than just letting @validate_quantities do everything.

  • It's slightly awkward that particle is still a required argument (and therefore @particles.particle_input gets run) when I've supplied a mass so the particle is entirely ignored. Could a clever decorator be used to take particle inputs and then pass a NamedTuple of properties (in SI units) to the underlying function like @namurphy proposed?

I agree with this, and it is why I proposed @from_particle in #918, but this is a non-trivial feature and is still under debate among the developers. @namurphy

I think your second point is essentially #918, btw!

@StanczakDominik Yep!

@pheuer
Member

pheuer commented Mar 27, 2021

You have to remember the formulary was not originally written with large scale computation in mind, it was written to be a formulary.

I understand, and the initial decision to use units makes sense in that context. My point is just that, unless the time penalty is very small, it will remain impractical to use for computation. That said, this PR is definitely progress in the right direction! I just wanted to make sure we were all on the same page about the ultimate goal/requirement.

I fundamentally like the idea of having the @validate_quantities handle all the units, but this won't work all the time. Some functions allow multiple types of units and perform a little differently depending on those units, which couldn't happen if you always strip the units. Some calculations take ratios of input arguments before converting to dimensionless units to reduce computational errors.

I see the problem: that sort of situation was why I originally proposed separating each function into a fast core and a slower wrapper. That approach does allow for more customization per function. Another option would be to refactor the offending functions so that their behavior is determined explicitly by keywords instead of implicitly by argument units?
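The fast-core/slow-wrapper separation might look roughly like this. A hedged sketch: thermal_speed_lite is a name invented here, the wrapper only does toy validation, and real unit conversion/attachment is elided.

```python
import math

_COEFFICIENTS = {
    1: {"most_probable": 0, "rms": 1},
    3: {"most_probable": 2, "rms": 3},
}
_K_B = 1.380649e-23  # Boltzmann constant, J/K

def thermal_speed_lite(T, m, ndim=3, method="most_probable"):
    """Fast core: plain SI floats in, plain SI float out, zero validation."""
    return math.sqrt(_COEFFICIENTS[ndim][method] * _K_B * T / m)

def thermal_speed(T, m, ndim=3, method="most_probable"):
    """Slow wrapper: validates inputs (and, in the real package, would
    convert and re-apply units), then delegates to the fast core."""
    if T < 0:
        raise ValueError("T cannot be negative")
    if method not in _COEFFICIENTS.get(ndim, {}):
        raise ValueError(f"unsupported ndim/method: {ndim}/{method}")
    return thermal_speed_lite(T, m, ndim=ndim, method=method)
```

Computational callers hit thermal_speed_lite directly inside their loops; interactive users keep the friendly, validated thermal_speed.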

#918

So it is: I guess count this as my vote for that kind of functionality, then.

@rocco8773
Member Author

Good news and not-so-good news. I was able to get the execution time down to ~20 µs for thermal_speed, but this is only achievable when (1) the @check_relativistic and @particle_input decorators are removed, (2) unitless values are passed directly in for T and mass, (3) skip_validations==True is set, (4) no calculations are performed on Quantities, and (5) units are not re-applied to the returned speed.

[screenshot: timing of the stripped-down thermal_speed (~20 µs)]

Now, slowly reapplying the points I mentioned above results in the following penalties:

  • Re-applying units to the returned variable adds about a 25-30 µs penalty.
  • @check_relativistic adds about a 75 µs penalty.
  • @particle_input adds about a 40 µs penalty when a Particle object is passed in, even with a unitless mass.
  • @particle_input adds about a 100 µs penalty when a string particle (i.e. "He+") is passed in, even with a unitless mass.

Removing the beneficial changes mentioned above results in...

[screenshot: timing with the beneficial changes removed]

So, my thoughts at the moment...

  1. Performing calculations on Quantities is significantly slower than just floats, ints, etc.
  2. Having @validate_quantities handle all the units is the optimal path to go. That is, @validate_quantities strips all units, passes the bare arguments into the function, and re-applies units to the returned variable. By doing this, skip_validations can bypass almost all time penalties associated with units.
  3. I think we can side step my previous concern about some functions needing units by still passing units to the decorated function when the @validate_quantities setting pass_equivalent_units==True is set.
  4. @check_relativistic will complain with the above changes, so we should probably move its functionality into CheckValues and consequently ValidateQuantities.
  5. After making the above changes to @validate_quantities, we'll have to revise @particle_input to get the last time benefits. This may include making something like @from_particle, adding a skip_particle setting, or something else not thought of yet. @namurphy
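Point 2 can be sketched as a boundary decorator. Again a toy: units_at_the_boundary is a hypothetical name, (value, unit) tuples stand in for Quantity objects, and return_unit is a plain string rather than an astropy unit.

```python
import functools

def units_at_the_boundary(return_unit):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, skip_validations=False, **kwargs):
            if skip_validations:
                # Fast path: bare numbers go straight through, no unit work.
                return fn(*args, **kwargs)
            # Slow path: strip units on the way in, re-apply on the way out.
            stripped = [a[0] if isinstance(a, tuple) else a for a in args]
            return (fn(*stripped, **kwargs), return_unit)
        return wrapper
    return decorator

@units_at_the_boundary("m/s")
def halve(x):
    return x / 2
```

Here halve((10.0, "m/s")) returns (5.0, "m/s"), while halve(10.0, skip_validations=True) returns the bare 5.0 with no unit overhead, which is the behavior point 2 is after.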

@StanczakDominik
Member

StanczakDominik commented Mar 28, 2021

Oh oh oh, I just got a great idea for a way to systematically test this flag: Hypothesis. I'll explain and throw in a PR to this branch later.

@pheuer
Member

pheuer commented Mar 28, 2021

Sounds like great progress!

@namurphy namurphy added the status: dormant PRs that are stalled label May 26, 2023
@namurphy namurphy added this to the Future milestone Apr 9, 2024
Labels
plasmapy.formulary: Related to the plasmapy.formulary subpackage
plasmapy.utils: Related to the plasmapy.utils subpackage
refactoring ♻️: Improving an implementation without adding new functionality
status: dormant: PRs that are stalled
Projects
Formulary Development
  
In Progress | Draft
Development

Successfully merging this pull request may close these issues.

None yet

4 participants