
Numpy 2.0 #689

Draft · wants to merge 12 commits into main from numpy2

Conversation

@Armavica (Member) commented Apr 3, 2024

Description

Ruff suggested a few things to change, but I cannot test because I cannot install both numpy=2.0.0rc1 and numba (any version, including the most recent 0.59.1) at the same time (incompatibilities in the Python ABI).

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

@ricardoV94 force-pushed the numpy2 branch 3 times, most recently from 28bb96a to 2808634 on April 5, 2024 at 10:50
@ricardoV94 (Member) commented Apr 5, 2024

Removed the custom Complex C types... I have no context as to why they were necessary, so that definitely needs careful review.

This is code that seems to have been there mostly from the beginning.

Edit: it seems they are needed to define operations like `*` :(

@ricardoV94 added labels on Apr 5, 2024: major, NumPy compatibility, dependencies, C-backend
@ricardoV94 force-pushed the numpy2 branch 2 times, most recently from d83b37d to eca375b on April 5, 2024 at 11:10
@Armavica (Member, Author) commented Apr 6, 2024

> Edit: it seems they are needed to define operations like `*` :(

In what context? If it is in array indexing, then that syntax was only added in Python 3.11 (I found out about this recently).

@ricardoV94 (Member) commented Apr 6, 2024

No, this is C(++) code, multiplication of complex scalars.

Maybe @michaelosthege finds this interesting xD?

These are the breaking changes: https://numpy.org/devdocs/numpy_2_0_migration_guide.html#complex-types-underlying-type-changes

maresb added a commit to maresb/pytensor that referenced this pull request Apr 6, 2024
This will help with publishing a repodata patch for conda-forge as per
<pymc-devs#688 (comment)>.
This will no longer be necessary when we are finished with
<pymc-devs#689>.
@maresb mentioned this pull request on Apr 6, 2024
ricardoV94 pushed a commit that referenced this pull request Apr 6, 2024
This will help with publishing a repodata patch for conda-forge as per
<#688 (comment)>.
This will no longer be necessary when we are finished with
<#689>.
@maresb (Contributor) commented Apr 6, 2024

Friendly note that it will be necessary to revert 1235d08 on the next rebase in order to unpin `numpy<2`.

@ricardoV94 (Member) commented May 30, 2024

Now that NumPy 2.0 is released, it would be nice to be compatible with it.

There's some esoteric C++ code for complex variables that I didn't want to dig into too much.

@brendan-m-murphy

> Now that NumPy 2.0 is released, it would be nice to be compatible with it.
>
> There's some esoteric C++ code for complex variables that I didn't want to dig into too much.

Is this PR stalled? The numpy release notes seem to spell out what changes need to be made (replace `x.real` and `x.imag` with calls to some new functions in the NumPy C-API). I can try to implement those changes.

@ricardoV94 (Member) commented Jul 1, 2024

There's something a bit more complex: we are using C++ to overload operators on "scalar" complex numbers. This goes beyond the standard use of NumPy, and their advice is not sufficient here.

@brendan-m-murphy

Okay, it's possible I'm misunderstanding. I thought that code defines structs (e.g. `pytensor_complex64`) that inherit from `npy_complex64` (an alias of `npy_cfloat`) and so on, and the numpy migration notes say that for `npy_cdouble` you should no longer access the real and imaginary parts as fields, but instead use the new functions `npy_creal` and `npy_cimag`. So all uses of `.real` etc. in the overloaded operators (and possibly some other methods) would need to be changed.

It does seem a bit annoying in that the accessor names vary with precision, e.g. `npy_creal` for `pytensor_complex128` but `npy_crealf` for `pytensor_complex64`.

I'm not certain I can fix it, but I'd be interested in trying. (Although if the above is completely off base, please let me know... in that case, I certainly don't understand the problem.)
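To make the discussion concrete, here is a minimal sketch of the pattern being described, written the way PyTensor ships C support code (as Python strings). It is illustrative only, not the actual PyTensor source; the helper name `complex_mul` is made up, while `npy_creal`/`npy_cimag`/`npy_csetreal`/`npy_csetimag` are the accessors named in the NumPy 2.0 migration guide.

```python
import numpy as np

# Illustrative sketch only: the same complex multiplication written against
# NumPy 1.x struct fields vs. the NumPy 2.0 accessor functions.
if int(np.__version__.split(".")[0]) < 2:
    complex_mul_support_code = """
    pytensor_complex128 complex_mul(const pytensor_complex128 &a,
                                    const pytensor_complex128 &b) {
        pytensor_complex128 r;
        r.real = a.real * b.real - a.imag * b.imag;  /* NumPy 1.x: fields */
        r.imag = a.real * b.imag + a.imag * b.real;
        return r;
    }
    """
else:
    complex_mul_support_code = """
    pytensor_complex128 complex_mul(const pytensor_complex128 &a,
                                    const pytensor_complex128 &b) {
        pytensor_complex128 r;
        /* NumPy 2.0: npy_cdouble wraps a C99 complex, so use accessors */
        npy_csetreal(&r, npy_creal(a) * npy_creal(b) - npy_cimag(a) * npy_cimag(b));
        npy_csetimag(&r, npy_creal(a) * npy_cimag(b) + npy_cimag(a) * npy_creal(b));
        return r;
    }
    """
```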

@ricardoV94 (Member) commented Jul 1, 2024

@brendan-m-murphy I think you're on the right track. The part where we overload operators is where I stopped. The compiler wasn't happy with my changes to use the new real/imag accessors here: https://github.com/pymc-devs/pytensor/blob/main/pytensor%2Fscalar%2Fbasic.py#L527-L538

Feel free to take a stab at it. You can push to this branch, or, if you don't have permissions, start a new PR either on top of my branch or directly against main.

@ricardoV94 (Member)

You'll have to revert "Remove custom Complex type"; that commit was just to investigate which parts of the codebase actually depended on it.

@jakirkham (Contributor) commented Jul 1, 2024

Maybe this table is helpful?

Also there is this note in the NumPy 2.0.0 release notes and this one on NumPy 1 & 2 compatibility. Namely, NumPy 2 is just using C99's complex types.

@brendan-m-murphy

> Maybe this table is helpful?
>
> Also there is this note in the NumPy 2.0.0 release notes and this one on NumPy 1 & 2 compatibility. Namely, NumPy 2 is just using C99's complex types.

Yes, I think the move to C99 types is the reason for the switch to `npy_creal`: the C99 function for getting the real part of a complex double is `creal`, etc.
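For reference, a small illustration of the C99 accessors in question (kept as a C snippet in a string, in keeping with how C code appears in this codebase; illustrative only):

```python
# Illustrative only: the C99 complex accessors the comment refers to.
c99_complex_snippet = """
#include <complex.h>

double complex z = 1.0 + 2.0 * I;
double re = creal(z);   /* 1.0 */
double im = cimag(z);   /* 2.0 */
/* float / long double variants: crealf/cimagf and creall/cimagl,
   mirroring NumPy's npy_crealf / npy_creal / npy_creall naming */
"""
```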

@brendan-m-murphy commented Jul 8, 2024

Just an update: I've updated the code for `ScalarType`, which reduced the number of failing tests. (This is the PR on my fork: #905. It includes all the commits on this PR up to the commit that removed the complex variable code.)

There are still some failing tests related to NumPy's complex-number changes in `AdvancedIncSubtensor1`, around lines 2246-2289 in `tensor/subtensor.py`. I'm not sure what this part of the code is doing yet, although the changes for numpy 2.0 compatibility should still amount to replacing `.real` etc. with the right `npy_creal` variant.

An issue I had with the `ScalarType` code, which is also going to be a problem for the `AdvancedIncSubtensor1` class, is that the numpy C types `npy_complex64` and `npy_complex128` are aliases of types like `npy_cfloat`, `npy_cdouble`, and `npy_clongdouble`, while the functions to get/set the real and imaginary parts are defined in terms of float, double, and long double. So we need to "reverse" the typedefs to figure out whether `npy_complex64` is `npy_cfloat` or `npy_cdouble`, etc.
(These typedefs are in `numpy/_core/include/numpy/npy_common.h`.)

`AdvancedIncSubtensor1` also defines code for `npy_complex32`, which, as far as I can tell, isn't defined anywhere, so I guess we should remove that.

The updates I made will work for 32- and 64-bit machines, assuming that long double on the 32-bit machine is actually 64 bits wide. To cover more configurations, I suppose we could copy numpy's setup: define everything in terms of float, double, and long double, and then have platform-specific aliases for the types named by bit width.
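One way to express the "reversed typedef" at the code-generation layer is to pick the accessor suffix from the dtype. This is a hypothetical sketch (the helper name and mapping are illustrative, and the mapping assumes the usual typedefs `npy_complex64 == npy_cfloat` and `npy_complex128 == npy_cdouble`):

```python
# Hypothetical helper: map a complex dtype to the NumPy 2.0 accessor
# suffix when emitting C code. Assumes the usual platform typedefs.
_ACCESSOR_SUFFIX = {
    "complex64": "f",   # npy_cfloat  -> npy_crealf / npy_cimagf / npy_csetrealf
    "complex128": "",   # npy_cdouble -> npy_creal  / npy_cimag  / npy_csetreal
}

def real_accessor(dtype: str) -> str:
    """Name of the C function that reads the real part of this dtype."""
    return f"npy_creal{_ACCESSOR_SUFFIX[dtype]}"

assert real_accessor("complex64") == "npy_crealf"
assert real_accessor("complex128") == "npy_creal"
```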

@brendan-m-murphy commented Jul 9, 2024

I made some small changes to handle the typedefs for `npy_cdouble` etc. more smoothly. I think those changes are pretty much ready to merge into this branch. There are a lot of mypy issues, but I suspect they're not related to my changes, which were mostly to the C code. I'll try to tidy it up and then request a review.

I think there are serious problems with the `tensor/subtensor.py` module. The "map iter" part of the NumPy C-API is no longer public, so I think the C code in `AdvancedIncSubtensor1` will need to change substantially. I can ask on the numpy mailing list what they would suggest. (The PR that removes it is numpy/numpy#24274, which is part of the NumPy 2.0 release.)

There are some suggestions on that PR and this Aesara issue: aesara-devs/aesara#1506

I could try to work on this, but I suspect it would take me longer than the complex type updates.

@ricardoV94 (Member) commented Jul 9, 2024

Thanks @brendan-m-murphy. For context, `AdvancedIncSubtensor1` implements a subset of NumPy advanced indexing: it sets or increments a vector of values (possibly broadcasted) at a vector of integer indices along the first dimension of an array (and nothing else).

It is the only form of advanced indexing implemented in C. Looking at the C implementation, it seems like quite a mess, and possibly overly complicated/inefficient. For example, why is the set_or_inc decision made in the innermost function, in a way that the compiler can never know which case it is?

Anyway, since we are trying to part ways with the C backend, I think we can make this operator's C implementation conditional on the installed numpy version. If we raise NotImplementedError from `def c_code(...)` for numpy >= 2.0, pytensor will just fall back to the Python implementation instead.

Otherwise, if someone feels like redoing the C implementation, help is welcome. I think the only issue is replacing the MapIter logic.
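A minimal sketch of that version-conditional fallback (hypothetical class and strings; the real Op carries much more logic):

```python
import numpy as np

NUMPY_MAJOR = int(np.__version__.split(".")[0])

class AdvancedIncSubtensor1Sketch:
    """Hypothetical sketch of the proposed version-conditional C code."""

    def c_code(self, node, name, inputs, outputs, sub):
        if NUMPY_MAJOR >= 2:
            # The linker catches this and wraps perform() in a thunk instead.
            raise NotImplementedError("MapIter is no longer public in NumPy 2.0")
        return "/* ... the existing MapIter-based C code ... */"

    def perform(self, node, inputs, output_storage):
        x, y, idx = inputs
        out = x.copy()
        np.add.at(out, idx, y)  # unbuffered add along the first dimension
        output_storage[0][0] = out
```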

@ricardoV94 (Member) commented Jul 9, 2024

@brendan-m-murphy regarding complex32: I also don't recall it being a thing these days. Just drop it.

@ricardoV94 (Member)

Last thing, @brendan-m-murphy: if possible, it would be great if the complex stuff worked under both numpy 1.x and 2.x. Maybe it already does; I'm just stating it in case it was never brought up.

@brendan-m-murphy commented Jul 9, 2024

> Last thing, @brendan-m-murphy: if possible, it would be great if the complex stuff worked under both numpy 1.x and 2.x. Maybe it already does; I'm just stating it in case it was never brought up.

It doesn't currently, but I think it's an easy change (I'm pretty sure the numpy release notes say how). I suppose I could make a PR against main then (I'd probably use a new branch for that with just the complex variable changes). Also, I could make the complex variable updates to subtensor.py, so that if the MapIter code gets fixed, there's nothing left to patch up there.

@ricardoV94 (Member)

We can tweak the test CI to run with both versions of numpy

@brendan-m-murphy

> Thanks @brendan-m-murphy. For context, `AdvancedIncSubtensor1` implements a subset of NumPy advanced indexing: it sets or increments a vector of values (possibly broadcasted) at a vector of integer indices along the first dimension of an array (and nothing else).
>
> It is the only form of advanced indexing implemented in C. Looking at the C implementation, it seems like quite a mess, and possibly overly complicated/inefficient. For example, why is the set_or_inc decision made in the innermost function, in a way that the compiler can never know which case it is?
>
> Anyway, since we are trying to part ways with the C backend, I think we can make this operator's C implementation conditional on the installed numpy version. If we raise NotImplementedError from `def c_code(...)` for numpy >= 2.0, pytensor will just fall back to the Python implementation instead.
>
> Otherwise, if someone feels like redoing the C implementation, help is welcome. I think the only issue is replacing the MapIter logic.

So if the C code isn't available, the PerformLinker would be used for an Apply node, even if the mode says to use the CLinker?

This (rather old) commit seems to already implement a version of the fancy-indexing increment using numpy for the "perform" function: c60bd3b (all the C code is missing there, but it was moved into the class a few years ago).

@ricardoV94 (Member) commented Jul 11, 2024

The linker doesn't change, but the default mode already handles mixing C and Python implementations. When an Op raises NotImplementedError for its C code, a thunk is built around the Python perform method instead.

@brendan-m-murphy

Oh okay, so raising the NotImplementedError would mean using np.add.at, which seems to be the suggested replacement for MapIter: aesara-devs/aesara#1506 (comment). (This is exactly what you said, but it took me a while to figure out what code would be used with the Python linker, and I didn't read far enough to see that the CLinker would default to that.)

I guess there could be a performance hit, but it sounds like the numpy team is working on improving np.add.at, so if we do see a slowdown, maybe it's something they'll fix.
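For concreteness, the behavioral difference that makes np.add.at the right replacement here (plain NumPy, runnable as-is):

```python
import numpy as np

idx = np.array([0, 0, 1])
y = np.array([1.0, 2.0, 3.0])

x = np.zeros(5)
x[idx] += y            # buffered: the two writes to index 0 collide
print(x)               # [2. 3. 0. 0. 0.]

x = np.zeros(5)
np.add.at(x, idx, y)   # unbuffered: repeated indices accumulate
print(x)               # [3. 3. 0. 0. 0.]
```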

I'll try to get my PR in soon. I had to move to other things at work, but should be able to finish it in an evening or two.

@ricardoV94 (Member)

The slowdown seems worthwhile, especially if np.add.at gets faster (and we make sure to use it in the Python perform method). This code is pretty hard to maintain, especially with numpy deprecating it. I'm a bit puzzled that the numpy C-API has nothing very useful for iterating over fancy indexes.

@brendan-m-murphy

Just an update: raising a NotImplementedError for the advanced inc subtensor when using numpy 2.0 removes the errors around MapIter (as expected), but it revealed more errors related to the complex arithmetic. I haven't had a lot of time to look into these new errors, but I'll try to find out more this week.

@ricardoV94 (Member)

> Just an update: raising a NotImplementedError for the advanced inc subtensor when using numpy 2.0 removes the errors around MapIter (as expected), but it revealed more errors related to the complex arithmetic. I haven't had a lot of time to look into these new errors, but I'll try to find out more this week.

Thanks 🙏 Hopefully we're getting there!

@brendan-m-murphy commented Jul 29, 2024

Some of the tests in tests/tensor/random/rewriting are failing with: `TypeError: Scalar(float32, shape=()) cannot store accurately value 1e-06, it would be represented as 9.999999974752427e-07. If you do not mind this precision loss, you can: 1) explicitly convert your data to a numpy array of dtype float32, or 2) set "allow_input_downcast=True" when calling "function".`

This is for test_Subtensor_lift with param tuples 17 and 18, and a few more.

The error comes from trying to create a TensorConstant from 1e-6. Ultimately, the check `np.all(np.array(1e-6) == np.array(1e-6).astype('float32'))` fails, leading to this error. If you do `float(np.array(1e-6).astype('float32'))` or `np.array(1e-6).astype('float32').astype('float64')`, you get 9.999999974752427e-07.

Locally (on a MacBook M2), this check fails with both numpy 2.0 and numpy 1.26. I guess I don't see why this error shouldn't occur, since to get to this point in TensorType.filter (around line 214), we know that the data we're trying to convert is 1) a float, 2) not the same type as config.floatX, and 3) for this test, not a value that can be represented exactly in binary floating point. (The same problem doesn't happen if you replace 1e-6 with 2**-6.)
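The failing check in isolation (runnable illustration of the values quoted above):

```python
import numpy as np

x = np.array(1e-6)                       # float64
print(np.all(x == x.astype("float32")))  # False: 1e-6 is inexact in float32
print(float(x.astype("float32")))        # 9.999999974752427e-07

y = np.array(2.0**-6)                    # exactly representable in binary
print(np.all(y == y.astype("float32")))  # True
```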

I haven't tested that the complex-number updates work on numpy 1.26 yet, but with numpy 2.0, basic tensor operations now work with complex types. The tests with the latest updates are here: https://github.com/brendan-m-murphy/pytensor/actions/runs/10144114724/job/28047006005?pr=2. I'll update the workflow to run against numpy 1.26 as well.

@ricardoV94 (Member) commented Jul 29, 2024

That failure doesn't seem too worrisome; I guess it has to do with numpy's change in casting behavior.

The float32 tests are always a bit of a pain around here. Feel free to cast explicitly to make the test pass, or tweak the test parameters so we're not testing values that hit this precision problem. This is completely unrelated to the main goal of those tests.

@brendan-m-murphy

I'm still working through the failing tests. For test_inner_composite in tests/scalar/test_loop.py, there is a composite function that should always evaluate to 1, but when a float16 is used for the variable in the composite, there is an overflow. It seems like the test was designed to check that this is avoided, since for a smaller n_steps the overflow doesn't happen, and n_steps is increased in this part of the test. This doesn't fail on numpy 1.26. Maybe it has to do with the upcasting code?
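For reference, float16 overflows in exp far earlier than wider floats, which is consistent with the symptom described (illustrative values, runnable as-is):

```python
import numpy as np

# float16 tops out around 65504, so exp overflows for inputs above ~11.09.
print(np.exp(np.float16(12.0)))  # inf (with a RuntimeWarning)
print(np.exp(np.float64(12.0)))  # 162754.79141900392
```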

For TestScan.test_inner_graph_cloning in tests/scan/test_basic.py, the test fails on numpy 2.0 as is, but if Mode(optimizer=None) is changed to Mode(linker="cvm") or Mode(linker="py"), it passes on numpy 2.0. It passes as is on numpy 1.26. (Actually, just removing the mode argument makes it pass, but I guess that selects cvm by default?) For now, I've removed the mode argument, but maybe it's worth investigating further.

For the random variable tests, I had to replace copy(rng) with deepcopy(rng). I'm not really sure why this needed to change; Generator isn't mentioned in the numpy 2.0 release notes (as far as I can tell).

@brendan-m-murphy

In tests/tensor/basic.py, test_autocast_custom is failing because pt.as_tensor_variable(1.1) returns a float32; here, using with autocast_float_as("float32", "float64") makes no difference. We have (pt.fvector() + 1.1).dtype == 'float32' in either case. The problem(?) seems to be that 1.1 == np.asarray(1.1).astype('float32') is true in numpy 2.0 but not in numpy 1.26 (in numpy 2.0 it evaluates to np.True_, and in numpy 1.26 it evaluates to False).
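A minimal illustration of the difference (assuming the NEP 50 promotion rules mentioned in the next comment are the explanation):

```python
import numpy as np

x32 = np.asarray(1.1).astype("float32")

# NumPy 2.0 (NEP 50): the Python float is "weak", so it is cast to the
# array dtype and the comparison happens in float32 -> np.True_.
# NumPy 1.x (legacy promotion): the 0-d float32 array and the float64
# scalar promote to float64, so the comparison is exact -> False.
print(1.1 == x32)
```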

@brendan-m-murphy

Maybe the answers to this last problem and the overflow problem are in NEP 50: https://numpy.org/neps/nep-0050-scalar-promotion.html

One other part of the test suite with many failures is the rewriting tests for ops involving random variables (e.g. test_DimShuffle_lift). I haven't found any obvious problems here. When the rtol check fails, it is usually the smaller values causing the problem. It's possible this is also a precision issue, but I haven't looked closely enough to know.

@ricardoV94 (Member)

Do you have a branch with the most recent changes? We can start triaging the failing tests and see what's problematic vs inconsequential

@brendan-m-murphy

Here is the most recent version: brendan-m-murphy#2

@brendan-m-murphy commented Aug 7, 2024

I'll make a list of failing tests here (as I look into them):

  1. tests/tensor/special.py: several failing tests; the first is TestSoftmax.test_perform. I fixed the error raised by the use of np.MAXDIMS, but now the values found for axis=None don't agree with those computed by scipy. This test case passes if the function is compiled with the "c" or "py" linkers, but fails on the "cvm" (default) linker. (Although if linker="c", the test fails for axis=-1. For some reason, when axis=-1, the sums along axis -1 aren't 1, but if you divide by the sums along axis -1, you get the correct result.) Update 1: this now fails for the default linker and the C linker for axis=-1; all other tests pass. The change was updating numpy from the pinned version to the latest. Update 2: the Python code (scipy code written in terms of numpy) seems to be faster on a 200 x 300 matrix, although maybe there is an advantage to the C code if many ops are compiled together. Update 3: this might be related to the problem where NPY_RAVEL_AXIS is -1, which should be fixed in the next numpy release.
  2. tests/scalar/test_loop.py: test_inner_composite fails because of an overflow when computing exp of a float16. This particular failure isn't important, but it might be a symptom of numpy's changes to casting rules.
  3. tests/scan/test_basic.py: TestScan class, methods test_grad_one_output, test_grad_multple_outs, test_grad_multiple_outs_taps, test_grad_multiple_outs_taps_backwards
  4. tests/sparse/test_var.py: TestSparseVariable::test_unary[argmax/argmin-DenseTensorType-None-None], numpy AxisError, probably related to the change from NPY_MAXDIMS to NPY_RAVEL_AXIS for indicating axis=None.
  5. tests/sparse/test_var.py: TestSparseVariable::test_unary[cumsum/cumprod-DenseTensorType-None-None], CumOp attribute error for "c_axis". Update: fixed the "c_axis" problem, but now CumSum and CumProd fail due to an "OutputGuard" optimization (according to DebugMode). The error message is "ValueError: provided out is the wrong size for the accumulation". This problem occurs for dense tensors, so it's probably not to do with the sparse code. Update: the code to fix "c_axis" wasn't producing the correct value of NPY_RAVEL_AXIS, so I changed the c_code to use NPY_RAVEL_AXIS if self.axis is None (see the sketch after this list). Now there is an error coming from a ufunc for accumulating sums; the error is due to the value of a variable that numpy sets. I have a (slightly ugly) workaround, and I've raised an issue with numpy. Update: there is a typo in numpy that sets NPY_RAVEL_AXIS to -1, which is causing this problem.
  6. tests/tensor/test_extra_ops.py::TestCumOp: test_grad/test_infer_shape/test_cum_op attribute error, missing "c_axis". Update: fixed the "c_axis" error; now we get the same ValueError as in case 5.
  7. tests/tensor/random/rewriting/test_basic.py: for test_Subtensor_lift, params 25 and 28 raise TypeError: __str__ returned an AxisError; params 17, 18, 19, 20, 21, 22, and 23 raise TypeError due to casting (cannot store 1e-6 as float32); for test_DimShuffle_lift, several parametric tests fail.
  8. tests/tensor/random: test_random_maker_op in test_op.py; TestRandomGeneratorType in test_type.py has two failing methods; TestSharedRandomStream::test_tutorial is failing in test_utils.py.
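For items 4-6 above, a hypothetical sketch of the axis-sentinel change at the code-generation layer (the helper name is illustrative, not from the PR):

```python
import numpy as np

NUMPY_MAJOR = int(np.__version__.split(".")[0])

def axis_token(axis) -> str:
    """Hypothetical: emit the C token meaning `axis=None` vs a literal axis.

    NumPy 2.0 replaces the NPY_MAXDIMS sentinel with NPY_RAVEL_AXIS.
    """
    if axis is None:
        return "NPY_RAVEL_AXIS" if NUMPY_MAJOR >= 2 else "NPY_MAXDIMS"
    return str(axis)
```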
