optype

Typing Protocols for Precise Type Hints in Python 3.12+

Building blocks for precise & flexible type hints.


Installation

Optype is available as optype on PyPI:

pip install optype

For optional NumPy support, it is recommended to use the numpy extra. This ensures that the installed numpy version is compatible with optype, following NEP 29 and SPEC 0.

pip install "optype[numpy]"

See the optype.numpy docs for more info.

Example

Let's say you're writing a twice(x) function, that evaluates 2 * x. Implementing it is trivial, but what about the type annotations?

Because twice(2) == 4, twice(3.14) == 6.28, and twice('I') == 'II', it might seem like a good idea to type it as twice[T](x: T) -> T: .... However, that wouldn't cover cases such as twice(True) == 2 or twice((42, True)) == (42, True, 42, True), where the input and output types differ. Moreover, twice should accept any type with a custom __rmul__ method that accepts 2 as argument.

This is where optype comes in handy: it has single-method protocols for all the builtin special methods. For twice, we can use optype.CanRMul[T, R], which, as the name suggests, is a protocol with (only) the def __rmul__(self, lhs: T) -> R: ... method. With this, the twice function can be written as:

Python 3.10:

from typing import Literal
from typing import TypeAlias, TypeVar
from optype import CanRMul

R = TypeVar('R')
Two: TypeAlias = Literal[2]
RMul2: TypeAlias = CanRMul[Two, R]

def twice(x: RMul2[R]) -> R:
    return 2 * x

Python 3.12+:

from typing import Literal
from optype import CanRMul

type Two = Literal[2]
type RMul2[R] = CanRMul[Two, R]

def twice[R](x: RMul2[R]) -> R:
    return 2 * x
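
As a quick runtime illustration, any object with a suitable __rmul__ works with twice. The Roman class below is a hypothetical example, not part of optype:

```python
class Roman:
    """Hypothetical example class: a Roman numeral string wrapper."""

    def __init__(self, s: str) -> None:
        self.s = s

    def __rmul__(self, lhs: int) -> 'Roman':
        # 2 * Roman('I') ends up here, after int.__mul__ returns NotImplemented
        return Roman(self.s * lhs)


print((2 * Roman('I')).s)  # II
```

So twice(Roman('I')) type-checks against CanRMul[Literal[2], Roman] as well.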

But what about types that implement __mul__ but not __rmul__? In that case, we could fall back to returning x * 2 (assuming commutativity). Because the optype.Can* protocols are runtime-checkable, the revised twice2 function can be compactly written as:

Python 3.10:

from optype import CanMul

Mul2: TypeAlias = CanMul[Two, R]
CMul2: TypeAlias = Mul2[R] | RMul2[R]

def twice2(x: CMul2[R]) -> R:
    if isinstance(x, CanRMul):
        return 2 * x
    else:
        return x * 2

Python 3.12+:

from optype import CanMul

type Mul2[R] = CanMul[Two, R]
type CMul2[R] = Mul2[R] | RMul2[R]

def twice2[R](x: CMul2[R]) -> R:
    if isinstance(x, CanRMul):
        return 2 * x
    else:
        return x * 2

See examples/twice.py for the full example.
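
The runtime behaviour of twice2 can be sketched without optype by hand-rolling simplified, non-generic stand-ins for CanRMul and CanMul (the real protocols are generic; these only mirror their runtime shape):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class CanRMul(Protocol):  # simplified stand-in for optype.CanRMul
    def __rmul__(self, lhs, /): ...


@runtime_checkable
class CanMul(Protocol):  # simplified stand-in for optype.CanMul
    def __mul__(self, rhs, /): ...


class OnlyMul:
    """Implements __mul__ but not __rmul__."""

    def __mul__(self, rhs):
        return ('mul', rhs)


def twice2(x):
    # runtime-checkable protocols make the fallback a simple isinstance check
    if isinstance(x, CanRMul):
        return 2 * x
    return x * 2


print(twice2(3.14))       # 6.28, since float has __rmul__
print(twice2(OnlyMul()))  # ('mul', 2), via the x * 2 fallback
```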

Reference

The API of optype is flat; a single import optype is all you need.

There are four flavors of things that live within optype:

  • optype.Can{} types describe what can be done with it. For instance, any CanAbs[T] type can be used as argument to the abs() builtin function with return type T. Most Can{} implement a single special method, whose name directly matches that of the type. CanAbs implements __abs__, CanAdd implements __add__, etc.
  • optype.Has{} is the analogue of Can{}, but for special attributes. HasName has a __name__ attribute, HasDict has a __dict__, etc.
  • optype.Does{} describe the type of operators. So DoesAbs is the type of the abs({}) builtin function, and DoesPos the type of the +{} prefix operator.
  • optype.do_{} are the correctly-typed implementations of Does{}. For each do_{} there is a Does{}, and vice versa. So do_abs: DoesAbs is the typed alias of abs({}), and do_pos: DoesPos is a typed version of operator.pos. The optype.do_ operators are more complete than operator, have runtime-accessible type annotations, and have names you don't need to know by heart.
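
The four flavors can be sketched with hand-rolled, simplified stand-ins (the real optype definitions are generic and more precise), using abs() as the running example:

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class CanAbs(Protocol):  # "Can": what can be done with the type
    def __abs__(self) -> Any: ...


class HasName(Protocol):  # "Has": a special attribute
    __name__: str


class DoesAbs(Protocol):  # "Does": the type of the operator itself
    def __call__(self, x: CanAbs, /) -> Any: ...


do_abs: DoesAbs = abs  # "do_": a correctly-typed implementation

print(do_abs(-3))              # 3
print(isinstance(-3, CanAbs))  # True: int implements __abs__
```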

The reference docs are structured as follows:

Core functionality

All typing protocols here live in the root optype namespace. They are runtime-checkable so that you can do e.g. isinstance('snail', optype.CanAdd), in case you want to check whether snail implements __add__.

Unlike collections.abc, optype's protocols aren't abstract base classes, i.e. they don't extend abc.ABC, only typing.Protocol. This allows the optype protocols to be used as building blocks for .pyi type stubs.

Builtin type conversion

The return type of these special methods is invariant. Python will raise an error if some other (sub)type is returned. This is why these optype interfaces don't accept generic type arguments.

operator operand
expression function type method type
complex(_) do_complex DoesComplex __complex__ CanComplex
float(_) do_float DoesFloat __float__ CanFloat
int(_) do_int DoesInt __int__ CanInt[R: int = int]
bool(_) do_bool DoesBool __bool__ CanBool[R: bool = bool]
bytes(_) do_bytes DoesBytes __bytes__ CanBytes[R: bytes = bytes]
str(_) do_str DoesStr __str__ CanStr[R: str = str]

Note

The Can* interfaces of the types that can be used as typing.Literal accept an optional type parameter R. This can be used to indicate a literal return type, for surgically precise typing; e.g. None, True, and 42 are instances of CanBool[Literal[False]], CanInt[Literal[1]], and CanStr[Literal['42']], respectively.
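
For example, a class whose __int__ is annotated with a literal return type satisfies both int() and a simplified, hand-rolled CanInt check (the real optype.CanInt is generic in R):

```python
from typing import Literal, Protocol, runtime_checkable


@runtime_checkable
class CanInt(Protocol):  # simplified stand-in for optype.CanInt
    def __int__(self) -> int: ...


class Answer:
    def __int__(self) -> Literal[42]:
        # a type checker sees int(Answer()) as Literal[42]
        return 42


print(int(Answer()))                 # 42
print(isinstance(Answer(), CanInt))  # True
```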

These formatting methods are allowed to return instances that are a subtype of the str builtin. The same holds for the __format__ argument. So if you're a 10x developer that wants to hack Python's f-strings, but only if your type hints are spot-on, optype is your friend.

operator operand
expression function type method type
repr(_) do_repr DoesRepr __repr__ CanRepr[R: str = str]
format(_, x) do_format DoesFormat __format__ CanFormat[T: str = str, R: str = str]

Additionally, optype provides protocols for types with (custom) hash or index methods:

operator operand
expression function type method type
hash(_) do_hash DoesHash __hash__ CanHash
_.__index__() (docs) do_index DoesIndex __index__ CanIndex[R: int = int]

Rich relations

The "rich" comparison special methods often return a bool. However, instances of any type can be returned (e.g. a numpy array). This is why the corresponding optype.Can* interfaces accept a second type argument for the return type, that defaults to bool when omitted. The first type parameter matches the passed method argument, i.e. the right-hand side operand, denoted here as x.

operator operand
expression reflected function type method type
_ == x x == _ do_eq DoesEq __eq__ CanEq[T = object, R = bool]
_ != x x != _ do_ne DoesNe __ne__ CanNe[T = object, R = bool]
_ < x x > _ do_lt DoesLt __lt__ CanLt[T, R = bool]
_ <= x x >= _ do_le DoesLe __le__ CanLe[T, R = bool]
_ > x x < _ do_gt DoesGt __gt__ CanGt[T, R = bool]
_ >= x x <= _ do_ge DoesGe __ge__ CanGe[T, R = bool]
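
A sketch of why the return type parameter matters: a hypothetical element-wise vector comparison, where __eq__ returns a list[bool] rather than a bool, much like numpy arrays do:

```python
class Vector:
    """Hypothetical example; its __eq__ is CanEq[Vector, list[bool]]-shaped."""

    def __init__(self, xs: list[float]) -> None:
        self.xs = xs

    def __eq__(self, other: object) -> 'list[bool]':  # type: ignore[override]
        if isinstance(other, Vector):
            # element-wise comparison: a list of bools, not a single bool
            return [a == b for a, b in zip(self.xs, other.xs)]
        return NotImplemented  # type: ignore[return-value]


print(Vector([1, 2]) == Vector([1, 3]))  # [True, False]
```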

Binary operations

In the Python docs, these are referred to as "arithmetic operations". But the operands aren't limited to numeric types, and the operations aren't required to be commutative, might be non-deterministic, and could have side-effects. Classifying them as "arithmetic" is, at the very least, a bit of a stretch.

operator operand
expression function type method type
_ + x do_add DoesAdd __add__ CanAdd[T, R]
_ - x do_sub DoesSub __sub__ CanSub[T, R]
_ * x do_mul DoesMul __mul__ CanMul[T, R]
_ @ x do_matmul DoesMatmul __matmul__ CanMatmul[T, R]
_ / x do_truediv DoesTruediv __truediv__ CanTruediv[T, R]
_ // x do_floordiv DoesFloordiv __floordiv__ CanFloordiv[T, R]
_ % x do_mod DoesMod __mod__ CanMod[T, R]
divmod(_, x) do_divmod DoesDivmod __divmod__ CanDivmod[T, R]
_ ** x or pow(_, x) do_pow/2 DoesPow __pow__ CanPow2[T, R] or CanPow[T, None, R, Never]
pow(_, x, m) do_pow/3 DoesPow __pow__ CanPow3[T, M, R] or CanPow[T, M, Never, R]
_ << x do_lshift DoesLshift __lshift__ CanLshift[T, R]
_ >> x do_rshift DoesRshift __rshift__ CanRshift[T, R]
_ & x do_and DoesAnd __and__ CanAnd[T, R]
_ ^ x do_xor DoesXor __xor__ CanXor[T, R]
_ | x do_or DoesOr __or__ CanOr[T, R]

Note

Because pow() can take an optional third argument, optype provides separate interfaces for pow() with two and three arguments. Additionally, there is the overloaded intersection type CanPow[T, M, R, RM] =: CanPow2[T, R] & CanPow3[T, M, RM], as interface for types that can take an optional third argument.

Reflected operations

For the binary infix operators above, optype additionally provides interfaces with reflected (swapped) operands, e.g. __radd__ is a reflected __add__. They are named like the original, but with an additional R prefix, i.e. __name__.replace('Can', 'CanR').

operator operand
expression function type method type
x + _ do_radd DoesRAdd __radd__ CanRAdd[T, R]
x - _ do_rsub DoesRSub __rsub__ CanRSub[T, R]
x * _ do_rmul DoesRMul __rmul__ CanRMul[T, R]
x @ _ do_rmatmul DoesRMatmul __rmatmul__ CanRMatmul[T, R]
x / _ do_rtruediv DoesRTruediv __rtruediv__ CanRTruediv[T, R]
x // _ do_rfloordiv DoesRFloordiv __rfloordiv__ CanRFloordiv[T, R]
x % _ do_rmod DoesRMod __rmod__ CanRMod[T, R]
divmod(x, _) do_rdivmod DoesRDivmod __rdivmod__ CanRDivmod[T, R]
x ** _ or pow(x, _) do_rpow DoesRPow __rpow__ CanRPow[T, R]
x << _ do_rlshift DoesRLshift __rlshift__ CanRLshift[T, R]
x >> _ do_rrshift DoesRRshift __rrshift__ CanRRshift[T, R]
x & _ do_rand DoesRAnd __rand__ CanRAnd[T, R]
x ^ _ do_rxor DoesRXor __rxor__ CanRXor[T, R]
x | _ do_ror DoesROr __ror__ CanROr[T, R]

Note

CanRPow corresponds to CanPow2; the 3-parameter "modulo" pow does not reflect in Python.

According to the relevant python docs:

Note that ternary pow() will not try calling __rpow__() (the coercion rules would become too complicated).
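
This can be observed at runtime (hypothetical Exp class): binary pow falls back to __rpow__, but ternary pow does not:

```python
class Exp:
    """Hypothetical; only implements the reflected __rpow__."""

    def __rpow__(self, base):
        return ('rpow', base)


print(2 ** Exp())  # ('rpow', 2), via Exp.__rpow__

try:
    pow(2, Exp(), 10)  # ternary pow never tries __rpow__
except TypeError as e:
    print('TypeError:', e)
```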

Inplace operations

Similar to the reflected ops, the inplace/augmented ops are prefixed with CanI, namely:

operator operand
expression function type method types
_ += x do_iadd DoesIAdd __iadd__ CanIAdd[T, R] or CanIAddSelf[T]
_ -= x do_isub DoesISub __isub__ CanISub[T, R] or CanISubSelf[T]
_ *= x do_imul DoesIMul __imul__ CanIMul[T, R] or CanIMulSelf[T]
_ @= x do_imatmul DoesIMatmul __imatmul__ CanIMatmul[T, R] or CanIMatmulSelf[T]
_ /= x do_itruediv DoesITruediv __itruediv__ CanITruediv[T, R] or CanITruedivSelf[T]
_ //= x do_ifloordiv DoesIFloordiv __ifloordiv__ CanIFloordiv[T, R] or CanIFloordivSelf[T]
_ %= x do_imod DoesIMod __imod__ CanIMod[T, R] or CanIModSelf[T]
_ **= x do_ipow DoesIPow __ipow__ CanIPow[T, R] or CanIPowSelf[T]
_ <<= x do_ilshift DoesILshift __ilshift__ CanILshift[T, R] or CanILshiftSelf[T]
_ >>= x do_irshift DoesIRshift __irshift__ CanIRshift[T, R] or CanIRshiftSelf[T]
_ &= x do_iand DoesIAnd __iand__ CanIAnd[T, R] or CanIAndSelf[T]
_ ^= x do_ixor DoesIXor __ixor__ CanIXor[T, R] or CanIXorSelf[T]
_ |= x do_ior DoesIOr __ior__ CanIOr[T, R] or CanIOrSelf[T]

These inplace operators usually return self (after some in-place mutation). But unfortunately, it currently isn't possible to use Self for this (i.e. something like type MyAlias[T] = optype.CanIAdd[T, Self] isn't allowed). So to help ease this unbearable pain, optype comes equipped with ready-made aliases for you to use. They bear the same name, with an additional *Self suffix, e.g. optype.CanIAddSelf[T].
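
A minimal sketch of an in-place operator that returns self (hypothetical Accum class), i.e. the shape that CanIAddSelf[T] describes:

```python
class Accum:
    """Hypothetical accumulator; __iadd__ mutates in place and returns self."""

    def __init__(self) -> None:
        self.total = 0

    def __iadd__(self, x: int) -> 'Accum':
        self.total += x
        return self  # in-place operators conventionally return self


a = Accum()
a += 5   # rebinds a to the return value of a.__iadd__(5), i.e. a itself
a += 37
print(a.total)  # 42
```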

Unary operations

operator operand
expression function type method types
+_ do_pos DoesPos __pos__ CanPos[R] or CanPosSelf
-_ do_neg DoesNeg __neg__ CanNeg[R] or CanNegSelf
~_ do_invert DoesInvert __invert__ CanInvert[R] or CanInvertSelf
abs(_) do_abs DoesAbs __abs__ CanAbs[R] or CanAbsSelf

Rounding

The round() built-in function takes an optional second argument. From a typing perspective, round() has two overloads: one with one parameter, and one with two. For both overloads, optype provides separate operand interfaces: CanRound1[R] and CanRound2[T, RT]. Additionally, optype also provides their (overloaded) intersection type: CanRound[T, R, RT] = CanRound1[R] & CanRound2[T, RT].

operator operand
expression function type method type
round(_) do_round/1 DoesRound __round__/1 CanRound1[T = int]
round(_, n) do_round/2 DoesRound __round__/2 CanRound2[T = int, RT = float]
round(_, n=...) do_round DoesRound __round__ CanRound[T = int, R = int, RT = float]

For example, type-checkers will mark the following code as valid (tested with pyright in strict mode):

from optype import CanRound, CanRound1, CanRound2

x: float = 3.14
x1: CanRound1[int] = x
x2: CanRound2[int, float] = x
x3: CanRound[int, int, float] = x

Furthermore, there are the alternative rounding functions from the math standard library:

operator operand
expression function type method type
math.trunc(_) do_trunc DoesTrunc __trunc__ CanTrunc[R = int]
math.floor(_) do_floor DoesFloor __floor__ CanFloor[R = int]
math.ceil(_) do_ceil DoesCeil __ceil__ CanCeil[R = int]

Almost all implementations use int for R. In fact, if no type for R is specified, it will default to int. But technically speaking, these methods can be made to return anything.
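
A sketch of such a custom return type (hypothetical Money class), where math.floor yields whole currency units:

```python
import math


class Money:
    """Hypothetical; its __floor__ is CanFloor[int]-shaped."""

    def __init__(self, cents: int) -> None:
        self.cents = cents

    def __floor__(self) -> int:
        # math.floor(Money(250)) -> whole units, i.e. 2
        return self.cents // 100


print(math.floor(Money(250)))  # 2
```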

Callables

Unlike operator, optype provides the operator for callable objects: optype.do_call(f, *args, **kwargs).

CanCall is similar to collections.abc.Callable, but is runtime-checkable, and doesn't use esoteric hacks.

operator operand
expression function type method type
_(*args, **kwargs) do_call DoesCall __call__ CanCall[**Pss, R]

Note

Pyright (and probably other typecheckers) tend to accept collections.abc.Callable in more places than optype.CanCall. This could be related to the lack of co/contra-variance specification for typing.ParamSpec (they should almost always be contravariant, but currently they can only be invariant).

In case you encounter such a situation, please open an issue about it, so we can investigate further.

Iteration

The operand of iter(_) is known within Python as an iterable, which is what collections.abc.Iterable[V] is often used for (e.g. as base class, or for instance checking).

The optype analogue is CanIter[R], which as the name suggests, also implements __iter__. But unlike Iterable[V], its type parameter R binds to the return type of iter(_) -> R. This makes it possible to annotate the specific type of the iterable that iter(_) returns. Iterable[V] is only able to annotate the type of the iterated value. To see why that isn't possible, see python/typing#548.

The collections.abc.Iterator[V] is even more awkward; it is a subtype of Iterable[V]. For those familiar with collections.abc this might come as a surprise, but an iterator only needs to implement __next__; __iter__ isn't needed. This means that Iterator[V] is unnecessarily restrictive. Apart from that being theoretically "ugly", it has significant performance implications, because the time-complexity of isinstance on a typing.Protocol is $O(n)$, with $n$ referring to the number of members. So even if the overhead of the inheritance and the abc.ABC usage is ignored, collections.abc.Iterator is twice as slow as it needs to be.

That's one of the (many) reasons that optype.CanNext[V] and optype.CanIter[R] are the better alternatives to Iterator and Iterable from the abracadabra collections. This is how they are defined:

operator operand
expression function type method type
next(_) do_next DoesNext __next__ CanNext[V]
iter(_) do_iter DoesIter __iter__ CanIter[R: CanNext[Any]]

For the sake of compatibility with collections.abc, there is optype.CanIterSelf[V], which is a protocol whose __iter__ returns typing.Self, as well as a __next__ method that returns V. I.e. it is equivalent to collections.abc.Iterator[V], but without the abc nonsense.
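
A runtime sketch of the point above: an object with only __next__ works with next(), and satisfies a hand-rolled, simplified CanNext check, even though it isn't a collections.abc.Iterator:

```python
from collections.abc import Iterator
from typing import Protocol, runtime_checkable


@runtime_checkable
class CanNext(Protocol):  # simplified stand-in for optype.CanNext
    def __next__(self): ...


class Countdown:
    """Has __next__ but deliberately no __iter__."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __next__(self) -> int:
        if self.n == 0:
            raise StopIteration
        self.n -= 1
        return self.n


print(next(Countdown(3)))                  # 2
print(isinstance(Countdown(3), CanNext))   # True
print(isinstance(Countdown(3), Iterator))  # False: no __iter__
```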

Awaitables

The optype equivalent is almost the same as collections.abc.Awaitable[R], except that optype.CanAwait[R] is a pure interface, whereas Awaitable is also an abstract base class (making it absolutely useless when writing stubs).

operator operand
expression method type
await _ __await__ CanAwait[R]

Async Iteration

Yes, you guessed it right; the abracadabra collections made the exact same mistakes for the async iterablors (or was it "iteramblers"...?).

But fret not; the optype alternatives are right here:

operator operand
expression function type method type
anext(_) do_anext DoesANext __anext__ CanANext[V]
aiter(_) do_aiter DoesAIter __aiter__ CanAIter[R: CanANext[Any]]

But wait, shouldn't V be a CanAwait? Well, only if you don't want to get fired... Technically speaking, __anext__ can return any type, and anext will pass it along without nagging (instance checks are slow, now stop bothering that liberal). For details, see the discussion at python/typeshed#7491. Just because something is legal, doesn't mean it's a good idea (don't eat the yellow snow).

Additionally, there is optype.CanAIterSelf[R], with both the __aiter__() -> Self and the __anext__() -> V methods.

Containers

operator operand
expression function type method type
len(_) do_len DoesLen __len__ CanLen[R: int = int]
_.__length_hint__() (docs) do_length_hint DoesLengthHint __length_hint__ CanLengthHint[R: int = int]
_[k] do_getitem DoesGetitem __getitem__ CanGetitem[K, V]
_.__missing__() (docs) do_missing DoesMissing __missing__ CanMissing[K, D]
_[k] = v do_setitem DoesSetitem __setitem__ CanSetitem[K, V]
del _[k] do_delitem DoesDelitem __delitem__ CanDelitem[K]
k in _ do_contains DoesContains __contains__ CanContains[K = object]
reversed(_) do_reversed DoesReversed __reversed__ CanReversed[R], or CanSequence[K: CanIndex, V]

Because CanMissing[K, D] generally doesn't show itself without CanGetitem[K, V] there to hold its hand, optype conveniently stitched them together as optype.CanGetMissing[K, V, D=V].

Similarly, there is optype.CanSequence[K: CanIndex | slice, V], which is the combination of both CanLen and CanGetitem[K, V], and serves as a more specific and flexible collections.abc.Sequence[V].
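
The dict builtin itself demonstrates the CanMissing/CanGetitem pairing: subclasses can define __missing__, which dict.__getitem__ calls for absent keys:

```python
class Defaulting(dict):
    """dict subclass; its __missing__ is CanMissing[str, str]-shaped."""

    def __missing__(self, key: str) -> str:
        # called by dict.__getitem__ when `key` is absent
        return f'<no {key}>'


d = Defaulting(a=1)
print(d['a'])  # 1
print(d['b'])  # <no b>
```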

Attributes

operator operand
expression function type method type
v = _.k or v = getattr(_, k) do_getattr DoesGetattr __getattr__ CanGetattr[K: str = str, V = Any]
_.k = v or setattr(_, k, v) do_setattr DoesSetattr __setattr__ CanSetattr[K: str = str, V = Any]
del _.k or delattr(_, k) do_delattr DoesDelattr __delattr__ CanDelattr[K: str = str]
dir(_) do_dir DoesDir __dir__ CanDir[R: CanIter[CanIterSelf[str]]]

Context managers

Support for the with statement.

operator operand
expression method(s) type(s)
__enter__ CanEnter[C], or CanEnterSelf
__exit__ CanExit[R = None]
with _ as c: __enter__ and __exit__ CanWith[C, R = None], or CanWithSelf[R = None]

CanEnterSelf and CanWithSelf are (runtime-checkable) aliases for CanEnter[Self] and CanWith[Self, R], respectively.
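
A minimal sketch of the CanEnterSelf/CanExit shapes (hypothetical Resource class):

```python
class Resource:
    """Hypothetical; __enter__ returns self, i.e. CanEnterSelf-shaped."""

    def __enter__(self) -> 'Resource':
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # CanExit[None]-shaped: returns None, so exceptions aren't suppressed
        pass


with Resource() as r:
    print(isinstance(r, Resource))  # True
```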

For the async with statement the interfaces look very similar:

operator operand
expression method(s) type(s)
__aenter__ CanAEnter[C], or CanAEnterSelf
__aexit__ CanAExit[R = None]
async with _ as c: __aenter__ and __aexit__ CanAsyncWith[C, R = None], or CanAsyncWithSelf[R = None]

Descriptors

Interfaces for descriptors.

operator operand
expression method type
v: V = T().d or vt: VT = T.d __get__ CanGet[T: object, V, VT = V]
T().k = v __set__ CanSet[T: object, V]
del T().k __delete__ CanDelete[T: object]
class T: d = _ __set_name__ CanSetName[T: object, N: str = str]
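
A compact sketch covering all four descriptor hooks (hypothetical Upper descriptor):

```python
class Upper:
    """Hypothetical descriptor using __set_name__/__get__/__set__/__delete__."""

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = '_' + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # T.d -> the descriptor itself (the VT case)
        return getattr(obj, self.name).upper()

    def __set__(self, obj, value: str) -> None:
        setattr(obj, self.name, value)

    def __delete__(self, obj) -> None:
        delattr(obj, self.name)


class T:
    d = Upper()


t = T()
t.d = 'hi'
print(t.d)  # HI
del t.d
```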

Buffer types

Interfaces for emulating buffer types using the buffer protocol.

operator operand
expression method type
v = memoryview(_) __buffer__ CanBuffer[T: int = int]
del v __release_buffer__ CanReleaseBuffer

Standard libs

copy

For the copy standard library, optype provides the following interfaces:

operator operand
expression method type
copy.copy(_) __copy__ CanCopy[R: object]
copy.deepcopy(_, memo={}) __deepcopy__ CanDeepcopy[R: object]
copy.replace(_, **changes: V) (Python 3.13+) __replace__ CanReplace[V, R]

And for convenience, there are the runtime-checkable aliases for all three interfaces, with R bound to Self. These are roughly equivalent to:

type CanCopySelf = CanCopy[CanCopySelf]
type CanDeepcopySelf = CanDeepcopy[CanDeepcopySelf]
type CanReplaceSelf[V] = CanReplace[V, CanReplaceSelf[V]]
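
A minimal __copy__ sketch (hypothetical Point class), whose shape matches CanCopy[Point] and hence CanCopySelf:

```python
import copy


class Point:
    """Hypothetical; __copy__ returns a new instance of the same type."""

    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y

    def __copy__(self) -> 'Point':
        # copy.copy(p) dispatches here
        return Point(self.x, self.y)


p = Point(1, 2)
q = copy.copy(p)
print((q.x, q.y), q is p)  # (1, 2) False
```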

pickle

For the pickle standard library, optype provides the following interfaces:

method(s) signature (bound) type
__reduce__ () -> R CanReduce[R: str | tuple = str | tuple]
__reduce_ex__ (CanIndex) -> R CanReduceEx[R: str | tuple = str | tuple]
__getstate__ () -> S CanGetstate[S: object]
__setstate__ (S) -> None CanSetstate[S: object]
__getnewargs__ and __new__ () -> tuple[*Vs] and (*Vs) -> Self CanGetnewargs[*Vs]
__getnewargs_ex__ and __new__ () -> tuple[tuple[*Vs], dict[str, V]] and (*Vs, **dict[str, V]) -> Self CanGetnewargsEx[*Vs, V]
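
A runnable sketch of the __getnewargs__/__new__ pairing (hypothetical Pair class); pickle stores the result of __getnewargs__ and replays it through __new__ when unpickling:

```python
class Pair(tuple):
    """Hypothetical immutable pair, reconstructible via __new__."""

    def __new__(cls, a, b):
        return super().__new__(cls, (a, b))

    def __getnewargs__(self):
        # CanGetnewargs-shaped: the positional args that reconstruct this object
        return (self[0], self[1])


p = Pair(1, 2)
args = p.__getnewargs__()      # what pickle stores
q = Pair.__new__(Pair, *args)  # what unpickling replays
print(q == p, type(q) is Pair)  # True True
```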

dataclasses

For the dataclasses standard library, optype provides the HasDataclassFields[V: Mapping[str, Field]] interface. It can conveniently be used to check whether a type or instance is a dataclass, i.e. isinstance(obj, optype.HasDataclassFields).
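
The check can be sketched with a simplified, hand-rolled stand-in; the assumption here is that the runtime check boils down to the presence of the __dataclass_fields__ attribute (the real HasDataclassFields is generic over the fields mapping):

```python
import dataclasses
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class HasDataclassFields(Protocol):
    """Simplified stand-in for optype.HasDataclassFields."""

    __dataclass_fields__: dict[str, Any]


@dataclasses.dataclass
class P:
    x: int


print(isinstance(P(1), HasDataclassFields))  # True
print(isinstance(42, HasDataclassFields))    # False
```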

NumPy

Optype supports both NumPy 1 and 2. The current minimum supported version is 1.24, following NEP 29 and SPEC 0.

When using optype.numpy, it is recommended to install optype with the numpy extra, ensuring version compatibility:

pip install "optype[numpy]"

Note

For the remainder of the optype.numpy docs, assume that the following import aliases are available.

from typing import Any, Literal
import numpy as np
import numpy.typing as npt
import optype.numpy as onp

For the sake of brevity and readability, the PEP 695 and PEP 696 type parameter syntax will be used, which is supported since Python 3.13.

Arrays

Array

Optype provides the generic onp.Array type alias for np.ndarray. It is similar to npt.NDArray, but includes two (optional) type parameters: one that matches the shape type (ND: tuple[int, ...]), and one that matches the scalar type (ST: np.generic). It is defined as:

type Array[
    ND: tuple[int, ...] = tuple[int, ...],
    ST: np.generic = Any,
] = np.ndarray[ND, np.dtype[ST]]

Note that the shape type parameter ND matches the type of np.ndarray.shape, and the scalar type parameter ST that of np.ndarray.dtype.type.

This way, a vector can be typed as Array[tuple[int]], and a $2 \times 2$ matrix of integers as Array[tuple[Literal[2], Literal[2]], np.integer[Any]].

AnyArray

Something that can be used to construct a numpy array is often referred to as an array-like object, usually annotated with npt.ArrayLike. But there are two main problems with npt.ArrayLike:

  1. Its name strongly suggests that it only applies to arrays. However, "0-dimensional" values are also included, i.e. "scalars" such as bool and complex, but also str, since numpy considers unicode- and bytestrings to be "scalars". So a: npt.ArrayLike = 'array lie' is a valid statement.
  2. There is no way to narrow the allowed scalar-types, since it's not generic. So instances of bytes and arrays of np.object_ are always included.

AnyArray[ND, ST, PT] doesn't have these problems, through its (optional) generic type parameters:

type AnyArray[
    # shape type
    ND: tuple[int, ...] = tuple[int, ...],
    # numpy scalar type
    ST: np.generic = np.generic,
    # Python builtin scalar type
    # (note that `complex` includes `bool | int | float`)
    PT: complex | str | bytes = complex | str | bytes,
]

Note

Unlike npt.ArrayLike, onp.AnyArray does not include the python scalars (PT) directly.

This makes it possible to correctly annotate e.g. a 1-d array-like of floats as a: onp.AnyArray[tuple[int], np.floating[Any], float].

CanArray*

type signature method purpose
CanArray[ND: tuple[int, ...] = ..., ST: np.generic = ...] __array__() Turning itself into a numpy array.
CanArrayFunction[F: CanCall[..., Any] = ..., R: object = ...] __array_function__() Similar to how T.__abs__() implements abs(T), but for arbitrary numpy callables (that aren't a ufunc).
CanArrayFinalize[T: object = ...] __array_finalize__() Converting the return value of a numpy function back into an instance of the foreign object.
CanArrayWrap __array_wrap__() Takes an np.ndarray instance, and "wraps" it into itself, or some ndarray subtype.
HasArray*

type attribute purpose
HasArrayInterface[V: Mapping[str, Any] = dict[str, Any]] __array_interface__ The array interface protocol (V3) is used for efficient data-buffer sharing between array-like objects, and bridges the numpy C-api with the Python side.
HasArrayPriority __array_priority__ In case an operation involves multiple sub-types, this value determines which one will be used as the output type.

Scalars

Optype considers the following numpy scalar types:

  • np.generic
    • np.bool_ (or np.bool with numpy >= 2)
    • np.object_
    • np.flexible
      • np.void
      • np.character
        • np.bytes_
        • np.str_
    • np.number[N: npt.NBitBase]
      • np.integer[N: npt.NBitBase]
        • np.unsignedinteger[N: npt.NBitBase]
          • np.ubyte
          • np.ushort
          • np.uintc
          • np.uintp
          • np.ulong
          • np.ulonglong
          • np.uint{8,16,32,64}
        • np.signedinteger[N: npt.NBitBase]
          • np.byte
          • np.short
          • np.intc
          • np.intp
          • np.long
          • np.longlong
          • np.int{8,16,32,64}
      • np.inexact[N: npt.NBitBase]
        • np.floating[N: npt.NBitBase]
          • np.half
          • np.single
          • np.double
          • np.longdouble
          • np.float{16,32,64}
        • np.complexfloating[N1: npt.NBitBase, N2: npt.NBitBase]
          • np.csingle
          • np.cdouble
          • np.clongdouble
          • np.complex{64,128}

See the docs for more info.

Scalar

The optype.numpy.Scalar interface is a generic runtime-checkable protocol, that can be seen as a "more specific" np.generic, both in name, and from a typing perspective. Its signature looks like

Scalar[
    # The "Python type", so that `Scalar.item() -> PT`.
    PT: object,
    # The "N-bits" type (without having to deal with `npt.NBitBase`).
    # It matches `Scalar.itemsize: NB`.
    NB: int = Any,
]

It can be used as e.g.

are_birds_real: Scalar[bool, Literal[1]] = np.bool_(True)
the_answer: Scalar[int, Literal[2]] = np.uint16(42)
fine_structure_constant: Scalar[float, Literal[8]] = np.float64(1) / 137

Note

The second type argument for itemsize can be omitted, which is equivalent to setting it to Any.

Any*Value

For every (standard) numpy scalar type (i.e. subtypes of np.generic), there is the optype.numpy.Any{}Value alias (where {} should be replaced with the title-cased name of the scalar, without potential trailing underscore).

So for np.bool_ there's onp.AnyBoolValue, for np.uint8 there's onp.AnyUInt8Value, and for np.floating[N: npt.NBitBase] there's onp.AnyFloatingValue[N: npt.NBitBase].

Note

The extended-precision scalar types (e.g. np.int128, np.float96 and np.complex512) are not included, because their availability is platform-dependent.

When a value of type Any{}Value is passed to e.g. np.array, the resulting np.ndarray will have a scalar type that matches the corresponding Any{}Value. For instance, passing x: onp.AnyFloat64Value as np.array(x) returns an array of type onp.Array[tuple[()], np.float64] (where tuple[()] implies that its shape is ()).

Each Any{}Value contains at least the relevant np.generic subtype, zero or more ctypes types, and zero or more Python builtin types.

So for instance type AnyUInt8Value = np.uint8 | ct.c_uint8, and type AnyCDoubleValue = np.cdouble | complex.

Any*Type

In the same way as Any*Value, there's an Any*Type for each of the numpy scalar types.

These type aliases describe what's allowed to be passed to e.g. the np.dtype[ST: np.generic] constructor, so that its scalar type ST matches the one corresponding to the passed Any*Type.

So for example, if some x: onp.AnyUInt8Type is passed to np.dtype(x), then the resulting type will be np.dtype[np.uint8].

This is useful when annotating an (e.g. numpy) function with a dtype parameter, e.g. np.arange. Then by using a @typing.overload for each of the allowed scalar types, it's possible to annotate it in the most specific way that's possible, whilst keeping the code readable and maintainable.

Data type objects

In NumPy, a dtype (data type) object is an instance of the numpy.dtype[ST: np.generic] type. It's commonly used to convey metadata of a scalar type, e.g. within arrays.

DType

Because the type parameter of np.dtype isn't optional, it could be more convenient to use the alias optype.numpy.DType, which is defined as:

type DType[ST: np.generic = Any] = np.dtype[ST]

Apart from the "CamelCase" name, the only difference with np.dtype is that the type parameter can be omitted, in which case it's equivalent to np.dtype[np.generic], but shorter.

HasDType

Many of numpy's public functions accept an (optional) dtype argument. But here, the term "dtype" has a broader meaning, as it also accepts (a subtype of) np.generic. Additionally, any instance with a dtype: DType attribute is accepted. The runtime-checkable interface for this is optype.numpy.HasDType, which is roughly equivalent to the following definition:

@runtime_checkable
class HasDType[DT: DType = DType](Protocol):
    dtype: Final[DT]

Since np.ndarray has a dtype attribute, it is a subtype of HasDType:

>>> isinstance(np.array([42]), onp.HasDType)
True

AnyDType

All types that can be passed to the np.dtype constructor, as well as the types of most dtype function parameters, are encapsulated within the optype.numpy.AnyDType alias, i.e.:

type AnyDType[ST: np.generic = Any] = type[ST] | DType[ST] | HasDType[DType[ST]]

Note

NumPy's own numpy.typing.DTypeLike alias serves the same purpose as AnyDType. But npt.DTypeLike has several issues:

  • It's not generic (accepts no type parameter(s)), and cannot be narrowed to allow for specific scalar types. Even though most functions don't accept all possible scalar- and dtypes.
  • Its definition is maximally broad, e.g. type[Any], and str are included in its union. So given some arbitrary function parameter dtype: npt.DTypeLike, passing e.g. dtype="Ceci n'est pas une dtype" won't look like anything out of the ordinary for your type checker.

These issues aren't the case for optype.numpy.AnyDType. However, it (currently) isn't possible to pass scalar char-codes (e.g. dtype='f8') or builtin python types (e.g. dtype=int) directly. If you really want to do so anyway, then just pass it to the np.dtype() constructor, e.g. np.arange(42, dtype=np.dtype('f8')).

Universal functions

A large portion of numpy's public API consists of universal functions, i.e. (callable) instances of np.ufunc.

Tip

Custom ufuncs can be created using np.frompyfunc, but also through a user-defined class that implements the required attributes and methods (i.e., duck typing).

AnyUFunc

But np.ufunc has a big issue; it accepts no type parameters. This makes it very difficult to properly annotate its callable signature and its literal attributes (e.g. .nin and .identity).

This is where optype.numpy.AnyUFunc comes into play: It's a runtime-checkable generic typing protocol, that has been thoroughly type- and unit-tested to ensure compatibility with all of numpy's ufunc definitions. Its generic type signature looks roughly like:

AnyUFunc[
    # The type of the (bound) `__call__` method.
    Fn: Callable[..., Any] = Any,
    # The types of the `nin` and `nout` (readonly) attributes.
    # Within numpy these match either `Literal[1]` or `Literal[2]`.
    Nin: int = Any,
    Nout: int = Any,
    # The type of the `signature` (readonly) attribute;
    # Must be `None` unless this is a generalized ufunc (gufunc), e.g.
    # `np.matmul`.
    Sig: str | None = Any,
    # The type of the `identity` (readonly) attribute (used in `.reduce`).
    # Unless `Nin: Literal[2]`, `Nout: Literal[1]`, and `Sig: None`,
    # this should always be `None`.
    # Note that `complex` also includes `bool | int | float`.
    Id: complex | str | bytes | None = Any,
]

Note

Unfortunately, the extra callable methods of np.ufunc (at, reduce, reduceat, accumulate, and outer), are incorrectly annotated (as None attributes, even though at runtime they're methods that raise a ValueError when called). This currently makes it impossible to properly type these in optype.numpy.AnyUFunc; doing so would make it incompatible with numpy's ufuncs.

CanArrayUFunc

When ufuncs are called on some inputs, the ufunc will call the __array_ufunc__ method of those inputs, which is not unlike how abs() calls __abs__.

With optype.numpy.CanArrayUFunc, it becomes straightforward to annotate potential arguments to np.ufunc. It's a single-method runtime-checkable protocol, whose type signature looks roughly like:

CanArrayUFunc[
    Fn: AnyUFunc = Any,
]

Note

Due to the previously mentioned typing limitations of np.ufunc, the *args and **kwargs of CanArrayUFunc.__array_ufunc__ are currently impossible to properly annotate.