Higher-Kinded TypeVars
tek opened this issue · comments
aka type constructors, generic TypeVars
Has there already been discussion about those?
I do a lot of FP that results in impossible situations because of this. Consider an example:
A = TypeVar('A')
B = TypeVar('B')
F = TypeVar('F')
class M(Generic[F[X], A]):
    def x(fa: F[A], f: Callable[[A], B]) -> F[B]:
        return map(f, fa)
M().x([1], str)
I haven't found a way to make this work, does anyone know a trick or is it impossible?
If not, consider the syntax as a proposal.
Reference implementations would be Haskell, Scala.
optimally, the HK's type param would be indexable as well, allowing for F[X[X, X], X[X]]
Summary of current status (by @smheidrich, 2024-02-08):
- @JelleZijlstra has indicated interest in sponsoring a PEP, conditional on a prototype implementation in a major type checker and a well-specified draft PEP.
- Drafting the PEP takes place in @nekitdev's fork of the peps repo. The stub PEP draft so far contains a few examples of the proposed syntax.
- That same repo's GitHub Discussions forum has been designated as the place to discuss the PEP (and presumably the prototype implementation?). Some limited further discussions have taken place there.
- If you want to be notified of new discussion threads, I think you have to set the whole repo as "watched" in GitHub?
I think this came up few times in other discussions, for example one use case is python/mypy#4395. But TBH this is low priority, since such use cases are quite rare.
damn, I searched very thoroughly but did not find this one! 😄
So, consider this my +1!
are you approving the feature?
awfully pragmatic. Where's your sense of adventure? 😄
anyways, I'll work on it, though it's gonna take a while to get into the project.
Similar to microsoft/TypeScript#1213
Not sure if the discussion over there provides any useful insights to the effort over here.
Hi @tek. I'm also very interested in this, so I'd like to ask if you had any progress with this and volunteer to help if you want.
@rcalsaverini sorry, I've been migrating my legacy code to haskell and am abandoning python altogether. but I wish you great success!
Oh, sad to hear but I see your point. Thanks.
Just to add another use case (which I think relates to this issue):
Using Literal types along with overloading __new__, along with higher-kinded typevars, could allow implementing a generic "nullable" ORM Field class, using a descriptor to provide access to the appropriate nullable-or-not field values. The descriptor wouldn't have to be reimplemented in subclasses.
It is one step closer to being possible due to the most recent mypy release's support for honoring the return type of __new__ (python/mypy#1020).
Note: this is basically a stripped-down version of Django's Field class:
# in stub file
from typing import Generic, Optional, TypeVar, Union, overload, Type
from typing_extensions import Literal
_T = TypeVar("_T", bound="Field")
_GT = TypeVar("_GT")
class Field(Generic[_GT]):
    # on the line after the overload: error: Type variable "_T" used with arguments
    @overload
    def __new__(cls: Type[_T], null: Literal[False] = False, *args, **kwargs) -> _T[_GT]: ...
    @overload
    def __new__(cls: Type[_T], null: Literal[True], *args, **kwargs) -> _T[Optional[_GT]]: ...
    def __get__(self, instance, owner) -> _GT: ...
class CharField(Field[str]): ...
class IntegerField(Field[int]): ...
# etc...
# in code
class User:
    f1 = CharField(null=False)
    f2 = CharField(null=True)
reveal_type(User().f1) # Expected: str
reveal_type(User().f2) # Expected: Union[str, None]
I wonder if this is what I need or if there's currently a work around for my (slightly simpler) case?:
I'm building an async redis client with proper type hints. I have a "Commands" class with methods for all redis commands (get, set, exists, strlen ... and hundreds more). Normally each of those methods should return a future (actually coroutine) to the result, but in pipeline mode they should all return None - the commands are added to the pipeline to be executed later.
This is easy enough to implement in python, but not so easy to type hint correctly.
Basic example:
class Redis:
    def execute(self, command) -> Coroutine[Any, Any, Union[None, str, int, float]]:
        return self.connection.execute(...)
    def get(self, *args) -> Coroutine[Any, Any, str]:
        ...
        return self.execute(command)
    def set(self, *args) -> Coroutine[Any, Any, None]:
        ...
        return self.execute(command)
    def exists(self, *args) -> Coroutine[Any, Any, bool]:
        ...
        return self.execute(command)
    # ... and many MANY more ...
class RedisPipeline(Redis):
    def execute(self, command) -> None:
        self.pipeline.append(command)
I tried numerous options to make Coroutine[Any, Any, xxx] generic, but nothing seems to work.
Is there any way around this with python 3.8 and latest mypy? If not a solution would be wonderful - as far as I can think, my only other route for proper types is a script which copy and pastes the entire class and changes the return types in code.
@samuelcolvin I don't think this question belongs in this issue. The reason for the failure (knowing nothing about Redis but going purely by the code you posted) is that in order to make this work, the base class needs to switch to an Optional result, i.e.
def execute(self, command) -> Optional[Coroutine[Any, Any, Union[None, str, int, float]]]:
I get that, but I need all the public methods to definitely return a coroutine. Otherwise, if it returned an optional coroutine, it would be extremely annoying to use.
What I'm trying to do is modify the return type of many methods on the sub-classes, including "higher kind" types which are parameterised.
Hence thinking it related to this issue.
Honestly I have no idea what higher-kinded type vars are -- my eyes glaze over when I hear that kind of talk. :-)
I have one more suggestion, then you're on your own. Use a common base class that has an Optional[Coroutine[...]] return type and derive both the regular Redis class and the RedisPipeline class from it.
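The suggestion above can be sketched roughly as follows. This is only an illustration: `BaseCommands`, the `_send` coroutine and the list-backed pipeline are hypothetical names made up here, not taken from the actual client.

```python
import asyncio
from typing import Any, Coroutine, List, Optional, Union

Result = Union[None, str, int, float]


class BaseCommands:
    """Hypothetical common base: execute() may or may not return a coroutine."""

    def execute(self, command: str) -> Optional[Coroutine[Any, Any, Result]]:
        raise NotImplementedError

    def get(self, key: str) -> Optional[Coroutine[Any, Any, Result]]:
        # The command-building logic is written once, in the base class.
        return self.execute(f"GET {key}")


class Redis(BaseCommands):
    async def _send(self, command: str) -> Result:
        # Stand-in for real network I/O (assumption for the sketch).
        return f"reply to {command!r}"

    def execute(self, command: str) -> Coroutine[Any, Any, Result]:
        # Narrowing the return type in a subclass is allowed.
        return self._send(command)


class RedisPipeline(BaseCommands):
    def __init__(self) -> None:
        self.pipeline: List[str] = []

    def execute(self, command: str) -> None:
        # Queue the command; there is nothing to await.
        self.pipeline.append(command)


pipe = RedisPipeline()
pipe.get("k")  # queued only
reply = asyncio.run(Redis().execute("GET k"))
```

Note the limitation: only `execute` is narrowed. The inherited `get` is still annotated as returning an optional coroutine on both subclasses, so callers of `Redis.get` would have to handle `None` as far as a type checker is concerned.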
Okay, so the simple answer is that what I'm trying to do isn't possible with python types right now.
Thanks for helping - at least I can stop my search.
humm, but the example above under "Basic example" I would argue IS type-safe.
All the methods which end return self.execute(...) return what execute returns - either a Coroutine or None.
Thus I don't see how this is any more "unsafe" than normal use of generics.
@gvanrossum, I can relate!
I wonder if bidict provides a practical example of how this issue prevents expressing a type that you can actually imagine yourself needing.
>>> element_by_atomicnum = bidict({1: "hydrogen", 2: "helium"})
>>> reveal_type(element_by_atomicnum) # bidict[int, str]
# So far so good, but now consider the inverse:
>>> element_by_atomicnum.inverse
bidict({"hydrogen": 1, "helium": 2})
What we want is for mypy to know this:
>>> reveal_type(element_by_atomicnum.inverse) # bidict[str, int]
merely from a type hint that we could add to a super class. It would parameterize not just the key type and the value type, but also the self type. In other words, something like:
KT = TypeVar('KT')
VT = TypeVar('VT')
class BidirectionalMapping(Mapping[KT, VT]):
    ...
    def inverse(self) -> $SELF_TYPE[VT, KT]:
        ...
where $SELF_TYPE would of course use some actually legal syntax that allowed composing the self type with the other parameterized types.
Okay, I think that example is helpful. I recreated it somewhat simpler (skipping the inheritance from Mapping and the property decorators):
from abc import abstractmethod
from typing import *
T = TypeVar('T')
KT = TypeVar('KT')
VT = TypeVar('VT')
class BidirectionalMapping(Generic[KT, VT]):
    @abstractmethod
    def inverse(self) -> BidirectionalMapping[VT, KT]:
        ...
class bidict(BidirectionalMapping[KT, VT]):
    def __init__(self, key: KT, val: VT):
        self.key = key
        self.val = val
    def inverse(self) -> bidict[VT, KT]:
        return bidict(self.val, self.key)
b = bidict(3, "abc")
reveal_type(b) # bidict[int, str]
reveal_type(b.inverse()) # bidict[str, int]
This passes but IIUC you want the ABC to have a more powerful type. I guess here we might want to write it as
def inverse(self: T) -> T[VT, KT]: # E: Type variable "T" used with arguments
Have I got that?
Exactly! It should be possible to e.g. subclass bidict (without overriding inverse), and have mypy realize that calling inverse on the subclass gives an instance of the subclass (with the key and value types swapped as well).
This isn’t only hypothetically useful, it’d really be useful in practice for the various subclasses in the bidict library where this actually happens (frozenbidict, OrderedBidict, etc.).
Glad this example was helpful! Please let me know if there’s anything further I can do to help here, and (can’t help myself) thanks for creating Python, it’s such a joy to use.
Ah, so the @abstractmethod is also a red herring.
And now I finally get the connection with the comment that started this issue.
But I still don't get the connection with @samuelcolvin's RedisPipeline class. :-(
I would also say that this example is really simple, common, but not supported:
def create(klass: Type[T], value: K) -> T[K]:
    return klass(value)
We use quite a lot of similar constructs in dry-python/returns.
As a workaround I am trying to build a plugin with emulated HKT, just like in some other languages where support of it is limited. Like:
- Swift and bow: https://bow-swift.io/docs/fp-concepts/higher-kinded-types/
- TypeScript and fp-ts: https://github.com/gcanti/fp-ts/blob/master/src/HKT.ts
Paper on "Lightweight higher-kinded polymorphism": https://www.cl.cam.ac.uk/~jdy22/papers/lightweight-higher-kinded-polymorphism.pdf
TLDR: So, instead of writing T[K] we can emulate this by using HKT[T, K], where HKT is a basic generic instance processed by a custom mypy plugin. I have been working on this plugin for some time now, but there's still nothing to show. You can track the progress here: https://pypi.org/project/kinds/ (part of dry-python libraries)
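To make the idea concrete, here is a minimal sketch of the emulation. The names `Kind1`, `Box` and `fmap` are invented for this illustration and are not the API of `kinds` or `returns`; the sketch only shows how the type constructor and its argument can travel as two ordinary type parameters.

```python
from typing import Callable, Generic, TypeVar

F = TypeVar("F")  # stands for the "constructor", e.g. Box
A = TypeVar("A")
B = TypeVar("B")


class Kind1(Generic[F, A]):
    """Hypothetical stand-in for the inexpressible F[A]."""


class Box(Kind1["Box", A]):
    """A container that declares itself as Kind1[Box, A]."""

    def __init__(self, value: A) -> None:
        self.value = value

    def map(self, f: Callable[[A], B]) -> "Box[B]":
        return Box(f(self.value))


def fmap(fa: Kind1[F, A], f: Callable[[A], B]) -> Kind1[F, B]:
    # A checker sees Kind1[F, B] here; Kind1 itself has no map(), hence the ignore.
    return fa.map(f)  # type: ignore[attr-defined]


boxed = fmap(Box(1), str)
```

Without extra tooling, a type checker infers `Kind1[Box[Any], str]` (or similar) for `boxed` rather than `Box[str]`. Translating the emulated form back into the concrete type is presumably the part that needs the custom plugin.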
But I still don't get the connection with @samuelcolvin's RedisPipeline class. :-(
Sorry if I wasn't clear. I'll try again to explain:
I have a class with many (~200) methods, they all return coroutines with different result types (None, bytes, str, int or float). I want a subclass with the same internal logic but where all those methods return None - I can do this in python, but not with type hints currently (here's the actual code if it helps)
So roughly I want:
T = TypeVar('T', bytes, str, int, float, 'None')
Result = Coroutine[Any, Any, T]
class Foo:
    def method_1(self, *args) -> Result[str]:
        ...
    def method_2(self, *args) -> Result[None]:
        ...
    def method_3(self, *args) -> Result[bool]:
        ...
    ...
    ...
    def method_200(self, *args) -> Result[int]:
        ...
class Bar:
    def method_1(self, *args) -> None:
        ...
    def method_2(self, *args) -> None:
        ...
    def method_3(self, *args) -> None:
        ...
    ...
    ...
    def method_200(self, *args) -> None:
        ...
Except I don't want to have to redefine all the methods on Bar. Assuming I could create my own AlwaysNone type which even when parameterised told mypy the result would always be None
class AlwaysNoneMeta(type):
    def __getitem__(self, item) -> None:
        return None
class AlwaysNone(metaclass=AlwaysNoneMeta):
    pass
I think the feature requested here could solve my problem.
In other words if mypy could understand
def generate_cls(OuterType):
    class TheClass:
        def method_1(self, *args) -> OuterType[str]:
            ...
        ...
    return TheClass
Foo = generate_cls(Result)
Bar = generate_cls(AlwaysNone)
I'd be in the money, but quite understandably it can't.
I've currently got a working solution where I generate a .pyi stub file with definitions for all these methods but changing the return type to None (see here) so I'm not in immediate need of this anymore.
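For readers who want to try the stub-generation route, here is one possible way to do it. It is a guess at the general approach, not the script referred to above: `Foo`, its two methods and `pipeline_stub` are all hypothetical.

```python
import inspect


class Foo:
    """Hypothetical command class with coroutine-returning methods."""

    async def method_1(self, key: str) -> str:
        return key

    async def method_2(self, key: str, value: int) -> None:
        return None


def pipeline_stub(source: type, name: str) -> str:
    """Render .pyi-style text for a class whose public methods all return None."""
    lines = [f"class {name}:"]
    for attr, func in inspect.getmembers(source, inspect.isfunction):
        if attr.startswith("_"):
            continue  # skip private helpers and dunder methods
        # Keep the parameters, swap only the return annotation.
        sig = inspect.signature(func).replace(return_annotation=None)
        lines.append(f"    def {attr}{sig}: ...")
    return "\n".join(lines)


stub_text = pipeline_stub(Foo, "Bar")
```

Writing `stub_text` to a `.pyi` file next to the module lets the checker see the `None`-returning signatures while the runtime class keeps its shared implementation.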
It looks like you want completely different signatures whose likeness is limited to their names and arguments:
async def do_work(redis_client: Redis):
    await redis_client.get(...)
do_work(RedisPipeline()) # user expects type checker to warn about misuse
It's understandable you're looking to reuse some common "template", but I fail to see what it has to do with higher-kinded types.
Hi @Kentzo, that's incorrect, these are not completely different signatures.
Please review the code here; as you can see these are functions which may return an awaitable object, not coroutines. This approach and similar, while not necessarily obvious to beginners, is relatively common in libraries which make extensive use of asyncio and the standard library asyncio code.
If you're not sure how it works, feel free to submit an issue on that project and I'll endeavour to explain it to you.
In your initial example:
class Redis:
    def execute(self, command) -> Coroutine[Any, Any, Union[None, str, int, float]]:
        ...
class RedisPipeline(Redis):
    def execute(self, command) -> None:
        ...
These are different signatures. Functions annotated to accept Redis as an argument would expect execute to return a coroutine.
This is off-topic, please create an issue on async-redis if you want to discuss this more.
In short, yes they're different signatures, but I think my explanation above gives enough detail on what I'm doing and why I think it relates to this issue.
+1 on this. It's hard to say the use cases are rare when it's not even possible to express... There may be countless use cases hiding behind duck typing that simply can't be checked by the type system and therefore certainly aren't recognized as such. Having a type safe generic "Mappable" class as shown in the OP would be really useful.
It's hard to say the use cases are rare when it's not even possible to express
That makes no sense. You can express the OP's example just fine in untyped Python, you just can't express the types correctly. So if there were "countless" real examples we would encounter frequent user requests for type system extensions to allow typing this correctly. I haven't seen a lot of those.
Do you have an actual (non-toy) example?
You actually can have Higher Kinded Types in Python with full mypy support.
Here's how Mappable from the OP looks:
- interface Mappable aka Functor: https://github.com/dry-python/returns/blob/master/returns/interfaces/mappable.py
- tests for it: https://github.com/dry-python/returns/blob/master/typesafety/test_interfaces/test_mappable/test_inheritance.yml
- real type implementing this interface: https://github.com/dry-python/returns/blob/f144affb73da17f5efbcb244507632452df92d24/returns/result.py#L89
@sobolevn That's awesome! Thank you!
@gvanrossum Inexpressible in the type system is what I was referring to. My point was that writing this off as rare seems premature since Python's types system is not a hard stop on your ability to keep going. So it might be actually quite common and how would we know? Certainly you're right that it's not a common request though which is a good point and slightly different than mine. However, it's all moot now because it can be done! Awesome!
Do you have an actual (non-toy) example?
An example would be the dict.fromkeys(...) method, as the output type is variable with respect to both the dict (sub-)class as well as the key type (see python/typeshed#3800). This example can, admittedly, be more or less resolved by explicitly re-annotating the aforementioned method in every single subclass.
Another, less easily resolved example would be the np.asanyarray(a, dtype=..., ...) function in numpy. Per the documentation: "Convert the input to an ndarray, but pass ndarray subclasses through."
What this means is that, similar to dict.fromkeys(...), the output is variable with respect to both any passed ndarray subclasses (the a parameter) as well as the type of the embedded scalars (the dtype parameter).
Examples
In [1]: import numpy as np
In [2]: class TestArray(np.ndarray):
...: ...
...:
In [3]: array: TestArray = np.array([0, 1, 2]).view(TestArray)
In [4]: array # array of integers
Out[4]: TestArray([0, 1, 2])
In [5]: np.asanyarray(array, dtype=float) # array of floats
Out[5]: TestArray([0., 1., 2.])
Ran into this today:
from typing import Any, Generic, Mapping, TypeVar
ProblemState = TypeVar('ProblemState')
Action = TypeVar('Action')
class Problem(Generic[ProblemState, Action]):
    pass
ProblemT = TypeVar('ProblemT', bound=Problem[ProblemState, Action])
class Solution(Generic[ProblemT, ProblemState, Action]):
    pass
I could have worked around it with an intersection operator:
ProblemU = TypeVar('ProblemU', bound=Problem[Any, Any])
class Solution(Generic[ProblemU, ProblemState, Action]):
    ProblemT = Intersection[ProblemU, Problem[ProblemState, Action]]
Would intersection help in the general case without requiring as much effort?
Just ran into this limitation when trying to type methods of a generic class that return an instance of the same generic class but with other arguments. This is actually very common for classes that can easily have some properties changed, like numpy arrays or torch tensors and their device (gpu or cpu) or dtype (int, float, ...):
DeviceTransferableT = TypeVar("DeviceTransferableT", bound="DeviceTransferable")
class DeviceTransferable(Generic[DeviceT]):
    def to_device(self: DeviceTransferableT, device: Type[DeviceT]) -> DeviceTransferableT[DeviceT]: # error!
        ... # some great implementation
class SomeOtherClass(DeviceTransferable):
    """Would prefer not to re-implement `to_device` in every inheriting class."""
Is there any other solution?
@florensacc Can you expand the example so that I can actually run it by mypy? A few things are missing and I can’t guess what their definitions should be.
Here's a full example, which throws two errors: error: Type variable "DeviceTransferableT" used with arguments and Argument 1 to "f" has incompatible type "SomeOtherClass[Device0]"; expected "SomeOtherClass[Device1]".
import attr
import abc
from typing import Type, TypeVar, Generic, ClassVar
class Device(abc.ABC):
    device_id: ClassVar[int]
class Device0(Device):
    device_id = 0
class Device1(Device):
    device_id = 1
DeviceT = TypeVar("DeviceT", bound=Device)
AnotherDeviceT = TypeVar("AnotherDeviceT", bound=Device)
DeviceTransferableT = TypeVar("DeviceTransferableT", bound="DeviceTransferable")
@attr.s()
class DeviceTransferable(Generic[DeviceT]):
    device: DeviceT = attr.ib()
    def to_device(self: DeviceTransferableT, device: Type[AnotherDeviceT]) -> DeviceTransferableT[AnotherDeviceT]:
        return attr.evolve(self, device=device)
@attr.s()
class SomeOtherClass(DeviceTransferable[DeviceT]):
    """Would prefer not to re-implement `to_device` in every inheriting class."""
    pass
C_in0 = SomeOtherClass(device=Device0())
C_in1 = C_in0.to_device(Device1)
def f(c: SomeOtherClass[Device1]) -> None:
    print(f"The device of c is {c.device}, but its type is still SomeOtherClass[Device0]!")
f(C_in1)
Oh, @attr.s :-(
Can you do it without?
sure, you can remove all attr.s stuff and use this class definition:
class DeviceTransferable(Generic[DeviceT]):
    def __init__(self, device: DeviceT):
        self.device = device
    def to_device(self: DeviceTransferableT, device: Type[AnotherDeviceT]) -> DeviceTransferableT[AnotherDeviceT]:
        return type(self)(device=device)
without the "argument" to the TypeVar this code runs happily and prints my message:
The device of c is <class '__main__.Device1'>, but its type is still SomeOtherClass[Device0]!
Okay, I get it. If/when we fix this we'll make sure to have a test like that. Thanks!
We have released a first prototype of Higher Kinded Types emulation for Python.
It is available for everyone to try! Quick demo: https://gist.github.com/sobolevn/7f8ffd885aec70e55dd47928a1fb3e61
Here's what a function's signature will look like:
from returns.primitives.hkt import Kind1, kinded
_InstanceKind = TypeVar('_InstanceKind', bound='HasValue')
@kinded
def apply_function(
    instance: Kind1[_InstanceKind, _ValueType],
    callback: Callable[[_ValueType], _NewValueType],
) -> Kind1[_InstanceKind, _NewValueType]:
    ...
Source code: https://github.com/dry-python/returns
Docs: https://returns.readthedocs.io/en/latest/pages/hkt.html and https://sobolevn.me/2020/10/higher-kinded-types-in-python
See ceph/ceph#38953 for a real world use case
Finally found this issue after running into TypeError: 'TypeVar' object is not subscriptable and searching around for the right way to do this. Turns out there isn't. Has there been any change in status for the implementation? The use-case might be limited for the annotations themselves, but if utilized in a couple core libraries, the end-user experience could be greatly improved by the resulting hints in IDEs.
I empathize with @tek, but unfortunately can't leave Python for the strongly-typed shores of Scala or Haskell, due to Python's ML ecosystem and relatively easy learning curve.
Hm, I believe I've finally run into this. Here's my example, let me know if I'm misunderstanding something.
Let's say you're adding static types to a Mongo ODM/ORM-type framework. You define your model using a class.
@dataclass
class User:
    __collection__ = "users" # This is the "table", or collection in Mongo. Class variable, not instance
    _id: ObjectId # Every Mongo document has an ID, most often ObjectId, but can be maybe str or something else
    username: str # A random property
Now you write a function to query for this model. We write a generic protocol for models, parametrizing by the ID type:
ID = TypeVar("ID")
class MongoModel(Protocol[ID]):
    __collection__: str
    @property
    def _id(self) -> ID:
        ...
Our User class is thus a MongoModel. Now the actual querying function. We can query just passing in the ID.
C = TypeVar("C", bound=MongoModel)
def find_one(model_cls: Type[C], id: ???) -> C: ...
There's nothing to put instead of ???, is there?
My first instinct was to try:
C = TypeVar("C", bound=MongoModel)
TID = TypeVar("TID")
def find_one(model_cls: Type[C[TID]], id: TID) -> C: ...
But no dice, obviously.
Second approach: I hoped Mypy would realize MongoModel already references ID (since it's defined with it), so I can write this:
C = TypeVar("C", bound=MongoModel)
def find_one(model_cls: Type[C], id: ID) -> C: ...
But no, Mypy treats them as two separate type variables (in other words, I can pass in an int and it doesn't catch it).
And C cannot be changed to TypeVar("C", bound=MongoModel[ID]) either.
Just to extend the list of people encountering it, I also ran into this. My example is similar to the ones above in spirit but aggravated by extraction of the generic class at runtime, which is possible since python 3.8. I am not sure whether it is officially recommended but I really like this mechanism :). Unfortunately, using it will very often require subscriptable TypeVars, like in the sketch below.
For simplicity I slightly changed the real signature. In real use cases I load the registry from a file and not from prepared list, so polymorphism makes a bit more sense there.
class SettingsRegistry(Generic[TSettings], ABC):
    @classmethod
    def from_serialized_settings(cls: Type[TSettingsRegistry], serialized_settings: List[str]) -> TSettingsRegistry[TSettings]:
        settings_class = get_args(cls.__orig_bases__[0])[0]
        deserialized_settings = [settings_class.deserialize(setting) for setting in serialized_settings]
        return cls(deserialized_settings)
This way I can avoid code duplication but typing TSettingsRegistry[TSettings] is currently disallowed, of course.
Just another use for this: use the new TypeGuard to check the types of a collection:
T = TypeVar("T", bound=Collection)
V = TypeVar("V")
def all_elements_type(
    collection: T[object],
    element_type: Type[V],
) -> TypeGuard[T[V]]:
    return all(isinstance(t, element_type) for t in collection)
This is very useful and currently not possible AFAIK.
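For contrast, what can be written today is one guard per concrete container. A small sketch, assuming a hypothetical helper name `is_list_of` and a checker that supports `TypeGuard` (PEP 647):

```python
from typing import TYPE_CHECKING, List, Type, TypeVar

if TYPE_CHECKING:
    # Only needed by the type checker; the annotation below is a string.
    from typing import TypeGuard

V = TypeVar("V")


def is_list_of(values: List[object], element_type: Type[V]) -> "TypeGuard[List[V]]":
    """Narrow List[object] to List[V]; the container type is hard-coded."""
    return all(isinstance(v, element_type) for v in values)
```

The same function would have to be repeated for `set`, `tuple`, `deque` and so on, which is the duplication a subscriptable `T` would remove.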
Another potential use case example for HKTs:
https://toolz.readthedocs.io/en/latest/_modules/toolz/dicttoolz.html#valmap
I've made my own typed version of it, like so:
K = TypeVar("K")
V = TypeVar("V")
M = TypeVar("M")
def valmap(
    func: Callable[[V], M],
    d: Mapping[K, V],
    factory: Type[MutableMapping[K, M]] = dict,
) -> MutableMapping[K, M]:
    rv = factory()
    rv.update(zip(d.keys(), map(func, d.values())))
    return rv
but in order to preserve type information, I would much rather want to type it as:
K = TypeVar("K")
V = TypeVar("V")
M = TypeVar("M")
F = TypeVar("F", bound=MutableMapping)
def valmap(
    func: Callable[[V], M],
    d: Mapping[K, V],
    factory: Type[F] = dict,
) -> F[K, M]:
    rv = factory()
    rv.update(zip(d.keys(), map(func, d.values())))
    return rv
@gvanrossum makes the point that this isn't a commonly requested feature. I think that is correct to some extent, although the level of activity in this thread is making me slightly reconsider. I think it isn't commonly requested because it is a fairly advanced technique. But, I think my example, as well as many other examples in this thread are showing that fundamental library code in fairly prevalent use would be able to benefit from HKT in the sense that annotations would more accurately describe actual behavior. Often times this library code is maintained by skilled programmers that could leverage HKT if available. This is a benefit that then trickles down to the wider python ecosystem, including users that have never heard of HKT, but might leverage type checkers. You don't need to be a wizard to enjoy more accurate lib function call return types.
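One partial workaround for the valmap case, sketched here under assumptions: the function name `valmap_into` is hypothetical, and the factory is typed as a zero-argument callable returning an already-parameterized mapping. The container class is then preserved in the return type, but the link between `func`'s result type and the mapping's value type is no longer checked.

```python
from collections import OrderedDict
from typing import Any, Callable, Mapping, MutableMapping, TypeVar

K = TypeVar("K")
V = TypeVar("V")
M = TypeVar("M")
F = TypeVar("F", bound=MutableMapping[Any, Any])


def valmap_into(func: Callable[[V], M], d: Mapping[K, V], factory: Callable[[], F]) -> F:
    """Apply func to every value of d, filling the mapping built by factory."""
    rv = factory()
    for key, value in d.items():
        rv[key] = func(value)
    return rv


converted = valmap_into(str, {"a": 1, "b": 2}, OrderedDict)
```

Because `F` is bound to `MutableMapping[Any, Any]`, a checker would not object if the factory produced a mapping with the wrong value type. That gap is what `F[K, M]` is meant to close.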
@harahu - That is an excellent example. I think you described the situation perfectly. I've been in the exact same spot of wanting to write a transformation over a map-like collection, and resorting to some zany nested type signatures, when really all I wanted to do was turn some D[K, V] into a F[K, M] based on some provided type F and a transform func(v: V) -> M.
HKT is a fairly sophisticated tool that many folks do not even know to reach for, but when you are in a situation which benefits it, many doors are unlocked. HKTs often greatly simplify the type signature, like in the example above. It shows up most frequently when you want to "transmute" some object (typically a container type) into a target (container) type, but you want the internal types to be preserved.
This pattern does not seem to show up frequently when writing application code. Most software engineers probably do not need to know about it if they don't want to. However, when writing library code for others to consume, the pattern shows up quite a lot. Probably because it's a fairly high level abstraction, and that kind of abstraction lends itself to reuse.
In my opinion, HKTs would push the envelope for maintainability and ergonomics for maintainers of libraries that embrace the static typing discipline. In the same vein, it would not add any cognitive overhead for those looking to implement libraries, nor those looking to start writing libraries with static types. If anything, I think it has the potential to shorten the learning curve for those with medium experience with type systems (such as myself). I have definitely gotten myself into some really convoluted type signatures trying to express the transformation I wanted. Desire to use HKTs seems to scale with familiarity with type systems and abstracting over generics.
I think it has the potential to shorten the learning curve for those with medium experience with type systems (such as myself).
💯 When you have generics but not HKT, you end up having to twist yourself into knots trying to express what would otherwise be simple. HKT is merely generics that works everywhere instead of generics that only works in some places.
I have two similar protocols:
class A(Protocol):
    def foo(self) -> int: ...
    def bar(self) -> bool: ...
class AsyncA(Protocol):
    def foo(self) -> Awaitable[int]: ...
    def bar(self) -> Awaitable[bool]: ...
The only difference is that in the second case the functions are async. I would be glad to unify them using generics, but of course this does not work:
AT = TypeVar("AT", Union, Awaitable)
class A(Protocol[AT]):
    def foo(self) -> AT[int]: ...
    def bar(self) -> AT[bool]: ...
hey folks, another ORM use case coming in here, this system is in fact what we would have used for sqlalchemy2-stubs had it existed to encapsulate our structure of ColumnElement[TypeEngine[_T]], but as it's not, SQLAlchemy 2.0's typing will likely do what the original sqlalchemy-stubs did which is consider it to be ColumnElement[_T]. this solves most of the practical problems at the expense of looking a little funny and breaking the hierarchy a bit.
references for me at:
I just found out about the Self type that was recently added to Python 3.11. Could that be extended to add support for higher-kinded types?
import typing as t
KT = t.TypeVar('KT')
VT = t.TypeVar('VT')
class BidirectionalMapping(t.Mapping[KT, VT]):
    @property
    def inverse(self: t.Self[KT, VT]) -> t.Self[VT, KT]:
        ...
cc @JelleZijlstra @pradeep90 @Gobot1234 et al.
Using parameters on Self is explicitly disallowed by the PEP, so that seems like a non-starter unless the justifications can be addressed.
Thanks. If you're talking about the "In such cases, we recommend using an explicit type for self" in the https://peps.python.org/pep-0673/#use-in-generic-classes section, this recommendation doesn't work for the use case here. Since the PEP makes no specific mention of higher-kinded types, perhaps this use case wasn't even in mind when the PEP was written?
I meant
Note that we reject using Self with type arguments, such as Self[int]. This is because it creates ambiguity about the type of the self parameter and introduces unnecessary complexity:
I just found out about the Self type that was recently added to Python 3.11. Could that be extended to add support for higher-kinded types?
import typing as t
KT = t.TypeVar('KT')
VT = t.TypeVar('VT')
class BidirectionalMapping(t.Mapping[KT, VT]):
    @property
    def inverse(self: t.Self[KT, VT]) -> t.Self[VT, KT]:
        ...
cc @JelleZijlstra @pradeep90 @Gobot1234 et al.
Yes I would be in support of this, it was something I considered but as HKT wasn't/isn't a thing I didn't think it would be a good idea to include in the PEP
I agree with @pradeep90. I think this would create significant confusion and complexity. It would also deviate from the Self type in other languages.
@pradeep90 writes:
In your case, it would look like: ..
I think there might be a misunderstanding? I am already doing what you proposed, and it does not help with this issue, as I commented here previously.
@erictraut writes:
I agree with @pradeep90. I think this would create significant confusion and complexity. It would also deviate from the Self type in other languages.
Good to know, thank you. Do you have any ideas about how support for HKTs could be added in a way that would match other languages that already support them?
@pradeep90 writes:
In your case, it would look like: ..
I think there might be a misunderstanding? I am already doing what you proposed, and it does not help with this issue, as I commented here previously.
I hadn't read your earlier comment in the thread because I mistakenly assumed this was a fresh (typing-sig) thread.
The snippet I shared works in both Pyre and Mypy (playground link). The same goes for your existing code, where the self is not annotated.
If you're still having trouble, could you repro your case in a Mypy playground snippet?
@pradeep90, please reread #548 (comment). Your playground link does not exercise the functionality this is asking for. You haven't created a subclass of BidirectionalMapping. If you do, you will see that its inherited inverse undesirably returns a BidirectionalMapping, rather than an instance of the subclass.
Do you have any ideas about how support for HKTs could be added in a way that would match other languages that already support them?
I don't consider this an example of HKT. I think of Self as a type that is implicitly specialized with the type parameters of the class that encloses it. In your example, it's effectively bound to the type BidirectionalMapping[KT, VT]. Specializing it a second time therefore doesn't make sense, any more than it would make sense to re-specialize list[int].
What would you consider this, then, and should I open a separate issue for that once I know what to call it?
You haven't created a subclass of BidirectionalMapping. If you do, you will see that its inherited inverse undesirably returns a BidirectionalMapping, rather than an instance of the subclass.
Guido's thoughtful snippet demonstrates what you were trying to get at and is the kind of repro I was asking for (mypy playground, for any future readers trying to follow).
Yeah, this is one of the self cases where you'd need to add explicit stubs to the subclass methods, like you do in frozenbidict. The exact same thing would occur for the generic Container example in the PEP: if we subclassed it, we would need to explicitly specify the subclass method's return type.
If we eventually get HKTs in the type system, that would have to be the solution here rather than co-opting the Self type. As @erictraut mentioned, Self is intended to be a simple replacement for a handwritten TypeVar and helps with almost all cases except the ones in generic classes where we need to tinker with the generic type parameters (as in the PEP's Container or your BidirectionalMapping).
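For readers following along, here is a minimal runnable sketch of the explicit-override workaround described above. The dict-backed storage and the FrozenBidict name are assumptions made for illustration, not bidict's actual implementation; the only point is that the subclass has to restate inverse with its own return type.

```python
from typing import Dict, Iterator, Mapping, TypeVar

KT = TypeVar("KT")
VT = TypeVar("VT")


class BidirectionalMapping(Mapping[KT, VT]):
    """Hypothetical dict-backed base class, for illustration only."""

    def __init__(self, data: Mapping[KT, VT]) -> None:
        self._fwd: Dict[KT, VT] = dict(data)

    def __getitem__(self, key: KT) -> VT:
        return self._fwd[key]

    def __iter__(self) -> Iterator[KT]:
        return iter(self._fwd)

    def __len__(self) -> int:
        return len(self._fwd)

    @property
    def inverse(self) -> "BidirectionalMapping[VT, KT]":
        # Without HKTs, the base class can only promise its own type here.
        return BidirectionalMapping({v: k for k, v in self._fwd.items()})


class FrozenBidict(BidirectionalMapping[KT, VT]):
    @property
    def inverse(self) -> "FrozenBidict[VT, KT]":
        # Explicit override: restate the swapped parameters for this subclass.
        return FrozenBidict({v: k for k, v in self._fwd.items()})


b = FrozenBidict({"answer": 42})
inv = b.inverse  # annotated as FrozenBidict[int, str]
```

Every subclass that wants a precise return type has to repeat this override, which is the boilerplate HKTs would remove.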
Maybe I'm missing something, but if Self is meant to act as shorthand for a bound TypeVar, then surely this example, which could be written as:
SelfBidirectionalMapping = TypeVar("SelfBidirectionalMapping", bound=BidirectionalMapping[Any, Any])
class BidirectionalMapping(t.Mapping[KT, VT]):
    @property
    def inverse(self: SelfBidirectionalMapping[KT, VT]) -> SelfBidirectionalMapping[VT, KT]:
        ...
m: BidirectionalMapping[int, str]
y = m.inverse
reveal_type(y)  # => `BidirectionalMapping[str, int]`
could be written using Self as shown in #548 (comment)?
Thanks @Gobot1234, but the SelfBidirectionalMapping[KT, VT] in that snippet (and similarly the SelfBidirectionalMapping[VT, KT]) cause error: Type variable "SelfBidirectionalMapping" used with arguments.
In any case, I don't think this is getting the job done.
Here's a more complete example based on your snippet, demonstrating that we do need HKTs (or some other not-yet-implemented feature) to make this use case work:
from collections import UserDict
import typing as t
KT = t.TypeVar('KT')
VT = t.TypeVar('VT')
SelfBidirectionalMapping = t.TypeVar("SelfBidirectionalMapping", bound=BidirectionalMapping[t.Any, t.Any])
class BidirectionalMapping(t.Mapping[KT, VT]):
    @property
    def inverse(self: SelfBidirectionalMapping[KT, VT]) -> SelfBidirectionalMapping[VT, KT]:  # results in "error: Type variable "SelfBidirectionalMapping" used with arguments"
        ...
class bidict(UserDict[KT, VT], BidirectionalMapping[KT, VT]):
    pass
b = bidict({'answer': 42})
reveal_type(b)  # correct: bidict[str, int]
reveal_type(b.inverse)  # wrong: bidict[str, int] (should be: bidict[int, str])
mypy-play link here: https://mypy-play.net/?mypy=latest&python=3.7&gist=5272ebc066abb4b6b32df0a336c78b00&flags=strict
This is meant to use HKT, is it not? The error is (currently) to be expected.
Going way back, I think there's a potential workaround for the Awaitable-or-None case, but it requires code reorganization because of the lack of higher-kinded types. Something like
# Assume Command[T] is some kind of hacky parametric newtype or container
class Redis:
    def get(self, *args) -> Command[str]:
        ...
        return Command[str](command)
    def set(self, *args) -> Command[None]:
        ...
        return Command[None](command)
    def exists(self, *args) -> Command[bool]:
        ...
        return Command[bool](command)
    # ... and many MANY more ...
class ConnectionExecutor:
    def execute(self, command: Command[T]) -> Coroutine[Any, Any, T]:
        return self.connection.execute(...)
class PipelineExecutor:
    def execute(self, command: Command[T]) -> None:
        self.pipeline.append(command)
And then you can have an executor variable and a redis variable, and instead of redis.set(*args) # type: ???, you have executor.execute(redis.set(*args)), which knows what the return type is because we've got a sensible value from the Redis object's method, passed into the executor, which has a known type.
(I don't know if it makes sense to still have the Redis class in that case, or to just have a module with a bunch of top-level functions.)
(Anyway, note that this means that the wrapper methods explicitly can't wrap multiple calls or anything like that, but writing methods like that is already broken under the original code, so...)
Assuming it still makes sense for all of those classes to exist, then the HKT issue can be made to re-manifest by saying "I'd rather have the Redis and Executor classes combined into one."
I tried to sketch out what typing that would look like, and it got so messy that I'm sure I was overcomplicating it somehow, so I'm going to stop for now.
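A self-contained sketch of the split design above may make the idea easier to try out. Everything here is an assumption for illustration: Command is modeled as a frozen dataclass carrying a hypothetical parse callable, and the "connection" is a plain in-memory dict rather than a real Redis client.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Callable, Dict, Generic, List, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Command(Generic[T]):
    """A description of a call; T is the type of its parsed reply."""
    name: str
    args: Tuple[Any, ...]
    parse: Callable[[Any], T]  # hypothetical reply converter


class Redis:
    """Builds commands only; it never talks to a server itself."""

    def get(self, key: str) -> Command[str]:
        return Command("GET", (key,), str)

    def exists(self, key: str) -> Command[bool]:
        return Command("EXISTS", (key,), bool)


class ConnectionExecutor:
    """Runs a command immediately against a fake in-memory store."""

    def __init__(self, store: Dict[str, Any]) -> None:
        self._store = store

    async def execute(self, command: Command[T]) -> T:
        key = command.args[0]
        if command.name == "GET":
            raw: Any = self._store[key]
        else:  # "EXISTS"
            raw = key in self._store
        return command.parse(raw)


class PipelineExecutor:
    """Only queues commands; nothing is returned per call."""

    def __init__(self) -> None:
        self.pipeline: List[Command[Any]] = []

    def execute(self, command: Command[T]) -> None:
        self.pipeline.append(command)


redis = Redis()
conn = ConnectionExecutor({"greeting": "hello"})
value = asyncio.run(conn.execute(redis.get("greeting")))    # a str
found = asyncio.run(conn.execute(redis.exists("missing")))  # a bool

pipe = PipelineExecutor()
pipe.execute(redis.get("greeting"))  # returns None; the command is queued
```

The return type of each call is decided by the executor, while the reply type T travels with the Command, so no higher-kinded type is needed as long as the two roles stay in separate classes.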
I just recently came across this issue myself and realized that there could be an extremely simple fix using intersection types assuming they would work with type variables.
Example:
from collections import UserList
from typing import List, TypeVar
T = TypeVar("T")
ListType = TypeVar("ListType", bound=List)
def append(L: ListType & List[T], x: T) -> ListType & List[T]:
    L.append(x)
    return L
L: UserList[int] = append(UserList[int](), 0)  # Should pass, as it is both UserList and List[int].
In this case, ListType & List[T] is the desired behavior for ListType[T], the HKT desired in this thread.
This would, however, require type intersections to support generics, i.e. UserList[int] == UserList & List[int].
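For comparison, here is a sketch of what can be written today without intersections. It is an illustration under stated assumptions, not part of the proposal: a bound TypeVar preserves the container type, but the element type has to be loosened to Any, which is the information ListType & List[T] is meant to recover. MutableSequence is used as the bound because UserList is a MutableSequence at runtime rather than a list subclass.

```python
from collections import UserList
from typing import Any, MutableSequence, TypeVar

SeqType = TypeVar("SeqType", bound=MutableSequence[Any])


def append(seq: SeqType, x: Any) -> SeqType:
    # The container type round-trips, but nothing ties x to the element type.
    seq.append(x)
    return seq


ul = append(UserList([1, 2]), 3)  # the UserList container type is preserved
plain = append([1, 2], "oops")    # accepted: the element type is unchecked
```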
@SimpleArt that's neat and I can imagine that would work in some cases, but I don't believe it solves the full problem. For example, unless I'm mistaken, you can't use it in functor*
class Functor(ABC):
    def map(self: F[A], f: Callable[[A], B]) -> F[B]:
        ...
because the type variable F can't be applied to type variables A or B in the way List can in your example.
*Someone tell me if I've got functor wrong. I'm not clear what it looks like in Python.
@joelberkeley It would. I'm suggesting that F[A] in your example be interpreted as F & F.bound[A], where F.bound is the bound for your type variable, e.g. TypeVar("F", bound=List).
In this case it looks like F is the previously mentioned Self type. Fully written out, your example would be:
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import Callable, Generic, TypeVar
A = TypeVar("A")
B = TypeVar("B")
Self = TypeVar("Self", bound="Functor")
T = TypeVar("T")
class Functor(ABC, Generic[T]):
    @abstractmethod
    def map(self: Self & Functor[A], f: Callable[[A], B]) -> Self & Functor[B]:
        ...
Self ensures functorB = functorA.map(f) makes functorB retain the same subclass as functorA.
Functor[A] -> Functor[B] ensures functorB = functorA.map(f) makes functorB retain the correct type variable instead of just Any.
F doesn't have a bound in my code, but maybe let's wait for someone else to chime in because I'm not very confident in my definition of Functor (specifically, whether Functor should be generic in F, the "container type" if you will).
What I do believe, however, is that Functor isn't generic in T (the "element type" if you will) so you wouldn't be able to write Functor[A]. Indeed, I think it's core to the definition of Functor that it's not generic in T.
@joelberkeley Perhaps we are misunderstanding each other. The self parameter must* have type Functor in your example; that's how Python methods work. That means Functor must be a subtype of F[A], which, given what you inherit from, won't be the case.
Perhaps what you really wanted was another separate argument for the F[A]? Or, like I suggested, it is Functor which should be Generic? Or maybe you meant Functor[F[A]]?
I think what's confusing me is that I'm trying to inherit Functor in its container type, rather than make instances of it (which is the more usual approach). In that case, Functor is like
class Functor(ABC, Generic[F]):
    def map(self, f: Callable[[A], B], xs: F[A]) -> F[B]:
        ...
class ListFunctor(Functor[List]):
    def map(self, f: Callable[[A], B], xs: List[A]) -> List[B]:
        return [f(x) for x in xs]
strs: List[str] = ListFunctor().map(str, [1, 2])
Hopefully that clears up the confusion.
@joelberkeley Then this is what it would look like with my suggestion:
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import Callable, Generic, List, TypeVar
T = TypeVar("T")
class Kind1(Generic[T]):
    """A generic with 1 parameter."""
    pass
A = TypeVar("A")
B = TypeVar("B")
F = TypeVar("F", bound=Kind1)
class Functor(ABC, Generic[F]):
    @abstractmethod
    def map(self: Functor[F], f: Callable[[A], B], xs: F & Kind1[A]) -> F & Kind1[B]:
        ...
This ensures the following types should be preserved:
def map(self: Functor[F], f: Callable, xs: F) -> F:
def map(self: Functor, f: Callable[[A], B], xs: Kind1[A]) -> Kind1[B]:
I believe that by bounding F on Kind1, you're implicitly recreating the List example. For example, you can't do this with types you don't own, which don't subclass Kind1, including stdlib types. That might be resolvable with a Protocol, but you'll have other problems if you try to do that, such as Protocols requiring covariant type parameters. I don't believe it's appropriate to require covariance in a general functor.
Interestingly, some type checkers such as mypy already treat self as an intersection of types. Here's a minimal example:
from typing import Generic, TypeVar
Self = TypeVar("Self", bound="Parent")
T1 = TypeVar("T1")
T2 = TypeVar("T2")
class Parent(Generic[T1]):  # Binds `self: Parent[T1]` to methods implicitly.
    def add(self: Self, x: T1) -> Self:  # Binds `self: Self` tightly to subclasses.
        ...
class Child(Parent[T1], Generic[T1, T2]):
    ...
x = Child[int, str]()  # Child[int, str]
y = x.add(0)  # Passes, and `Child[int, str]` still.
z = x.add("abc")  # Fails: a `str` argument is incompatible with `x: int` from `Parent[int]`.
In this example, when doing x.add, self is implicitly bound as Parent[int] from the Parent class itself. It is also simultaneously bound as Child[int, str] via Self. In this way, we have two types being bound to self simultaneously.
I'm currently writing a PEP to propose HKTs with the args argument to the TypeVar. Any suggestions are welcome!
Here's what a functor would look like:
from typing import Callable, Protocol, TypeVar
F = TypeVar("F", bound="Functor[T]", args=1)
T = TypeVar("T", covariant=True)
U = TypeVar("U")
class Functor(Protocol[T]):
    def map(self: F[T], function: Callable[[T], U]) -> F[U]:
        ...
Following the bidict example, one could do:
from collections import UserDict
from typing import Mapping, TypeVar
BM = TypeVar("BM", bound="BidirectionalMapping[K, V]", args=2)
K = TypeVar("K")
V = TypeVar("V")
class BidirectionalMapping(Mapping[K, V]):
    @property
    def inverse(self: BM[K, V]) -> BM[V, K]:
        ...
class bidict(UserDict[K, V], BidirectionalMapping[K, V]):
    ...
bd = bidict(value=42)
reveal_type(bd)  # bidict[str, int]
reveal_type(bd.inverse)  # bidict[int, str]
The create example mentioned by @sobolevn is a bit more difficult to express, though, since T is not bound to anything, therefore referring to its type constructor is not exactly trivial.
In general, any code that needs to be generic over types and type constructors needs HKTs in one way or another.
As the simplest example, the wrapping type:
from typing import Callable, Generic, TypeVar
from attrs import frozen
W = TypeVar("W", bound="Wrap[T]", args=1)
T = TypeVar("T", covariant=True)
U = TypeVar("U")
@frozen()
class Wrap(Generic[T]):
    value: T
    def map(self: W[T], function: Callable[[T], U]) -> W[U]:
        return type(self)(function(self.value))
I'm pretty sure it's possible to run into issues related to how __new__ is handled by type checkers, though, which might add to the resulting complexity.
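A runnable approximation of the same class with today's typing, offered as a sketch: dataclasses stands in for attrs to keep it dependency-free, and the Tagged subclass is a hypothetical name added for illustration. At runtime type(self) preserves the subclass, but the annotation can only promise Wrap[U], which is the gap the args=1 idea is meant to close.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class Wrap(Generic[T]):
    value: T

    def map(self, function: Callable[[T], U]) -> "Wrap[U]":
        # Runtime keeps the subclass; the static return type cannot say so.
        return type(self)(function(self.value))


class Tagged(Wrap[T]):
    """Hypothetical subclass that keeps the inherited constructor."""


w = Tagged(1).map(str)  # a Tagged at runtime, annotated only as Wrap[str]
```

This relies on every subclass keeping a compatible constructor; a subclass that changes its __init__ signature would break the type(self)(...) call at runtime.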
It's not clear to me how this would compose with PEP 695 if both were to be accepted. In particular, if the notion of defining TypeVars as global variables were deprecated, how would your proposal work? You will need to consider the scoping rules carefully.
It may be instructive to look at other languages that support HKTs. You'll notice that I did a survey of type parameter syntax for other languages as part of PEP 695 (see Appendix A). I think it would be useful to do a similar survey of other languages that support HKTs. To my knowledge, very few of them do. It would be interesting to understand why that's the case — and why you think this added level of complexity is justified in the Python type system when most languages have not gone this route. I've seen a few examples of where it might be useful, but they strike me as edge cases. I'd want to see evidence that such a feature would be useful to more than just a handful of users because it will not be easy to implement such a feature within the various Python type checkers.
Indeed, PEP 695 does make this feature much more complex to implement.
@nekitdev given the similarity with scala's syntax in that PEP, you may find scala's syntax for higher-kinded types useful
IMO Scala's HKT syntax fits well with PEP 695, though, as Eric has mentioned, HKTs are rather complex to implement type-checker-wise.
Then the monad would look something like this:
from typing import Callable, Protocol
class Functor[F[_]](Protocol):
    def map[T, U](self: F[T], function: Callable[[T], U]) -> F[U]:
        ...
EDIT: renamed to Functor[T]
That syntax approach looks good to me. A variation that may be worth considering is to use ... rather than _.
The _ syntax is a bit more extensive than ...; for example, T[_, _] is an HKT with two generic types.
This essentially would allow an arbitrary number of generic types in HKTs.
I'm not fully sure what implementing the aforementioned Functor would look like. Open for discussion here, I guess.
I was suggesting T[..., ...] for an HKT with two generic types. The character sequence ... is already a defined token in Python that generally means "this is a stand-in for something that will be supplied elsewhere". For example, consider how it's used in a Callable type annotation or a ParamSpec specialization.
By contrast, the character _ is not a dedicated token but an identifier, and it typically means "I don't care about the value of this identifier". PEP 634 did extend _ to mean a wildcard pattern, so maybe it's OK to extend its meaning for this use case as well, but ... strikes me as a closer fit in terms of established semantics.
Alright, I agree that ... might be nicer (and cuter, for that matter). Back to implementing Functor, though:
from attrs import frozen
@frozen()
class Option[O[...]](Functor[T]):
    value: Optional[T]
    def map[U](self: O[T], function: Callable[[T], U]) -> O[U]:
        value = self.value
        if value is None:
            return self
        return type(self)(function(value))  # preserve types, allowing to inherit from `Option[T]`
I feel like this should make sense here.
That's an interesting problem: Scala and Haskell don't have subclassing of concrete classes. Do any languages have that and higher-kinded types? What about higher-kinded types and covariance? Maybe it's no different to interface inheritance with HKTs
BTW I don't think you'd parametrize over O[...] in Option. There would be a number of problems with that, e.g. how would you type hint it (Option[Option[Option ad infinitum], int])? You might drop O[...] in that line and use Self, but then, given that, might you use Self in Functor? How does Self fit into all this?
Also, are PEP details on topic for this thread? I don't want to spam people.
On that idea, we could throw things away altogether and allow parametrizing Self instead.
For instance, the Functor would be as follows:
class Functor[T](Protocol):
    def map[U](self, function: Callable[[T], U]) -> Self[U]:
        ...
@frozen()
class Option(Functor[T]):
    value: Optional[T] = None
    def map[U](self, function: Callable[[T], U]) -> Self[U]:
        value = self.value
        if value is None:
            return self
        return type(self)(function(value))  # apply the function before re-wrapping
I think being able to parameterise Self only solves part of the problem, it would still leave functions like all untypeable
Could you by any chance give an example?
Anywhere Self isn't defined, like a free function
def map[F[...]: Functor, A, B](x: F[A], f: Callable[[A], B]) -> F[B]:
    return x.map(f)
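To make the limitation concrete, here is a sketch of the closest free function that can be written today, using a Protocol. The names fmap and Box are assumptions for illustration. Without an F[...] parameter, the declared result is only "some Functor[B]", so the concrete container type is lost statically even though it survives at runtime.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Protocol, TypeVar

A = TypeVar("A")
B = TypeVar("B")
A_co = TypeVar("A_co", covariant=True)


class Functor(Protocol[A_co]):
    def map(self, f: Callable[[A_co], B]) -> "Functor[B]":
        ...


def fmap(x: Functor[A], f: Callable[[A], B]) -> Functor[B]:
    # Without HKTs the result is only known to be *some* Functor[B].
    return x.map(f)


@dataclass(frozen=True)
class Box(Generic[A]):
    """Hypothetical container used only for this illustration."""
    value: A

    def map(self, f: Callable[[A], B]) -> "Box[B]":
        return Box(f(self.value))


boxed = fmap(Box(1), str)  # declared as Functor[str]; a Box at runtime
```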