ocsigen / ts2ocaml

Generate OCaml bindings from TypeScript definitions via the TypeScript compiler API

Use `include module type of` instead of `cast` for inheritance

tmattio opened this issue · comments

When encountering type inheritance in TypeScript, we could generate code with

include module type of struct include XXX end

instead of the current cast functions.

This would make the semantics closer to TypeScript's. For instance, this code:

let () =
  let file = Fs.createWriteStream ~path:"file.jpg" () in
  let (_ : Http.ClientRequest.t) =
    Https.get
      ~url:(`String "https://i3.ytimg.com/vi/J---aiyznGQ/mqdefault.jpg")
      ~callback:(fun ~res ->
        let (_ : Stream.Writable.t) = Http.IncomingMessage.pipe res file () in
        ())
      ()
  in
  ()

works only if the type Fs.WriteStream is a Stream.Stream.

See #17. Also, without Internal.Types you would need casts anyway when B extends A and you want to pass a value of type B to a function taking an argument of type A.

> Also, without Internal.Types you would need casts anyway when B extends A and you want to pass a value of type B to a function taking an argument of type A.

I don't think that's true, unless I misunderstood what you're saying:

module A : sig
  type t

  val pp : t -> unit

  val create : string -> t
end = struct
  type t = {
    name : string
  }

  let pp t = print_endline t.name

  let create name = {name}
end

module B : sig
  include module type of struct include A end
end = struct
  include A
end

let () =
  let (b : B.t) = B.create "test" in
  A.pp b

Compiles and prints "test" as expected

Ah, I now see the point. I was somehow assuming include module type of ... would create a new B.t having nothing to do with the original A.t, but that was not the case.

Then here is the problem I've been trying to avoid: the following compiles...

let f (_: A.t) = ()
let g (_: B.t) = ()

let test () =
  let (a: A.t) = A.create "a" in
  let (b: B.t) = B.create "b" in
  f a; (* should compile *)
  f b; (* should compile *)
  g a; (* should NOT compile *)
  g b; (* should compile *)
  ()
;;

> g a; (* should NOT compile *)

Right, I thought about that as well but didn't find a solution with OCaml's module system 😕 Still better than requiring manual casting IMO, but I see how this is problematic.

Also, include module type of ... can introduce overloaded functions. Of course I can check every class and interface it inherits and append ' (prime) suffixes to the names of the overloaded functions, but then I would rather emit everything directly to the module.

In fact, the current system can handle inheritance well: the [ .. ] intf types are for this purpose.

type -'a intf

with this, if we define the types A and B as the following:

module A = struct
  type t = [ `A ] intf
  let create _ : t = Obj.magic ()
end

module B = struct
  type t = [ `A | `B ] intf
  let create _ : t = Obj.magic ()
end

then you can use the built-in cast operator :> to cast them around, and g (a :> B.t) actually fails to compile!

let f (_: A.t) = ()
let g (_: B.t) = ()

let test () =
  let (a: A.t) = A.create "a" in
  let (b: B.t) = B.create "b" in
  f a; (* should compile *)
  f (b :> A.t); (* should compile *)
  g (a :> B.t); (* should NOT compile *)
  g b; (* should compile *)
  ()
;;
Error: Type A.t = [ `A ] intf is not a subtype of B.t = [ `A | `B ] intf 
Type [ `A | `B ] is not a subtype of [ `A ] 
The second variant type does not allow tag(s) `B

With the current implementation, if A has a method foo, you would need to call it as A.foo (b :> A.t) .... If you want to do B.foo b ..., then we are forced to compute the overloaded function names, and again we can just dump everything from A to B without using include module type of....

> then you can use the built-in cast operator :> to cast them around

But then this is similar to calling cast, just with an infix operator. To be honest, I don't think this results in user-friendly bindings: relying on an operator that most users won't know about, for something that is transparent in TypeScript, does not sound like a good idea.

Then there's the fact that I have been inlining the types and making them abstract for all of the bindings I've generated, as I believe it will produce a better API for end users. So relying on the poly variant is not compatible with what I have been doing.

I think it boils down to where you put the threshold of user-friendliness vs correctness, and it's the same for other API discussions in other issues (e.g. #17).

Using poly variants is indeed more correct than including other modules' interfaces, but at the cost of making the API less intuitive.

In my opinion, the "user-friendliness" of the API is more important than the correctness of the bindings, but I can see how you would prefer the latter. Whenever there's a choice to make like this, I hope there could be an option for the "unsafe" behavior, so in this specific case, something like --unsafe-include-module-type

> But then this is similar to calling cast, just with an infix operator.

I know this is a small thing, but if C extends B and B extends A, then it is f (B.cast (C.cast c)) versus f (c :> A.t) (cast can be overloaded, so it can become worse, like f (B.cast'' (C.cast''' c))).
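To make the three-level case concrete, here is a minimal self-contained sketch. The Intf module is a hypothetical stand-in for the generated intf type (the real bindings wrap JavaScript objects, not strings), and A.describe is an invented method; only the tagging scheme and the :> coercion come from the discussion above.

```ocaml
(* Hypothetical stand-in for the binding's phantom-tagged object type.
   The parameter is contravariant, so a value carrying more tags can be
   coerced to a type carrying fewer tags. *)
module Intf : sig
  type -'tags t
  val make : string -> 'tags t
  val name : 'tags t -> string
end = struct
  type 'tags t = string
  let make s = s
  let name s = s
end

module A = struct
  type t = [ `A ] Intf.t
  (* An invented method that only accepts A.t. *)
  let describe (x : t) = "A:" ^ Intf.name x
end

module B = struct
  type t = [ `A | `B ] Intf.t
end

module C = struct
  type t = [ `A | `B | `C ] Intf.t
  let create name : t = Intf.make name
end

let c = C.create "c"

(* A single coercion skips the intermediate class B entirely. *)
let via_a = A.describe (c :> A.t)

(* Going through B step by step is also accepted, but never required. *)
let via_b_then_a = A.describe ((c :> B.t) :> A.t)

let () = print_endline via_a
```

The coercion in the other direction, (A-typed value :> C.t), is rejected by the type checker, which is the safety property the tags are meant to provide.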

> Then there's the fact that I have been inlining the types and making them abstract for all of the bindings I've generated, as I believe it will produce a better API for end users.

Wouldn't the API be acceptable if we did the following:

  • Add every function from module A to module B so that you can write B.foo b ... for the method foo inherited from A
  • For methods that don't belong to A but require A as an argument, replace A.t with [> `A] intf to allow B to be passed.
    • if exposing [> `A] intf is confusing for users, we can define a helper type like A.t_inherited as below:
module A = struct
  type t = [ `A ] intf
  type 't t_inherited = 't intf constraint [> `A ] = 't
  let create _ : t = Obj.magic ()
end

module B = struct
  type t = [ `A | `B ] intf
  let create _ : t = Obj.magic ()
end

let someFunctionUsingA (a: _ A.t_inherited) = ()

let test () =
  let b = B.create () in
  someFunctionUsingA b;
  ()
;;
  • Reduce the usage of Internal.Types._TypeName as much as possible
    • I don't have any specific idea to do this but I think I can use module types for this.
    • It would be a lot of pain to do so but I think we can use recursive modules and completely remove the Internal modules.
      • I remember that Merlin used to fail to handle recursive modules properly. I don't know whether that has improved since.
      • Also I don't know whether recursive module signatures work well with gen_js_api, although I can write a PR to add support for it by myself.
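The first two bullets can be sketched together as follows. This is only an illustration under assumptions: Intf is a hypothetical stand-in for the generated intf type, and foo, bar, and uses_a are invented names, not anything ts2ocaml emits.

```ocaml
(* Hypothetical stand-in for the phantom-tagged object type. *)
module Intf : sig
  type -'tags t
  val make : string -> 'tags t
  val name : 'tags t -> string
end = struct
  type 'tags t = string
  let make s = s
  let name s = s
end

module A = struct
  type t = [ `A ] Intf.t
  (* Helper type: "anything that has at least the tag `A". *)
  type 'tags t_inherited = 'tags Intf.t constraint 'tags = [> `A ]
  let create name : t = Intf.make name
  let foo (x : _ t_inherited) = "foo " ^ Intf.name x
end

module B = struct
  type t = [ `A | `B ] Intf.t
  let create name : t = Intf.make name
  (* Inherited method re-emitted in B, so users can write B.foo b. *)
  let foo (x : t) = A.foo x
  let bar (x : t) = "bar " ^ Intf.name x
end

(* A function outside A that takes "an A": no cast needed at call sites. *)
let uses_a (x : _ A.t_inherited) = A.foo x

let a = A.create "a"
let b = B.create "b"
let r1 = uses_a a
let r2 = uses_a b
let r3 = B.foo b
```

With this shape, call sites never mention :> or the tag list; the row type only shows up in signatures and error messages.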

With the above, I suppose the users would generally see the same API (especially if the Internal modules are removed using recursive modules).

> I hope there could be an option for the "unsafe" behavior, so in this specific case, something like --unsafe-include-module-type

Yes, I think I can make something like "simple mode" which maximizes the user-friendliness at the cost of type safety.

That's already a big improvement in terms of the simplicity of the API IMHO!
I wish there was a way to represent subtyping on abstract types (I'm not deeply familiar with the type system, maybe there is?).

So the two alternatives would be:

type -'a intf

module A : sig
  type t = private [ `A ] intf

  type 't t_inherited = private 't intf constraint [> `A ] = 't

  val fn_a : t -> unit

  val create : 'a -> t
end

module B : sig
  type t = private [ `A | `B ] intf

  val fn_a : t -> unit

  val fn_b : t -> unit

  val create : 'a -> t
end

and

module A : sig
  type t

  val fn_a : t -> unit

  val create : 'a -> t
end

module B : sig
  include module type of struct
    include A
  end

  val fn_b : t -> unit

  val create : 'a -> t
end

With the second being simpler, but at the cost of safety. I might get convinced by the first one, but I'm still wary of the error messages, and editor signatures. If e.g. the type [ `A | `B ] intf is used in either of those instead of B.t, I'd tend to prefer the second alternative for the sake of making the API more usable.

I found that private doesn't play nice with the polymorphic variant approach:

  type -'inherits intf

  module A = struct
    type t = private [ `A ] intf
    type 't t_inherited = private 't intf constraint [> `A ] = 't
    let create () : t = Obj.magic ()
  end

  module B = struct
    type t = private [ `A | `B ] intf
    let create () : t = Obj.magic ()
  end

  let someFunctionUsingA (_a: _ A.t_inherited) = ()
  let a = A.create ()
  let b = B.create ()

  let test () =
    someFunctionUsingA a;
    someFunctionUsingA b;
    someFunctionUsingA (b :> A.t)

In the above example, none of the someFunctionUsingA calls compiles (not even someFunctionUsingA a).

This expression has type A.t but an expression was expected of type
  [> `A ] A.t_inherited
This expression has type B.t but an expression was expected of type
  [> `A ] A.t_inherited
Type B.t is not a subtype of A.t

So it seems we can't really hide [..] intf from users.


I am also testing the include module type approach, but I found that Merlin simply displays B.t as A.t, so users would not be able to easily distinguish B from A.

screenshot

I'm now wondering whether the polymorphic variant approach (without private) is actually better considering the experience users will get in Merlin, because they can at least distinguish B from A by looking at the type. Also, Merlin at least shows B.t in CodeLens for some reason:

screenshot 2

If we really want to force Merlin to show A.t and B.t as they are, we would have to define A.t and B.t with private or as abstract types, which would imply the hell of casts.

In conclusion, I think using polymorphic variants with the improvements I described in #14 (comment) is the "least bad" option. What do you think?

Also here is a real-world example.

screenshot

I can imagine if we used include module type every type would be shown as the most parent type (in this example, members would be just shown as Ts.Node.t list) and users would struggle to find out which types (in this example, Ts.ClassElement.t) can really be used.

The fact that Merlin shows A instead of B is not specific to include module type and can be reproduced with a type equality as well:

module A : sig
  type t

  val fn_a : t -> unit [@@js.call]

  val create : 'a -> t [@@js.new "A"]
end

module B : sig
  type t = A.t

  val fn_b : t -> unit [@@js.call]

  val create : 'a -> t [@@js.new "B"]
end

For:

B.create ()

Merlin will show A.t, but when running this in the toplevel, we get _ : B.t = <abstr>. So I would say it might be a bug with Merlin and we could open an issue to see what Merlin maintainers think.

However, assuming this won't change in Merlin, I still think the include module type option produces a better API for end users. The fact that the highest parent is shown kinda makes sense and gives important information to the user on the fact that B.t is an A.t. Moreover, it is consistent with the function signatures given by Merlin, so if I define a function on B.t:

module C : sig
  val fn_c : B.t -> unit [@@js.call]
end

Merlin will show fn_c with the highest parent type:

image

The opposite would have been very confusing indeed, but the fact that Merlin is consistent on this makes it Ok IMO.

Well, I was late for the boat, but I generally agree with @cannorin.

Sadly, include module type of does not work with recursive modules (ocaml/ocaml#6818). It means it would break in every case where

  • type B inherits type A, and
  • A has some method/field/property that accepts/returns B

I think I can create a "simple" mode which relaxes the emulation of subtyping (just Ojs.t instead of [ .. ] intf, for example), but I won't be doing include module type of because I don't want to generate code that does not compile and is not easy to fix.

You would have to inline every include module type of until the compilation errors disappear.
So why don't we inline everything in the first place? Same user experience, just larger output files...
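As a rough illustration of what "inlining everything" could look like in the problematic case (B inherits A, and A has a method returning B), here is a sketch using recursive modules with fully written-out signatures. The string representation and the names name, child, and extra are assumptions made for the example; actual generated code would use gen_js_api attributes instead of hand-written implementations.

```ocaml
module rec A : sig
  type t
  val create : string -> t
  val name : t -> string
  (* A method of A that returns the subclass B. *)
  val child : t -> B.t
end = struct
  type t = string
  let create s = s
  let name s = s
  let child s = B.create (s ^ "/child")
end

and B : sig
  (* What "include module type of struct include A end" would have
     provided, written out by hand so that no include is needed. *)
  type t = A.t
  val create : string -> t
  val name : t -> string
  val child : t -> t
  (* B's own addition. *)
  val extra : t -> int
end = struct
  type t = A.t
  let create s = A.create s
  let name x = A.name x
  let child x = A.child x
  let extra x = String.length (A.name x)
end

let root = A.create "root"
let kid = A.child root
let () = print_endline (B.name kid)
```

As with the include-based design, B.t is equal to A.t here, so this keeps the same user-facing API and the same loss of safety; the only difference is that the signature of B is spelled out in full.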