morganstanley / hobbes

A language and an embedded JIT compiler

Home Page:http://hobbes.readthedocs.io/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

HasField

adam-antonik opened this issue · comments

I'd like to write instances for records which have a specific field with a specific type. HasField seems to be what I want. If I write

thing1 = {name="Steve", getValue=\().3} 

instance (Add a b c, HasField 1 x "getValue" (() -> a) ()) => Add x b c where
 l + r = l.getValue() + r

then this seems to work, thing1 + 1 evaluates to 4. If I try something more complicated though...
Suppose I define

class Op2 a c e b d f | b -> a, d -> c, f -> e, b d -> f where
  op2 :: ((a,c)->e, b, d) -> f

instance (Add a b c, Op2 a b c d e f) => Add d e f where
  x + y = op2((+), x, y)
instance (Multiply a b c, Op2 a b c d e f) => Multiply d e f where
  x + y = op2((*), x, y)

The idea is that defining how to do a binary operation then defines + and *, so

instance Op2 int int int [int] [int] [int] where
  op2 op l r = mapC(\p.op(p.0, p.1), zip(l, r))

then lets us write [1,2] + [3,4] * [10]

But if I now try to do this

thing2 = {getList=\().[3]}
instance (Op2 a b c d e f, HasField 1 x "getList" (() -> d) ()) => Op2 a b c x e f where
  l + r = l.getList() + r 

and say thing2 + [1], then the interpreter segfaults, in some infinite loop in findBidiMatches.
Interestingly, just having the above definition now also makes thing1 + 1 segfault in the same way. So I'm doing something horribly wrong, but not too sure exactly what.

That's a useful way to overload arithmetic, I've done that as well.

These kinds of segfaults can definitely happen by design, similar to UndecidableInstances in Haskell. I have hit this a few times as well, generally it comes down to an unbound recursion between type classes.

I don't think that it's a loop in findBidiMatches (at least that's not what I see with the shell in lldb just now) but instead in refine during type inference where it tries to refine and dismiss type constraints.

I can add a feature to trace the constraints it's trying to resolve, then we should see in this case that there's a cycle (it tries to solve constraint A by solving constraint B which tries to solve A ...). I don't see offhand where that's happening here but will come back when I have a bit more time.

For what it's worth, there's also a shorthand for the "HasField" constraint that's a little bit more convenient:

$ cat foo.qq

foo :: (o.getValue :: (() -> a), Multiply int a b) => o -> b
foo o = 42 * o.getValue()

$ hi -s foo.qq
> foo({getValue=\_.7})
294

Ah, that shorthand is much nicer.
I'm not seeing findBidiMatches either now, which is odd; in any case having a debug print of constraint resolution would be helpful.

On a potentially related note, would you mind expand on what

fieldValue(s) :: (s/lbl :: x) => x

is up to?

Sure, that's another syntax for the same constraint (so these are all different ways to write the HasField constraint). The difference is that with the (o.lbl :: T) => constraint syntax, you're making HasField _ _ "lbl" _ _ (i.e.: "lbl" is the literal name of the field you want to read), whereas with the (o/lbl :: T) => constraint syntax, you're making HasField _ _ lbl _ _ (i.e.: "lbl" is a variable that might resolve to different literal strings in different contexts).

A simple example is where we want to "lift" all fields up through "maybe" types (you can do the same for array types). A less abstract example is where we have some large universe of data (like a structured log file) and then different types to give a "biased view" into that data (e.g. we just want to be able to see information relevant to one specific order).

(One small detail, the "HasField" constraint is one level of abstraction up because we have other kinds of constraints that want to hook into it besides type classes, but the type class hook is called "SLookup" for "static lookup" since the field to lookup is known statically).

So an example might look like:

$ cat fields.hob

// lift any property through "maybe" values
instance (o/f::t) => SLookup (()+o) f (()+t) where
  slookup o = mapm(fieldValue::(_/f::_)=>_, o)

// some global data (e.g. an application log)
m = [(o, {x = 1, y = "order #"++show(o), z = 3.14159*o, w = [1..10], u = [{x=i,xx=i*i}|i<-[o..o+10]]}) | o <- [0..100]]

// some partial view of global data (e.g. just order-specific data)
data Order = int

lookupOrder :: int -> (()+Order)
lookupOrder x = unsafeCast(just(x))

toID :: Order -> int
toID = unsafeCast

// make it possible to view the order projection of global data
instance (TypeOf `m` [_*mt], mt/lbl :: ft) => SLookup Order lbl ft where
  slookup o = (fieldValue::(_/lbl::ft)=>_)(m[toID(o)].1)

$ ./hi -s fields.hob
> lookupOrder(42).u
 x   xx
-- ----
42 1764
43 1849
44 1936
45 2025
46 2116
47 2209
48 2304
49 2401
50 2500
51 2601
52 2704

This last expression "looks" pretty straightforward, but it's calculated first by going through the first field overload that's generic for "maybe" types, and then within that going through the next field overload that takes a field access on an "Order" to be a lookup in this global table.

I hope that explains the feature well enough -- I've lost several people at that explanation because (I think) it confuses them, but it's actually a very useful feature that we do use in a lot of different ways.

And maybe a simpler example just with arrays:

$ cat afields.hob

instance (t/f::ft) => SLookup [t] f [ft] where
  slookup ts = map(fieldValue::(_/f::_)=>_, ts)

$ hi -s afields.hob
> sqs = [{x=i, xx=i*i} | i <- [0..100]]
> sqs.xx
[0, 1, 4, 9, 16, 25, 36, ...]

By the way, out of curiosity, why not do your arithmetic lift like this?

$ cat arith.hob

instance (Add a b c) => Add [a] [b] [c] where
  xs + ys = [x + y | (x,y) <- zip(xs,ys)]
instance (Add a b c) => Add [a] b [c] where
  xs + y = [x + y | x <- xs]
instance (Add a b c) => Add a [b] [c] where
  x + ys = [x + y | y <- ys]

instance (Multiply a b c) => Multiply [a] [b] [c] where
  xs * ys = [x * y | (x,y) <- zip(xs,ys)]
instance (Multiply a b c) => Multiply [a] b [c] where
  xs * y = [x * y | x <- xs]
instance (Multiply a b c) => Multiply a [b] [c] where
  x * ys = [x * y | y <- ys]

$ hi -s arith.hob
> (3 * [1..10]) + ([1..10] + 2)
[6, 10, 14, 18, 22, 26, 30, 34, 38, 42]

Hi adam,

I created a pull request to add an option to the shell to allow you to see constraints as they're resolved (#344). This was enough to see what was going on in this case, basically your Op2 constraint resolution doesn't terminate. It decides that it can make an Op2 instance for a set of type variables iff it can make an Op2 instance for a set of type variables -- which this option will make evident by printing out each of the Op2 constraints it tries to resolve.

Hopefully this feature makes it easier to understand how your type class resolution happens (or why it diverges if it does).

With that generic feature aside (hopefully it helps reduce frustration in the future at least), it looks like the problem is there in the bit that you added at the end (like you said):

thing2 = {getList=\().[3]}
instance (Op2 a b c d e f, HasField 1 x "getList" (() -> d) ()) => Op2 a b c x e f where
  l + r = l.getList() + r 

Actually this definition should have been rejected straight away, because it purports to introduce a scheme for making Op2 instances, but it just supplies an implementation of a + function (which is not a member of the Op2 class). So that's a small improvement that could be made just to reject this definition earlier.

But suppose that you did intend for this to generate new Op2 instances, then you can tell immediately that this definition introduces an infinite loop because its variables aren't constrained and aren't decreasing in size before evaluating a conditioned Op2 guard (which will itself use this same instance generated recursively to decide if it is satisfied, etc until stack overflow).

I think that what you actually want at the tail of this script is an Add instance generator conditioned on Op2 (which is totally safe), and another Add instance generator in case of a type with that getList function (which is also totally safe).

If I replace that bit of your script with this instead:

instance (Op2 l r t [l] [r] x) => Add [l] [r] x where
  l + r = op2((+), l, r)

thing2 = {getList=\().[3]}
instance (l.getList::(() -> t), Add t r x) => Add l r x where
  l + r = l.getList() + r

Then I'm able to have what I think are your expected interactions:

> [1,2,3] + [4,5,6]
[5, 7, 9]
> {getList=\().[1,2,3]} + [4,5,6]
[5, 7, 9]

I hope that this has helped. :)

By the way, out of curiosity, why not do your arithmetic lift like this?

I certainly started like you did above, but then I had to duplicate the logic for another type, and that's a lot of boiler-plate to rewrite. I also wasn't really happy with the mixing of the broadcasting and arithmetic logic. By using the slightgly odd Op2 class, we avoid both those issues. Lifting arithmetic to a new type we just need to define two new instances, op2, and op1, its unary cousin, rather than the 12 for +,-,*,/, plus handful of others.

Thanks for the code in the PR, that looks very helpful.
You also gently point out I'm an idiot and defined the wrong class instance function. Sorry about that :)

It looks like I can do what I want with

instance (o.getList :: (() -> [int])) => Convert o [int] where
  convert x = x.getList()

instance (x != d, Convert x d, Op2 a b c d e f) => Op2 a b c x e f where
  op2 op l r = op2(op, convert(l) :: d, r)

Actually, that's not exactly right, it works for the left variant as above, but not on the other case with getList on the right. Strange.

If I define

 Convert int [int] where
   convert x = [x]

instance (x != d, Convert x d, Op2 a b c d e f) => Op2 a b c x e f where
  op2 op l r = op(convert(l) :: d, r)

instance (x != e, Convert x e, Op2 a b c d e f) => Op2 a b c d x f where
  op2 op l r = op(l, convert(r) :: e)

then 1 + [1] and [1] + 1 work. You mentioned above that as this was unguarded it would necessarily loop, but I don't know see the difference between the record thing and the int to [int] promotion, and why the left variant should work and the right not.

Ok, so this seems to work

instance (Convert x d) => Op2 a b c x e f where
  op2 op l r = op2(op, convert(l), r)

instance (Convert x e) => Op2 a b c d x f where
  op2 op l r = op2(op, l, convert(r))

We don't need to assert a pre-existing instance of Op2, just show we can create something which unification can sort out.

One more question on deconstructing records in instances. Suppose I write

class Foo x where
 foo :: x -> x

instance (x={int*t}) => Foo x where
 foo x = recordAppend({a=1}, recordTail(x))

foo({a=10}) then returns {a=1}, but foo({a=10, b=2}) complains that "Cannot unify types: { .pa:int, b:int } != { b:int }", why does recordTail seem to have a '.pa' in it's return type? This isn't what one sees just typing

> x = {a=10, b=2}
> recordAppend({a=1}, recordTail(x))
{a=1, b=2}

in the repl.

I see, Op2 helps you avoid that boilerplate. But are you really filling out Op2 with each primitive type? The way I wrote those array lifts for Add, it'll work with any nesting of arrays (e.g. [[1,2], [3,4]] + [5,6] == [[6,7], [9,10]]). It is a good idea to avoid writing all of that same code for each operator anyway.

On the recordTail issue, that's a very good question and there's an interesting story here. Let me turn your question into a question (it's what I asked myself when I implemented this part originally).

Do you think that recordTail should actually allocate and construct a new record, or should it just give a different view into the same memory for the already-constructed record?

If you think that it should allocate and construct, then reasonable uses of recordTail to generically deconstruct a record type suddenly become much more expensive. For example, the Show instance for records uses this to walk through the fields one by one to recursively show them. If recordTail did allocate and construct, then showing a record of n fields would take O(n^2) time and space.

If you think that it shouldn't allocate and construct, then you have to come up with a description of the "tail" type that is consistent with requirements for memory layout and alignment (that fields are placed at offsets aligned to their type). For example, what's the tail type of {x:int, y:int, z:double}? If it's just {y:int, z:double}, then we'd expect to see 4 bytes of padding between y and z (besides that the field offset y=0 is not right, though z=8 is right). But we don't need padding there, because there was a leading 4 bytes for x that we've hidden with recordTail.

So I went the route of not allocating in recordTail and doing this thing with the record type to hide the fields in the prefix of the type passed by recordTail. It works well for the initial problem I was worried about (efficiently deconstructing record types generically) but it starts an awkward conversation when it creeps through like this.

Maybe there's another good way to do it efficiently, I'd be happy to consider it.

HOWEVER, all of that aside, I bet you're asking this because you want to also lift arithmetic operations into tuples/records. I have done this, working around the funny business with record deconstruction, in two passes -- first (at compile time) calculating the result type, and then second (at run time) to allocate and modify the result structure.

Here's what I mean:

$ cat rec-arith.hob

class AddRecordType l r x | l r -> x
instance AddRecordType () () ()
instance (l={lbl:a*lt}, r={lbl:b*rt}, Add a b c, AddRecordType lt rt xt, {lbl:c*xt}=x) => AddRecordType l r x

class (AddRecordType l r x) => AddRecordInto l r x where
  addRecordInto :: (l, r, x) -> ()
instance AddRecordInto () () () where
  addRecordInto _ _ _ = ()
instance AddRecordInto l r x where
  addRecordInto l r x = do { recordHeadValue(x) <- recordHeadValue(l) + recordHeadValue(r); addRecordInto(recordTail(l), recordTail(r), recordTail(x)); }

instance (AddRecordInto l r x) => Add l r x where
  l + r = do {x = newPrim(); addRecordInto(l, r, x); return x}

$ hi -s rec-arith.hob
> {x=42, y=3.14159} + {x=2L, y=22}
{x=44, y=25.1416}

Granted, the action behind the scenes is not the prettiest, but it does get a very nice result (being able to add together records whose shapes line up and whose types can be added pointwise across the record). And it composes with all of the other ways you have of overloading arithmetic (so e.g. lifting record addition into arrays, or addition of records with arrays for fields).

If you read this closely you'll find that the most important bit is in calculating the type where we have this special "backward mode" use of the "record deconstruction" constraint (written {lbl:c*xt}=x there, instead of as usual x={lbl:c*xt}). I added that option exactly for this situation (though I have tried to only add new features as a last resort).

Hope this helps.

Don't worry, I'm not sitting here writing instances for each prim type. In this case, I had a timeseries type where I wanted to replicate the logic on streams.
You can also do the [[1,2],[3,4]] + [5,6] logic using the op2 approach as well, I'm pretty sure it's no less useful than the normal way, and results in a bit less code overall.

As for records, I was actually trying to make a copy of a new record with a single field modified. Obviously I'm not about to write out each individual other name out by hand, there's six of them! So I was having a look at doing this in a generic way. I can probably do this in a very similar way to what you set out, above, and in the other comment on fieldvalue. A lens library for hobbes would be nice.

A lens library is a good idea, we've got algebraic data types so it does little harm to introduce calculus (by way of infinitesimal types). We already have structural types by default, so some of the awkwardness of Haskell lenses (IMHO) can be avoided.

I agree with you that it's probably the right idea to differentiate (no pun intended) a record type and a focus into it. We just need to do it in a way that's minimal for space/time (if there's an unnecessary cost, I've found that people will hack around it).

Generic deconstruction of sum types has a similar problem, but the try/fail approach with prisms (IIRC) is a disaster. For example, when you've got a billion variants and you want to countBy their enum truncation (just counting by unique constructors), the definition of ordering and equality this way looks like "try the first constructor, fail, try the second constructor, fail, etc" and it takes forever compared to "switch on actual constructor, dispatch to associated handler". This did actually come up at one point and using the more efficient approach to generic variant deconstruction made the difference between that query being impractical and it being almost instantaneous.

FWIW, the path syntax is a pretty useful shorthand for consecutive field selection, for example:

$ hi -s
> map( .x.y.z, [{a=1,b=2,x={c=3,d=4,y={e=5,f=6,z=i,g=7},h=8,i=9},j=10,k=11}|i<-[0..10]] )
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

For the kind of updating it sounds like you have in mind, I made a test script based on this same two-stage approach (calculating the result type, then calculating the merge):

$ cat rmerge.hob

class UpdateField r f x o | r f x -> o
instance UpdateField () f x ()
instance (r={f:_*rt}, {f:x*rt}=o) => UpdateField r f x o
instance (r={nf:a*rt}, UpdateField rt f x nrt, {nf:a*nrt}=o) => UpdateField r f x o

class RMergeTy l r x | l r -> x
instance RMergeTy l () l
instance (r={f:t*rt}, UpdateField l f t nl, RMergeTy nl rt x) => RMergeTy l r x

class RMergeInto l r x where
  rmergeInto :: (l,r,x) -> ()
instance RMergeInto _ _ () where
  rmergeInto _ _ _ = ()
instance (x={f:h*xt}, r/f::h, RMergeInto l r xt) => RMergeInto l r x where
  rmergeInto l r x = do { recordHeadValue(x) <- (fieldValue::(r/f::h)=>_)(r); rmergeInto(l,r,recordTail(x)); }
instance (x={f:h*xt}, l/f::h, RMergeInto l r xt) => RMergeInto l r x where
  rmergeInto l r x = do { recordHeadValue(x) <- (fieldValue::(l/f::h)=>_)(l); rmergeInto(l,r,recordTail(x)); }

rmerge :: (RMergeTy l r x) => (l, r) -> x
rmerge l r = do { x=newPrim(); rmergeInto(l, r, x); return x }

So that I get an interaction that I think is similar to what it sounds like you're looking for:

$ hi -s rmerge.hob
> rmerge({x=1,y=2,z=3}, {z=0xdeadbeef,x=[0..10]})
{x=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], y=2, z=0xdeadbeef}

To get that script to work, I had to fix a small bug (in this PR: #345). Just FYI in case you were going down that path, you might want this fix.

Very nice!
It would be cool if we could make modify({a=1}, .a, \x.x+1) == {a=2} using rmerge, but that would need a function f so that f(.a, v) = {a=v}, not sure this is possible without some c++ level additions.

Actually, I should be able to modify the rmerge logic to write write that in the form modify({a=1}, {a=\x.x+1}) == {a=2})

Yep, we can add

class RModifyTy l r x | l r -> x
instance RModifyTy l () l
instance (r={f:(t -> a) *rt}, UpdateField l f a nl, RModifyTy nl rt x) => RModifyTy l r x

class RModifyInto l r x where
  rmodifyInto :: (l,r,x) -> ()
instance RModifyInto _ _ () where
  rmodifyInto _ _ _ = ()

instance (x={f:h*xt}, r/f::(a->h), RModifyInto l r xt) => RModifyInto l r x where
  rmodifyInto l r x = do { recordHeadValue(x) <- ((fieldValue::(r/f::(a->h))=>_)(r))(recordHeadValue(l)); rmodifyInto(l,r,recordTail(x)); }

instance (x={f:h*xt}, l/f::h, RModifyInto l r xt) => RModifyInto l r x where
  rmodifyInto = rmergeInto

rmodify :: (RModifyTy l r x) => (l, r) -> x
rmodify l r = do { x=newPrim(); rmodifyInto(l, r, x); return x }

and then rmodify({a=1}, {a=(\x.x+1) :: int -> int}) == {a=2}. Shame we need the explicit type annotation, but this is cool. And deeper path based modification won't work easily like this.

Doing that without type annotations is an interesting problem. It's come up before in other ways. The problem is that we've got to communicate to the 'Add' constraint that the first argument to 'RModifyTy' is fixing its second argument. This is basically function overloading (with a record of functions acting like a function).

Solving this would also probably solve the currying problem, where we want to calculate closure types from partial applications with a type class like this.

Yes, having looked through the code a bit, the currying problem does seem a bit of a poser.
In the meantime I continue my doomed quest to write modify({a=1}, .a, \x.x+1). We can sort of retrieve a record from the accessor:

data M a b = ()

instance (o/f::t, {f:t*()}=r) => SLookup (M o t) f r where
  slookup o = unsafeCast(newPrim() :: {f: t})

(unsafeCast(()) :: ((M {a:int} int))).a
{a=0}

Putting these previous bits together, I guess you could make something kind of like what you describe this way (using a phantom type to carry the intended modification field):

$ cat rupd.hob

class UpdateField r f x o | r f x -> o
instance UpdateField () f x ()
instance (r={f:_*rt}, {f:x*rt}=o) => UpdateField r f x o
instance (r={nf:a*rt}, UpdateField rt f x nrt, {nf:a*nrt}=o) => UpdateField r f x o

class RWithUpdate l f x o where
  rwithUpdate :: (l,x,o) -> ()
instance RWithUpdate _ _ _ () where
  rwithUpdate _ _ _ = ()
instance (o={f:x*ot}) => RWithUpdate l f x o where
  rwithUpdate l x o = do { recordHeadValue(o) <- x; (rwithUpdate::(RWithUpdate l f x _)=>_)(l, x, recordTail(o)); }
instance (l/nf::h, o={nf:h*ot}, RWithUpdate l f x ot) => RWithUpdate l f x o where
  rwithUpdate l x o = do { recordHeadValue(o) <- (fieldValue::(l/nf::h)=>_)(l); (rwithUpdate::(RWithUpdate l f x _)=>_)(l, x, recordTail(o)); }



data Update r = r
update :: r -> (Update r)
update = unsafeCast
updateR :: ((Update r)) -> r
updateR = unsafeCast

data UpdateAt r f = r
updateAt :: r -> (UpdateAt r f)
updateAt = unsafeCast
updateAtR :: ((UpdateAt r f)) -> r
updateAtR = unsafeCast

instance (r/f::_) => SLookup (Update r) f (UpdateAt r f) where
  slookup ur = updateAt(updateR(ur))



class (r/f::a, UpdateField r f b out) => UApply r f a b out where
  uapply :: ((UpdateAt r f), a -> b) -> out
instance UApply r f a b out where
  uapply ur f = do { o=newPrim()::out; (rwithUpdate::(RWithUpdate r f b out)=>_)(updateAtR(ur), f((fieldValue::(r/f::_)=>_)(updateAtR(ur))), o); return o }

Then with that set of definitions in place, you can have an interaction like this:

$ ./hi -s ./rupd.hob
> test = {x=1, y=2, z=3}
> uapply(update(test).y, \x.x+40)
{x=1, y=42, z=3}

Hope this helps! :)

Also if you don't like that syntax for updating and you prefer your u(record, field, func) syntax, if you've loaded that rupd.hob script above, then you can get this pretty easily as:

> u = \r f uf.uapply(f(update(r)), uf)

> u({x=1,y=2,z=3}, .y, \n.n+40)
{x=1, y=42, z=3}

HTH!

The wife thanks you for removing my excuse to play with this this weekend...

But thanks for this. I have no evidence for this at the moment, but it seems using the same object to get and set values would be useful (the fact we no longer needed explicit type annotations on the update function is an indication this is more the right way to do this) . If I get some time next week, I'll look at getting paths to work in this. This already starts to give a lot of the utility of a proper lens library in operating with records. If only we had a WW based parser to allow us to define custom operators.

[There's something very pleasing about such type-level hackery where over half the lines of actual code involve unsafeCast or newPrim]

Oops, sorry, you'd said that your goal was doomed so I took it as a challenge. If you're still looking for weekend plans, extending it to deeper paths is a good idea, or applying the idea to variants and recursive types could be useful. Maybe better to spend time with your wife though, if she's that annoyed with you. :)

I agree with you that it's useful to be able to use the same path to read and update a field. I never did get really into lenses in Haskell, maybe I should look at that more closely. I was very interested in Conor McBride's derivative types though (a related idea).

It's an interesting idea to turn the parser generator back on the compiler itself. I wanted to do that 6 years ago or so, but had a lot of momentum against that -- plus several people are afraid of the idea (they already have a hard time with the fixed syntax of hobbes). If the term and type grammars are reflected as regular data structures, then a parser could just be a function producing that type. Maybe if we have a use-case that demands it.

I'd like to hide things like unsafeCast and newPrim, but at least where they're used here it's a good sign that we'll see minimal runtime overhead.

I seem to have run into an internal bug on this.

If we take the easy case of non-polymorphic path based modification (if we allow the modifier to change the type I don't yet see how to construct the final type, which I seem to need), and modify your uapply function above to take closures instead, we get overall

class UpdateField r f x o | r f x -> o
instance UpdateField () f x ()
instance (r={f:_*rt}, {f:x*rt}=o) => UpdateField r f x o
instance (r={nf:a*rt}, UpdateField rt f x nrt, {nf:a*nrt}=o) => UpdateField r f x o

class RWithUpdate l f x o where
  rwithUpdate :: (l,x,o) -> ()
instance RWithUpdate _ _ _ () where
  rwithUpdate _ _ _ = ()
instance (o={f:x*ot}) => RWithUpdate l f x o where
  rwithUpdate l x o = do { recordHeadValue(o) <- x; (rwithUpdate::(RWithUpdate l f x _)=>_)(l, x, recordTail(o)); }
instance (l/nf::h, o={nf:h*ot}, RWithUpdate l f x ot) => RWithUpdate l f x o where
  rwithUpdate l x o = do { recordHeadValue(o) <- apply((fieldValue::(l/nf::h)=>_), l); (rwithUpdate::(RWithUpdate l f x _)=>_)(l, x, recordTail(o)); }
data Update r = r
update :: r -> (Update r)
update = unsafeCast
updateR :: ((Update r)) -> r
updateR = unsafeCast

data UpdateAt r f = r
updateAt :: r -> (UpdateAt r f)
updateAt = unsafeCast
updateAtR :: ((UpdateAt r f)) -> r
updateAtR = unsafeCast

instance (r/f::_) => SLookup (Update r) f (UpdateAt r f) where
  slookup ur = updateAt(updateR(ur))


class (r/f::a, UpdateField r f b out) => UApply r f a b out where
  uapply :: ((UpdateAt r f), exists E.(((E, a) -> b) * E)) -> out
instance UApply r f a b out where
  uapply ur f = do { o=newPrim()::out; (rwithUpdate::(RWithUpdate r f b out)=>_)(updateAtR(ur), apply(f, apply((fieldValue::(r/f::_)=>_), updateAtR(ur))), o); return o }



m = \f uf.\r.uapply(f(update(r)), uf)

data M fr r k pr = r

instance (r/f::a) => SLookup (M fr r k p) f (M fr a f (M fr r k p)) where
  slookup o = unsafeCast((fieldValue :: (r/f::a)=>_)(unsafeCast(o) :: r))


class PathBuild a b c | a b -> c where 
  build :: (a, b) -> c

instance (Function f c r r) => PathBuild (M r r () ()) f (exists E.(((E, r)->r) * E)) where
  build w f = f

instance (Function g c t t) => PathBuild (M fr t k (M fr r nk np)) g (exists E.(((E, fr) ->fr) * E)) where
  build w uf = build(unsafeCast(w) :: ((M fr r nk np)), m(fieldValue :: (_/k::_) => _, uf) :: exists E.(((E, r)->r) * E))

pathBuilderRoot :: r -> ((M r r () ()))
pathBuilderRoot = unsafeCast

appM :: ((a -> b), (M _ a _ _)) -> exists E.(((E,a) -> b) * E)
appM v w = \_.v(unsafeCast(w))

pm = \r f uf.let v = f(pathBuilderRoot(r)) in apply(build(v, appM(uf, v)), r)

(yes, the naming sucks). Now if we try pm({a=1,b={c=5}}, .a, \x.x+1) we get {a=2, b={c=5}}, and if we try pm({a=1, b={c=5}, .b.c, \x.x+1) we do indeed get {a=1, b={c=6}}. But if we run both, one after each other, in either order, the second call will fail with

LLVM ERROR: Cannot select: 0x55fb2a4de2e0: i64 = bitcast 0x55fb2a4debd0
  0x55fb2a4debd0: i32,ch = CopyFromReg 0x55fb28a5b800, Register:i32 %0
    0x55fb2a4b5700: i32 = Register %0
In function: .rfn.t23088

If we instead have data M a b c d = () and write build(v, toClosure(uf)) then we don't get this issue, but now have to give an explicit type annotation to \x.x+1. However I don't see that I've done anything illegal to the contents of M as above. Actually, perhaps I have in build, when constructing the chain of modifications

Altering M so we can go up and down
data M fr r k pr = {l:r, r:pr}

pathBuilderRoot :: r -> ((M r r () ()))
pathBuilderRoot v = unsafeCast({l=v, r=()})

appM :: ((a -> a), (M _ a _ n)) -> exists E.(((E,a) -> a) * E)
appM v w = \_.v((unsafeCast(w) :: {l:a, r:n}).l)

Then
> :t pathBuilderRoot({a=1})
(M { a:int } { a:int } () ())
> appM(\x.x+1, pathBuilderRoot({a=1}).a)
Struct member 'l' does not exist in {}
> :t pathBuilderRoot({a=1})
Struct member 'l' does not exist in {}

What's going on here?

Hmm, I'll have to look at that closely to see where the LLVM error gets through. It's definitely a bug if that happens.

Taking a step back, the uapply definition I gave before is able to work without user type annotations (and works with polymorphic update, changing a field type) because it gets all of the necessary bits together at once for unification. That's the key step, and everything follows from that:

... (r/f::a, UpdateField r f b out) => ...
  uapply :: ((UpdateAt r f), a -> b) -> out

So we're already very close to something that works for paths deeper than 1 step and for other function types. This is kind of like the "zero" case (do a 1 step update in a record), and we just need a "successor" case to do an update in a field and then patch it back into the current level (which can then be applied recursively).

An easy step to make this work is to allow this "UpdateAt" to nest, so it becomes a "suspension telescope" down to the field to modify, rooted at the total record type. For example, in the record {a={b=1}} we'll want .a.b to pull back to the type UpdateAt (UpdateAt {a:{b:int}} "a") "b".

First we'll want to generalize field lookup through these suspensions:

class UProjType u t | u -> t where
  uproj :: u -> t
instance (r={_*_}, r/f::t) => UProjType (UpdateAt r f) t where
  uproj u = (fieldValue::(r/f::t)=>_)(updateAtR(u))
instance (UProjType s r, r/z::t) => UProjType (UpdateAt s z) t where
  uproj u = (fieldValue::(r/z::t)=>_)(uproj(updateAtR(u)))

Then we can make these suspensions to any depth by adding one case to the SLookup constructor we had before:

instance (r/f::_) => SLookup (Update r) f (UpdateAt r f) where
  slookup ur = updateAt(updateR(ur))
instance (UProjType (UpdateAt r f) rt, rt/s::_) => SLookup (UpdateAt r f) s (UpdateAt (UpdateAt r f) s) where
  slookup ua = updateAt(ua)

And if we add a replacement within suspensions:

class UReplace u t out | u t -> out
instance (r={_*_}, UpdateField r f t out) => UReplace (UpdateAt r f) t out
instance (UProjType (UpdateAt u s) ur, UpdateField ur z t nur, UReplace (UpdateAt u s) nur out) => UReplace (UpdateAt (UpdateAt u s) z) t out

Then we can add a case to uapply to tie it together:

class (UProjType u a, Function tfn tfnc a b, UReplace u b out) => UApply u tfn tfnc a b out where
  uapply :: (u, tfn) -> out
instance (r={_*_}) => UApply (UpdateAt r f) tfn _ a b out where
  uapply u tfn = do { o=newPrim()::out; (rwithUpdate::(RWithUpdate r f b out)=>_)(updateAtR(u), apply(tfn, (fieldValue::(r/f::_)=>_)(updateAtR(u))), o); return o }
instance (UProjType (UpdateAt u s) ur, UpdateField ur z b nur) => UApply (UpdateAt (UpdateAt u s) z) tfn _ a b out where
  uapply u f = do { nur = uapply((fieldValue::(_/z::_)=>_)(update(uproj(updateAtR(u)))), f); return uapply(updateAtR(u), \_.nur) }

So with this final script (adding your m definition at the end):

$ cat rpath-upd.hob

class UpdateField r f x o | r f x -> o
instance UpdateField () f x ()
instance (r={f:_*rt}, {f:x*rt}=o) => UpdateField r f x o
instance (r={nf:a*rt}, UpdateField rt f x nrt, {nf:a*nrt}=o) => UpdateField r f x o

class RWithUpdate l f x o where
  rwithUpdate :: (l,x,o) -> ()
instance RWithUpdate _ _ _ () where
  rwithUpdate _ _ _ = ()
instance (o={f:x*ot}) => RWithUpdate l f x o where
  rwithUpdate l x o = do { recordHeadValue(o) <- x; (rwithUpdate::(RWithUpdate l f x _)=>_)(l, x, recordTail(o)); }
instance (l/nf::h, o={nf:h*ot}, RWithUpdate l f x ot) => RWithUpdate l f x o where
  rwithUpdate l x o = do { recordHeadValue(o) <- (fieldValue::(l/nf::h)=>_)(l); (rwithUpdate::(RWithUpdate l f x _)=>_)(l, x, recordTail(o)); }



data Update r = r
update :: r -> (Update r)
update = unsafeCast
updateR :: ((Update r)) -> r
updateR = unsafeCast

data UpdateAt r f = r
updateAt :: r -> (UpdateAt r f)
updateAt = unsafeCast
updateAtR :: ((UpdateAt r f)) -> r
updateAtR = unsafeCast

class UProjType u t | u -> t where
  uproj :: u -> t
instance (r={_*_}, r/f::t) => UProjType (UpdateAt r f) t where
  uproj u = (fieldValue::(r/f::t)=>_)(updateAtR(u))
instance (UProjType s r, r/z::t) => UProjType (UpdateAt s z) t where
  uproj u = (fieldValue::(r/z::t)=>_)(uproj(updateAtR(u)))

instance (r/f::_) => SLookup (Update r) f (UpdateAt r f) where
  slookup ur = updateAt(updateR(ur))
instance (UProjType (UpdateAt r f) rt, rt/s::_) => SLookup (UpdateAt r f) s (UpdateAt (UpdateAt r f) s) where
  slookup ua = updateAt(ua)


class UReplace u t out | u t -> out
instance (r={_*_}, UpdateField r f t out) => UReplace (UpdateAt r f) t out
instance (UProjType (UpdateAt u s) ur, UpdateField ur z t nur, UReplace (UpdateAt u s) nur out) => UReplace (UpdateAt (UpdateAt u s) z) t out


class (UProjType u a, Function tfn tfnc a b, UReplace u b out) => UApply u tfn tfnc a b out where
  uapply :: (u, tfn) -> out
instance (r={_*_}) => UApply (UpdateAt r f) tfn _ a b out where
  uapply u tfn = do { o=newPrim()::out; (rwithUpdate::(RWithUpdate r f b out)=>_)(updateAtR(u), apply(tfn, (fieldValue::(r/f::_)=>_)(updateAtR(u))), o); return o }
instance (UProjType (UpdateAt u s) ur, UpdateField ur z b nur) => UApply (UpdateAt (UpdateAt u s) z) tfn _ a b out where
  uapply u f = do { nur = uapply((fieldValue::(_/z::_)=>_)(update(uproj(updateAtR(u)))), f); return uapply(updateAtR(u), \_.nur) }

m = \f uf.\r.uapply(f(update(r)), uf)

Then I can have this interaction (which does change the type of the update field, for what it's worth):

$ hi -s rpath-upd.hob
> apply(m(.a.b.c.d.e.f, \n.[1..n]), {x=1, y=2, a={b={c={d={e={f=10}}}}}, z=3, w=4})
{x=1, y=2, a={b={c={d={e={f=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}}}}}, z=3, w=4}

I hope that this helps!

I will have to come back to this to see what happened with that backend LLVM error ...

That does help a lot, and thanks for the step-by-step working. I'll wrap my head around this, then maybe I'll be able to write something myself without a new github issue :)

I love your github issues, the discussion has brought up interesting ideas and improvements.