moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit

Home Page: https://github.com/moby/moby/issues/34227


Add `Exec` to the gateway API.

tonistiigi opened this issue · comments

gw.Exec should invoke a command, similarly to what frontend.Exec does at the moment. The main difference from Solve is that Exec is never cached. It should be able to take the same arguments that llb.Exec takes (including the special mounts). More complex cases, like ReadFile, could be implemented with Exec.

The mounts of Exec should take References from completed sources as inputs.
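For illustration only, a frontend call might look roughly like this; none of these names exist yet (Exec, ExecRequest and ExecMount are hypothetical), the point is just that mounts take References from completed solves and the run is never cached:

// Hypothetical sketch: solve the inputs first, then exec against them.
ref, err := c.Solve(ctx, client.SolveRequest{Definition: def.ToPB()})
if err != nil {
	return err
}
// Never cached; the rootfs mount comes from the completed solve above.
err = c.Exec(ctx, client.ExecRequest{
	Args:   []string{"/bin/sh", "-c", "cat /etc/os-release"},
	Mounts: []client.ExecMount{{Dest: "/", Ref: ref}},
	Stdout: os.Stdout,
})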

To clarify the intent here, if you also added a feature to connect the Exec'd container's stdio to a pty, would this issue enable an "interactive" container? I.e. a container specified by LLB that spawns a usable shell/editor/other TTY-based program?

Yes, I think there is a need to make stdio accessible through Exec. Not 100% sure about the pty, as it is somewhat of a trap.

For stdio in regular LLB the current proposal is to use #1337, llb.Expose(llb.Image(""), llb.Pipe(1, 0))

@hinshun

Thinking about it more, I think we definitely need the ability for the user to interact with a build result like a regular container. This is needed for debug scenarios. Maybe we don't necessarily need the frontend to have this access? And if the frontend does use Exec, it should show up as a line item in the build output (probably with stdio logs).

We previously started the conversation on Slack, but I'm paraphrasing it here to save the ideas somewhere. cc @coryb

  • At a high-level we want clients to be able to Exec into a container, outside of LLB and not cached.
  • Using this new primitive, you can then build a debugging experience on top, by Execing the result of a failed solve.
  • Let's begin by brainstorming about the UX around Exec; consider the following scenarios and the possible usage.

Scenario 1: Exec from regular LLB

execState := llb.Image("busybox").Run(llb.Shlex("/bin/sh"))

// Approach #1a
p := execState.ToExecProcess()
p.Run(ctx, io, tty)

// Approach #1b
execState.Run(ctx, io, tty)

Scenario 2: Exec into failed solve result

var st llb.State
def, _ := st.Marshal(ctx)

res, err := c.Solve(ctx, client.SolveRequest{Definition: def.ToPB()})
if err != nil {
	var er *ExecError
	if errors.As(err, &er) {
		// Approach #2a
		c.Exec(ctx, er.Meta(), er.ErrorStateMounts(), io, tty)

		// Approach #2b
		p := ToExecProcess(er)
		p.Run(ctx, io, tty)
	}
}

Note that Approach #2b still needs Approach #2a to be implemented; #2b is an object-oriented UX slapped on top of #2a.

How can we make this happen? (Ongoing list)

  • Add Exec to the LLBBridge in gateway.proto:
    service LLBBridge {
  • Modify Executor and ExecOp to not release mounts after error, and return them inside ExecError.
    • If mounts are unhandled, they are released with the gateway api release
commented

I am hoping we can agree on a proto definition for this feature, once we have general agreement on this I should be able to work out a rough implementation.

As I see it, we will need two gRPC components:

  1. the Exec rpc call to the LLBBridge service
  2. IO bridge streaming service to facilitate stdio and possibly resize events

1 - Exec RPC

I think something like this will work:

service LLBBridge {
	rpc Exec(ExecRequest) returns (ExecResponse);
}

message ExecRequest {
	pb.ExecOp op = 1;
	repeated uint32 openFds = 2;
}

message ExecResponse {
    // What goes here?  Anything?  Task Exit Status?
}

I think the pb.ExecOp already has all the information we would need to implement the exec operation via the gateway. The openFds would be a list of fds to open/close for the task. Open fds will be connected to the IO bridge streaming service. We will also need a tty flag, and for that I suggest we modify pb.Meta to include that flag as:

message Meta {
	repeated string args = 1;
	repeated string env = 2;
	string cwd = 3;
	string user = 4;
	ProxyEnv proxy_env = 5;
	repeated HostIP extraHosts = 6;
	bool tty = 7;
}

since that matches the structure already available in the executor.Meta.

2 - IO bridge service

We need some way to stream input/output to the container task. I suggest creating a grpc service on the client side that can be accessed from the server via the session manager, similar to the Attachable SSHForwarder implementation. This is the proto I was thinking of:

service IOBridge {
	rpc IO(stream FdMessage) returns (stream IOInput);
}

message IOInput {
	oneof input {
		FdMessage file = 1;
		ResizeMessage resize = 2;
	}
}

message FdMessage{
	uint32 fd = 1; // what fd the data was from
	bool eof = 2;  // true if eof was reached
	bytes data = 3;
}

message ResizeMessage{
	uint32 rows = 1;
	uint32 cols = 2;
	uint32 xpixel = 3;
	uint32 ypixel = 4;
}
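For context, the client side of this would be wired up much like the existing Attachable SSH forwarder: register the service on the session's gRPC server and let the daemon dial back over the session. A minimal sketch, assuming RegisterIOBridgeServer and IOBridge_IOServer are the types generated from the proto above (imports elided):

// ioBridge exposes the client's stdio over the build session. It satisfies
// buildkit's session.Attachable (Register(*grpc.Server)); the IOBridge types
// are assumed to be generated from the proposed proto.
type ioBridge struct {
	stdin          io.Reader
	stdout, stderr io.Writer
}

func (b *ioBridge) Register(server *grpc.Server) {
	RegisterIOBridgeServer(server, b)
}

// IO receives FdMessages (the task's output) and would send IOInputs (stdin
// data and resize events). Only the receive half is sketched here; stdin would
// be pumped through stream.Send from a separate goroutine.
func (b *ioBridge) IO(stream IOBridge_IOServer) error {
	for {
		msg, err := stream.Recv()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		switch msg.Fd {
		case 1:
			b.stdout.Write(msg.Data)
		case 2:
			b.stderr.Write(msg.Data)
		}
	}
}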

Thoughts? Adjustments? Counter proposals?

how about

service LLBBridge {
  rpc NewContainer(NewContainerRequest) returns (NewContainerResponse);
  rpc ReleaseContainer(ReleaseContainerRequest) returns (ReleaseContainerResponse);
  rpc ExecProcess(stream ExecMessage) returns (stream ExecMessage);
}

message NewContainerRequest {
	string Ref = 1;
	// For mount input values we can use random identifiers passed with ref
	repeated pb.Mount mounts = 2;
	pb.NetMode network = 3;
	pb.SecurityMode security = 4;
}

message ReleaseContainerRequest {
	string Ref = 1;
}


message ExecMessage {
	oneof input {
		InitMessage init = 1;
		FdMessage file = 2;
		ResizeMessage resize = 3;
		StartedMessage started = 4;
		ExitMessage exit = 5;
	}
}

message InitMessage {
	pb.Meta Meta = 1;
	repeated uint32 fds = 2;
	bool tty = 3;
	// ?? way to control if this is PID1? probably not needed
}

message ExitMessage {
	uint32 Code = 1;
}

message FdMessage{
	uint32 fd = 1; // what fd the data was from
	bool eof = 2;  // true if eof was reached
	bytes data = 3;
}

message ResizeMessage{
	uint32 rows = 1;
	uint32 cols = 2;
	uint32 xpixel = 3;
	uint32 ypixel = 4;
}

I added "container" concept atm to support multiple exec processes. Not sure if needed initially but probably better to be safe for future ideas. The complex part of this is that it does not allow reusing the current executor directly, eg. runc needs to be invoked with create/start/exec calls instead of a single run. Or we mark one process as pid1 and then I think we can use run+exec.

Sending pb.Meta inside the giant oneof object is objectively ugly, but this is a gRPC limitation: we can't pass an initial message on streaming endpoints (other than with unsafe context metadata).

These are the changes to the Executor Go API proposed in Slack:

type ProcessInfo struct {
	Meta           Meta
	Stdin          io.ReadCloser
	Stdout, Stderr io.WriteCloser
	// tty, resize
}

type Executor interface {
	Run(ctx context.Context, ref string, rootfs cache.Mountable, mounts []Mount, process *ProcessInfo, allowExec bool) error
	Exec(ctx context.Context, ref string, process ProcessInfo) error
}

Some investigation is needed into whether both runc and containerd support exec without setting the pid1 process. If not, then either the process info passed to Run is not a pointer, or the first exec becomes pid1.

Outline (a rough code sketch of Approach 1 follows):

# Approach 1
1. (LLBBridge).NewContainer
2. (Executor).Run(process=nil, allowExec=true)
3. (LLBBridge).ExecProcess()
4. Send InitMessage(meta, fd, tty)
	4.a. Wait for StartedMessage
5. (Executor).Exec(process{meta, fd, tty})
	5.a. Send StartedMessage
6. Send/recv FdMessage/ResizeMessage
7. (LLBBridge).ReleaseContainer
	7.a. Wait for ExitMessage
8. (Executor) - cancels context
	8.a. Sends ExitMessage
# Approach 2
1. (LLBBridge).NewContainer
2. (LLBBridge).ExecProcess()
3. Send InitMessage()
	3.a. Wait for StartedMessage
4. If first InitMessage: (Executor).Run()
   Else: (Executor).Exec()
	4.a. Send StartedMessage
5. Send/recv FdMessage/ResizeMessage
6. (LLBBridge).ReleaseContainer
	6.a. Wait for ExitMessage
7. (Executor) - cancels context
	7.a. Sends ExitMessage
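To make the ordering concrete, here's a rough client-side sketch of Approach 1, assuming LLBBridgeClient and the ExecMessage/InitMessage/etc. types are generated from the proposed proto (all names illustrative, imports elided):

func execShell(ctx context.Context, bridge LLBBridgeClient, meta *pb.Meta) error {
	stream, err := bridge.ExecProcess(ctx)
	if err != nil {
		return err
	}

	// The first message on the stream must be the InitMessage, since gRPC
	// does not let us attach an initial request to a streaming endpoint.
	init := &ExecMessage{Input: &ExecMessage_Init{Init: &InitMessage{
		Meta: meta,
		Fds:  []uint32{0, 1, 2},
		Tty:  true,
	}}}
	if err := stream.Send(init); err != nil {
		return err
	}

	// Wait for StartedMessage before handing the terminal to the process,
	// then pump Fd/Resize messages until an ExitMessage arrives.
	for {
		msg, err := stream.Recv()
		if err != nil {
			return err
		}
		switch m := msg.GetInput().(type) {
		case *ExecMessage_Started:
			// safe to switch stdio over from the progress UI here
		case *ExecMessage_File:
			os.Stdout.Write(m.File.Data)
		case *ExecMessage_Exit:
			if m.Exit.Code != 0 {
				return fmt.Errorf("process exited with code %d", m.Exit.Code)
			}
			return nil
		}
	}
}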

I think the executor changes can be done in a separate PR before the changes in proto API with some unit tests to ease the review.

commented

Runc vs Containerd questions

These seem to be the relevant questions that might influence the implementation:
a) Can we create a container without defining pid1?
b) Can we exec into a container without pid1 running?
c) Can we attach IO to pid1 after it has started?

Runc

a) No, config.json must have args, where the first arg is validated to be in PATH when the container is created (not started).
b) Yes, runc exec will automatically create a pid1 of runc init if nothing else is running.
c) No, you cannot create then start a container and then attach IO to pid1. Further, if terminal=true in config.json you cannot call runc create, only runc run.

Containerd

a) Yes, you do not have to explicitly define a pid1 (it will assume defaults from the image).
b) No, the container must be running before exec; pid1 is required.
c) Yes, IO for pid1 can be attached to after it has started.

Proposal

So it seems for a common implementation we will have to assume "No" to all these questions.

After some discussion with @hinshun this seems like a reasonable approach (a rough sketch follows the list):

=> NewContainer(ref, rootfs, mounts, network, security) message

  • server records this data (in memory, bolt?), containerd or runc is not called at this point

=> ExecProcess.InitMessage(ref, meta, IO) message

  • containerd:
    • If first:
      • (containerd.Store).Create(container = meta + recorded data)
      • (containerd.Container).NewTask(IO)
    • If !first:
      • (containerd.Task).Exec(meta, IO)
  • runc:
    • If first:
      • (runc).Run(meta + recorded data, IO)
    • If !first:
      • (runc).Exec(meta, IO)
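A rough sketch of what that dispatch could look like on the server side, reusing the Executor interface proposed earlier; the ctr type and its fields are illustrative only:

// ctr holds what NewContainer recorded (rootfs, mounts, network, security);
// neither runc nor containerd has been invoked at this point.
type ctr struct {
	mu      sync.Mutex
	started bool

	id     string
	rootfs cache.Mountable
	mounts []executor.Mount
	exec   executor.Executor
}

// StartProcess handles an ExecProcess InitMessage. The first process becomes
// pid1 (runc run / containerd Create+NewTask); later ones become execs.
func (c *ctr) StartProcess(ctx context.Context, proc executor.ProcessInfo) error {
	c.mu.Lock()
	first := !c.started
	c.started = true
	c.mu.Unlock()

	if first {
		return c.exec.Run(ctx, c.id, c.rootfs, c.mounts, &proc, true)
	}
	return c.exec.Exec(ctx, c.id, proc)
}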

This is a summary of some discussion on slack.

How do we specify the inputs for pb.Mount?

When a solve is completed, a ref is returned.

ref, err := c.Solve(ctx, client.SolveRequest{...})

Under the hood, the proto message has a unique identifier.

message Ref {
	string id = 1;
	pb.Definition def = 2;
}

We can extend the proto for mount, so that we can reference this string id instead of an input index:

message Mount {
	// ...
	string inputRef = 6;
}

What would client usage look like? (pseudo-code)

In this scenario, we have some LLB that runs make on busybox.

rootfs := llb.Image("busybox")
src := llb.Git("git://mygit")

st := rootfs.Run(
	llb.Shlex("make"),
	llb.Dir("/in"),
	llb.Mount("/in", src),
).Root()

However, we want to run it as an exec instead of a solve.

rootfsRef, _ := c.Solve(rootfs.Marshal().ToPB())

srcRef, _ := c.Solve(src.Marshal().ToPB())

execRef := uuid.New()

c.NewContainer(client.NewContainerRequest{
	Ref: execRef,
	Mounts: []mount{
		mount{dest: "/", ref: rootfsRef},
		mount{dest: "/in", ref: srcRef},
	},
})

To make this easier, we may be able to wrap this into a method for llb.ExecState:

rootfs.Run(
	llb.Shlex("make"),
	llb.Dir("/in"),
	llb.Mount("/in", src),
).ToNewContainer()

The ToNewContainer function will be blocking as the solves are being run under the hood.

Alternatively, we could carry the LLB definition in NewContainerRequest so that the solve does not need to be client side, but we'll need the (pb.Mount).inputRef anyway for the error state scenario.

ProgressUI

It would be nice to have the progress UI indicate the exec being called. There are two ways we thought of:

  1. Determine whether it's long-running by scanning completion times.
  2. Adding a request based status indicator that's sent through statusCh.

This is handled at a somewhat higher level than the client code, but the user's display should switch between the progress UI and directing stdio to the exec process. Perhaps something like this:

[+] Building 6.6s (5/5) IDLE
 => CACHED foobar           0.0s
 => EXEC /bin/sh
/ # 
/ # (^P^Q)
...reloading
[+] Building 1.6s (5/5) IDLE
 => CACHED barbaz           0.0s
 => mkfile /msg22           0.0s
 => EXEC /bin/sh
/ # 
commented

For the client side, afaict it seems like we need to modify the Client interface in frontend/gateway/client.

I was thinking something like this:

type Client interface {
  ...
  NewContainer(ctx context.Context, containerID string, rootFs pb.Mount, opts ...ContainerOpt) error
  NewProcess(ctx context.Context, containerID string, proc executor.ProcessInfo) (status uint32, err error)
  ReleaseContainer(ctx context.Context, containerID string) error
}

type ContainerOpt func(*pb.NewContainerRequest)

The ContainerOpt would be used for adding extra non-rootfs mounts, setting securityMode, netMode:

func WithMount(pb.Mount) ContainerOpt
func WithSecurityMode(pb.SecurityMode) ContainerOpt
func WithNetMode(pb.NetMode) ContainerOpt
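For illustration, these could be plain functional options over the request; a sketch, assuming the generated NewContainerRequest has Mounts/Network/Security fields as in the proto proposed earlier (pointers used since the generated repeated field would be []*pb.Mount):

// WithMounts appends extra, non-rootfs mounts to the request.
func WithMounts(mounts ...*pb.Mount) ContainerOpt {
	return func(req *pb.NewContainerRequest) {
		req.Mounts = append(req.Mounts, mounts...)
	}
}

// WithNetMode sets the network mode for the container.
func WithNetMode(net pb.NetMode) ContainerOpt {
	return func(req *pb.NewContainerRequest) {
		req.Network = net
	}
}

// WithSecurityMode sets the security mode for the container.
func WithSecurityMode(sec pb.SecurityMode) ContainerOpt {
	return func(req *pb.NewContainerRequest) {
		req.Security = sec
	}
}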

Does this seem like the right approach?

commented

On further thought, even though the code path for each process is similar, I think we need to separate pid1 vs exec, since if pid1 dies all the exec processes will also die. To be consistent with the gateway API, maybe rename NewProcess to ExecProcess for the exec usage, but require an executor.ProcessInfo in the NewContainer call. So on the client side, NewContainer sends the NewContainer message and then the InitMessage to start pid1, and only returns when pid1 exits. So maybe something more like this:

type Client interface {
  ...
  NewContainer(ctx context.Context, containerID string, rootFs pb.Mount, proc executor.ProcessInfo, opts ...ContainerOpt) error
  ExecProcess(ctx context.Context, containerID string, proc executor.ProcessInfo) error
  ReleaseContainer(ctx context.Context, containerID string) error
}

It also seems like the return status is unnecessary, since we can just wrap the error with executor.ExitError.

This looks good to me. I would just change WithMount(pb.Mount) to WithMounts(...pb.Mount) and proc executor.ProcessInfo to init executor.ProcessInfo.

I'm not sure where to draw the boundary between the llb client and gateway client packages. Maybe the gateway client should mostly copy the protobuf methods, except instead of using the mount ID it would use the Reference type as it does today. Then we wouldn't even need the variadic parameters. But maybe these can be combined into one, and then gateway.client can be more expressive.

The reason for having NewContainer without starting the process was that there could be a synchronization point between the build and starting the process. E.g. this is important for switching client stdio from the buildkit progress bar to the attached process. I'd use the same started channel for making sure pid1 is running.

For the API that users should use I'd expect more user-friendliness. Eg.

c := container()
c.run()
c.release()

Previously we also discussed ExecState.ToContainer()

commented

I think a minimal protobuf-ish API would look something like:

	NewContainer(ctx context.Context, req *pb.NewContainerRequest) error
	ReleaseContainer(ctx context.Context, req *pb.ReleaseContainerRequest) error
	ExecProcess(ctx context.Context, containerID string, process executor.ProcessInfo, started chan<- struct{}) error

Not sure we can get much more minimal than that.

To populate the new pb.Mount.resultID I was expecting we would modify the Reference interface to add an ID func like:

type Reference interface {
	...
	ID(ctx context.Context) (string, error)
}

So gateway client code would roughly look something like:

	c.Build(ctx, client.SolveOpt{}, "", func(ctx context.Context, gc gateway.Client) (*gateway.Result, error) {
		st := llb.Image("alpine")
		def := st.Marshal(ctx, llb.LinuxAmd64)
		res := gc.Solve(ctx, gateway.SolveRequest{
			Definition: def.ToPB(),
		})
		rootID := res.Ref.ID(ctx)
		id := identity.NewID()
		gc.NewContainer(ctx, &gateway.NewContainerRequest{
			ContainerID: id,
			Mounts: []pb.Mount{{
				Dest:      "/",
				ResultID:  rootID,
				MountType: pb.MountType_BIND,
			}},
		})
		defer gc.ReleaseContainer(ctx, &gateway.ReleaseContainerRequest{id})
		meta := executor.Meta{
			Args: []string{"/bin/sh"},
			Cwd: "/",
			Tty: true,
		}
		gc.ExecProcess(ctx, id, executor.Process{meta, os.Stdin, os.Stdout, os.Stderr, resizeCh}, startedCh)
		return &gateway.Result{}, nil
	}, displayCh)
commented

I was also thinking we could add more user-friendly wrappers to client.Client and/or the ExecState.ToContainer() stuff a bit later as necessary. I think getting the gateway.Client changes in will allow us to experiment a bit in HLB without immediate further changes, although I am sure a few simple patterns will emerge pretty quickly and warrant adding to buildkit.

Actually, looking at that example, I don't think it makes sense to create wrappers (except ExecState) over gateway.client, as that is what the user has access to. I'd still also prefer just using the ref object directly and keeping the string ID hidden.

wdyt?

	c.Build(ctx, client.SolveOpt{}, "", func(ctx context.Context, gc gateway.Client) (*gateway.Result, error) {
		st := llb.Image("alpine")
		def := st.Marshal(ctx, llb.LinuxAmd64)
		res := gc.Solve(ctx, gateway.SolveRequest{
			Definition: def.ToPB(),
		})
		rootID := res.Ref.ID(ctx)
		id := identity.NewID()

		c := gc.NewContainer(ctx, &gateway.NewContainerRequest{
			// no ID
			Mounts: []gateway.Mount{{
				Dest:      "/",
				Ref:       res.Ref,
				MountType: pb.MountType_BIND,
			}},
		})
		defer c.Release(context.TODO())
		meta := executor.Meta{
			Args: []string{"/bin/sh"},
			Cwd:  "/",
			Tty:  true,
		}
		err := c.Run(ctx, executor.Process{meta, os.Stdin, os.Stdout, os.Stderr, resizeCh}, startedCh)
		// could also be c.Start() and c.Wait(); Start would not take startedCh then
		return &gateway.Result{}, nil
	}, displayCh)
commented

Okay, works for me. So this is what we are looking at adding to the gateway.client package:

type Client interface {
	...
	NewContainer(ctx context.Context, req NewContainerRequest) (Container, error)
}

type NewContainerRequest struct {
	Mounts []Mount
	NetMode pb.NetMode
	SecurityMode pb.SecurityMode
}

type Mount struct {
	Selector string
	Dest string
	Ref Reference
	Readonly bool
	MountType pb.MountType
}

// Option A)
type Container interface {
	Run(ctx context.Context, process executor.ProcessInfo, started chan<- struct{}) error
}

// Option B)
type Container interface {
	Start(context.Context, executor.ProcessInfo) (ContainerProcess, error)
}

type ContainerProcess interface {
	Wait() error
	// B.1) Anything else useful here?
}

Looking at the two options for process management, I am leaning towards Option B, as it seems easier to use. Not sure if there are any useful abstractions to add for B.1, although without more, it is conveniently implemented by errgroup.Group, which I would likely just return from Container.Start.
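To compare ergonomics, rough usage of Option B might look like this (ctr is the Container returned by NewContainer; resize plumbing elided):

proc, err := ctr.Start(ctx, executor.ProcessInfo{
	Meta:   executor.Meta{Args: []string{"/bin/sh"}, Cwd: "/", Tty: true},
	Stdin:  os.Stdin,
	Stdout: os.Stdout,
	Stderr: os.Stderr,
})
if err != nil {
	return err
}
// Wait blocks until the process exits; a non-zero exit could come back as an
// executor-style exit error wrapped in err, as suggested above.
if err := proc.Wait(); err != nil {
	return err
}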

@coryb sgtm

Missing Release on the container:

type Container interface {
	Start(context.Context, executor.ProcessInfo) (ContainerProcess, error)
	Release(context.Context)
}

With B we could add sending signals later. We could also do resize as a method there, unless you want to reuse the type from the executor package.

commented

@tonistiigi Yeah, I was originally thinking to use the executor.ProcessInfo directly to deal with io and resize, but now I am leaning towards adding Resize to the ContainerProcess.
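So the process side might end up roughly like this (a sketch; WinSize just mirrors the ResizeMessage fields, and signals are left as a possible later addition):

type WinSize struct {
	Rows, Cols uint32
}

type ContainerProcess interface {
	Wait() error
	Resize(ctx context.Context, size WinSize) error
	// Signal(ctx context.Context, sig syscall.Signal) error // possible later addition
}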

Last question (for now): Should I add these changes to #1627 along with tests? Or we can merge #1627 without tests (I think I need these gateway client updates to effectively test the server side) and I create a new PR for client + tests?

I guess it depends on how long the client side takes. #1627 would be somewhat easier to review if there is some code actually calling the new functions.

commented

Sounds good, I will update #1627, should be able to get it done next week.

#1627 is merged and follow-up issues have been opened separately, so closing this.