lunixbochs / usercorn

dynamic binary analysis via platform emulation


Custom filesystem

felberj opened this issue

commented

Hi,

I would like to provide a "custom" filesystem to the binary I am executing. I am thinking about something like this:

Create an empty filesystem and then add some files into it, like

testdata/lib64/libc.so.6 -> /lib/x86_64-linux-gnu/libc.so.6
my_file -> /tmp/file

One thing I could imagine is hooking into the syscalls and rewriting the paths there, or do you have a better way? I am willing to implement this on my own, but I would like some feedback on where to put it.
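To make the path-rewriting idea concrete, here is a minimal sketch in Go. It assumes the listing above reads as "host path -> guest path", so the table is keyed by the guest-visible path. `PathMap` and `Resolve` are hypothetical names for illustration, not part of usercorn. The sketch handles only absolute guest paths; relative paths would first need to be joined with the guest's working directory.

```go
package main

import (
	"fmt"
	"path"
)

// PathMap maps guest-visible absolute paths to host paths.
// Hypothetical helper for illustration; not a usercorn type.
type PathMap map[string]string

// Resolve normalizes the guest path and looks it up in the table.
// ok is false when the path is not part of the virtual filesystem,
// in which case a path-taking syscall hook could return ENOENT.
func (m PathMap) Resolve(guest string) (host string, ok bool) {
	host, ok = m[path.Clean(guest)]
	return host, ok
}

func main() {
	fs := PathMap{
		"/lib/x86_64-linux-gnu/libc.so.6": "testdata/lib64/libc.so.6",
		"/tmp/file":                       "my_file",
	}
	for _, p := range []string{"/tmp/../tmp/file", "/etc/passwd"} {
		if host, ok := fs.Resolve(p); ok {
			fmt.Printf("%s -> %s\n", p, host)
		} else {
			fmt.Printf("%s -> not in virtual filesystem\n", p)
		}
	}
}
```

A hook on open, stat and similar path-taking syscalls would call `Resolve` before touching the host. This only covers exact file entries; mapping whole directories would need a prefix match on top.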

Btw, how unstable is the branch "unstable"?

unstable means it can fail tests temporarily and I will force push to it sometimes, but it's really the branch you need to target for development.

You should look into https://github.com/google/gvisor for virtual filesystem. I think it would be really useful to rebuild a lot of the posix/linux/darwin backend around gvisor + generated code, as they're way further along on linux kernel support than usercorn.

#104 is the previous VFS issue

Basically VFS requires virtual file descriptors: all file descriptor syscalls need to look the FD up in a map instead of using host FDs. We can cheat and use the syscall argument deserializer with the co.Fd type to do this transparently (I planned ahead on this, all fds are already co.Fd)

The map should be map[int]VirtualFile, keyed by guest fd, where VirtualFile is an interface along the lines of type VirtualFile interface { Read() ReadAt() Write() etc }. Then we implement stuff like stat, readat, close, etc. on an abstract file and use that abstraction in syscalls instead of the current code. It's possible we can take all of the abstract implementations directly from gvisor. The open syscall then just needs to figure out what we're opening and stuff it into the file descriptor map.
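A rough sketch of such a descriptor table in Go, under stated assumptions: `VirtualFile`, `FdTable`, `memFile` and `newMemFile` are hypothetical names, the interface is cut down to Read and Close, and guest fds are handed out monotonically rather than with the lowest-free-fd rule a real kernel follows. It is not taken from usercorn or gvisor.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// VirtualFile is a hypothetical, cut-down version of the interface
// described above; a real one would also need ReadAt, Write, Stat, etc.
type VirtualFile interface {
	io.Reader
	io.Closer
}

// ErrBadFd is returned for guest fds that are not in the table (EBADF).
var ErrBadFd = errors.New("bad file descriptor")

// FdTable maps guest file descriptors to virtual file implementations.
type FdTable struct {
	files map[int]VirtualFile
	next  int
}

// NewFdTable starts numbering at 3, leaving 0-2 for stdio.
func NewFdTable() *FdTable {
	return &FdTable{files: make(map[int]VirtualFile), next: 3}
}

// Install is what an open syscall would call once it has decided
// which implementation backs the path. It returns the new guest fd.
func (t *FdTable) Install(f VirtualFile) int {
	fd := t.next
	t.next++
	t.files[fd] = f
	return fd
}

// Read looks the guest fd up in the table instead of using a host fd.
func (t *FdTable) Read(fd int, p []byte) (int, error) {
	f, ok := t.files[fd]
	if !ok {
		return 0, ErrBadFd
	}
	return f.Read(p)
}

// Close removes the guest fd and closes the backing file.
func (t *FdTable) Close(fd int) error {
	f, ok := t.files[fd]
	if !ok {
		return ErrBadFd
	}
	delete(t.files, fd)
	return f.Close()
}

// memFile is an in-memory file used only to exercise the table.
type memFile struct{ *strings.Reader }

func (memFile) Close() error { return nil }

func newMemFile(s string) VirtualFile { return memFile{strings.NewReader(s)} }

func main() {
	tbl := NewFdTable()
	fd := tbl.Install(newMemFile("hello"))
	buf := make([]byte, 16)
	n, _ := tbl.Read(fd, buf)
	fmt.Printf("fd=%d read %q\n", fd, buf[:n])
	_ = tbl.Close(fd)
	_, err := tbl.Read(fd, buf)
	fmt.Println("after close:", err)
}
```

Because every fd syscall goes through the table, a host-backed file (wrapping *os.File) and a purely virtual one like `memFile` look the same to the syscall layer.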

You can come hang out in the slack if you want to ask questions directly: https://lunixbochs.herokuapp.com/

commented

I can see how running usercorn under gvisor would make it less complicated. Are you thinking about something like this:

  1. user executes $ usercorn example.out
  2. usercorn forks itself and runs a new process under gvisor where the binary is emulated
  3. the binary is communicating with gvisor, which is controlled by usercorn (e.g. to trace syscalls)

or something like this:

We look into how gvisor executes the syscalls and just use that interface?

What about cross-platform support? We might lose Windows and OSX support...

commented

What I propose is that we create an interface System:

type System interface {
  getgroups(ngid int, gid *_Gid_t) (n int, err error)
  setgroups(ngid int, gid *_Gid_t) (err error)
  wait4(pid int, wstatus *_C_int, options int, rusage *Rusage) (wpid int, err error)
  accept(s int, rsa *RawSockaddrAny, addrlen *_Socklen) (fd int, err error)
  // ....
}

We put that as a field of the Usercorn struct. Then, instead of using the syscall package directly, we use that interface (which defaults to just proxying all those calls to the syscall package).
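A small sketch of what that could look like in Go, with heavy assumptions: `Emulator` stands in for the Usercorn struct, and the interface is reduced to two calls (`Getpid`, `Getwd`) chosen only because `syscall.Getpid` and `syscall.Getwd` exist with the same signature across platforms. `hostSystem` and `fakeSystem` are hypothetical names.

```go
package main

import (
	"fmt"
	"syscall"
)

// System is a hypothetical two-method slice of the proposed interface.
type System interface {
	Getpid() int
	Getwd() (string, error)
}

// hostSystem is the default: it proxies every call to the syscall package.
type hostSystem struct{}

func (hostSystem) Getpid() int            { return syscall.Getpid() }
func (hostSystem) Getwd() (string, error) { return syscall.Getwd() }

// fakeSystem answers from fixed values, so the guest never sees the host.
type fakeSystem struct {
	pid int
	cwd string
}

func (f fakeSystem) Getpid() int            { return f.pid }
func (f fakeSystem) Getwd() (string, error) { return f.cwd, nil }

// Emulator stands in for the Usercorn struct; only the Sys field matters here.
type Emulator struct {
	Sys System
}

// NewEmulator defaults to proxying calls to the host.
func NewEmulator() *Emulator { return &Emulator{Sys: hostSystem{}} }

func main() {
	e := NewEmulator()
	fmt.Println("host pid:", e.Sys.Getpid())

	// Swap in a virtual backend without touching the syscall handlers.
	e.Sys = fakeSystem{pid: 1, cwd: "/"}
	cwd, _ := e.Sys.Getwd()
	fmt.Println("guest pid:", e.Sys.Getpid(), "cwd:", cwd)
}
```

The syscall handlers would only ever call `e.Sys`, so a custom filesystem becomes one more implementation of the interface rather than a set of path rewrites scattered across handlers.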

There's no point in running usercorn under gvisor. Both usercorn and gvisor implement a virtual kernel in Go. You can just port parts of the gvisor kernel directly into usercorn.

The interface System you propose is exactly like the kernel objects we already have.