mit-pdos / xv6-riscv

Xv6 for RISC-V

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Does `user/sh.c` violate the strict aliasing rules?

yuxqiu opened this issue · comments

The C standard Section 6.5 defines the strict aliasing rule as follows.

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  1. a type compatible with the effective type of the object,
  2. a qualified version of a type compatible with the effective type of the object,
  3. a type that is the signed or unsigned type corresponding to the effective type of the object,
  4. a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  5. an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  6. a character type.

In user/sh.c, all pointers to different types of commands are converted to struct cmd*:

xv6-riscv/user/sh.c

Lines 196 to 221 in f5b93ef

struct cmd*
execcmd(void)
{
struct execcmd *cmd;
cmd = malloc(sizeof(*cmd));
memset(cmd, 0, sizeof(*cmd));
cmd->type = EXEC;
return (struct cmd*)cmd;
}
struct cmd*
redircmd(struct cmd *subcmd, char *file, char *efile, int mode, int fd)
{
struct redircmd *cmd;
cmd = malloc(sizeof(*cmd));
memset(cmd, 0, sizeof(*cmd));
cmd->type = REDIR;
cmd->cmd = subcmd;
cmd->file = file;
cmd->efile = efile;
cmd->mode = mode;
cmd->fd = fd;
return (struct cmd*)cmd;
}

The returned pointer is then used to access the type field:

xv6-riscv/user/sh.c

Lines 71 to 130 in f5b93ef

switch(cmd->type){
default:
panic("runcmd");
case EXEC:
ecmd = (struct execcmd*)cmd;
if(ecmd->argv[0] == 0)
exit(1);
exec(ecmd->argv[0], ecmd->argv);
fprintf(2, "exec %s failed\n", ecmd->argv[0]);
break;
case REDIR:
rcmd = (struct redircmd*)cmd;
close(rcmd->fd);
if(open(rcmd->file, rcmd->mode) < 0){
fprintf(2, "open %s failed\n", rcmd->file);
exit(1);
}
runcmd(rcmd->cmd);
break;
case LIST:
lcmd = (struct listcmd*)cmd;
if(fork1() == 0)
runcmd(lcmd->left);
wait(0);
runcmd(lcmd->right);
break;
case PIPE:
pcmd = (struct pipecmd*)cmd;
if(pipe(p) < 0)
panic("pipe");
if(fork1() == 0){
close(1);
dup(p[1]);
close(p[0]);
close(p[1]);
runcmd(pcmd->left);
}
if(fork1() == 0){
close(0);
dup(p[0]);
close(p[0]);
close(p[1]);
runcmd(pcmd->right);
}
close(p[0]);
close(p[1]);
wait(0);
wait(0);
break;
case BACK:
bcmd = (struct backcmd*)cmd;
if(fork1() == 0)
runcmd(bcmd->cmd);
break;
}

At this point, it seems to me that this access (cmd->type) violates the strict aliasing rules, since struct cmd* and struct execcmd* (for example) cannot be aliased:

  1. struct cmd is not compatible with struct execcmd.
  2. Same as above.
  3. struct cmd is not a signed/unsigned type.
  4. Same as above.
  5. struct execcmd does not contain any members whose types are compatible with struct cmd.
  6. struct cmd is not a character type.

IMHO, the correct thing to do is to replace int type; with struct cmd type;. By doing so, we can then alias struct cmd* and struct execcmd*. CPython does the same (PEP 3123).

  • In addition, this gives us more reassurance about pointer conversions (from struct execcmd* to struct cmd*). According to C standard 6.7.2.1, "A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning."

(Although this project is written in ANSI C, the C version is not specified as a compiler flag. Therefore, I assume it should follow the C17 standard as the version of gcc installed on my machine is 12.2.0, which uses -std=gnu17.)