intelxed / xed

The X86 Encoder Decoder (XED) is a software library for encoding and decoding X86 (IA32 and Intel64) instructions.

Home Page:https://intelxed.github.io/

Question: What is the meaning of the datafile OPERANDS content in xed?

lancejpollard opened this issue

Specifically, I am looking at OPERANDS entries such as these:

MEM0:rw:b
IMM0:r:b:i8
MEM0:w:mem32int
REG0=XED_REG_ST0:r:IMPL:f80
REG1=XED_REG_X87POP:r:SUPP
REG0=XED_REG_ST0:rw:IMPL:f80
REG0=GPR8_B():rw
REG0=GPR8_B():r
REG1=MASK1():r:mskw:TXT=ZEROSTR
REG2=ZMM_N3():r:zf32:MULTISOURCE4
REG1=XED_REG_X87STATUS:w:SUPP

Searching the codebase for the function-like names such as GPR8_B() leads me to the *-tables.txt files, which contain entries such as:

xed_reg_enum_t GPR8_B()::
REXB=0 RM=0x0  | OUTREG=XED_REG_AL 
REXB=0 RM=0x1  | OUTREG=XED_REG_CL
REXB=0 RM=0x2  | OUTREG=XED_REG_DL
REXB=0 RM=0x3  | OUTREG=XED_REG_BL

Taking REG0=GPR8_B():r as an example, I read this as follows: the left side is the machine-code bit fields, and the right side is the output register (with the XED_REG_ prefix). So when encoding AL, generate machine code with REXB=0 RM=0x0; when decoding machine code with REXB=0 RM=0x0, produce the register AL.

The same goes for MASK1() and the other functions:

xed_reg_enum_t MASK1()::
MASK=0x0  | OUTREG=XED_REG_K0
MASK=0x1  | OUTREG=XED_REG_K1
MASK=0x2  | OUTREG=XED_REG_K2
MASK=0x3  | OUTREG=XED_REG_K3
MASK=0x4  | OUTREG=XED_REG_K4
MASK=0x5  | OUTREG=XED_REG_K5
MASK=0x6  | OUTREG=XED_REG_K6
MASK=0x7  | OUTREG=XED_REG_K7
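
To make the two-way reading of these tables concrete, here is a minimal Python sketch. It is only an illustration of my interpretation, not XED's actual generator code: the function name parse_table and the dictionary layout are hypothetical, and it assumes every row has the form "FIELD=VALUE ... | OUTREG=NAME".

```python
# Sketch only: parse_table and its return shape are hypothetical,
# not part of XED. Assumes rows look like "FIELD=VAL ... | OUTREG=NAME".
TABLE = """\
xed_reg_enum_t GPR8_B()::
REXB=0 RM=0x0  | OUTREG=XED_REG_AL
REXB=0 RM=0x1  | OUTREG=XED_REG_CL
REXB=0 RM=0x2  | OUTREG=XED_REG_DL
REXB=0 RM=0x3  | OUTREG=XED_REG_BL
"""


def parse_table(text):
    """Return (name, decode, encode) for one lookup-function table."""
    # Drop blank lines and "#" comment lines.
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    header, rows = lines[0], lines[1:]
    # Header looks like "xed_reg_enum_t GPR8_B()::".
    name = header.split()[1].rstrip(":")
    decode = {}  # bit fields -> register (decoder direction)
    encode = {}  # register -> bit fields (encoder direction)
    for row in rows:
        left, right = row.split("|")
        # Left side: conditions such as "REXB=0 RM=0x1".
        pairs = (tok.split("=") for tok in left.split())
        fields = tuple(sorted((key, int(val, 0)) for key, val in pairs))
        # Right side: "OUTREG=XED_REG_CL".
        outreg = right.strip().split("=", 1)[1]
        decode[fields] = outreg
        encode.setdefault(outreg, fields)
    return name, decode, encode


name, decode, encode = parse_table(TABLE)
print(name, decode[(("REXB", 0), ("RM", 1))], encode["XED_REG_AL"])
```

With this reading, decode answers "which register do these bits select?" and encode answers "which bits do I emit for this register?", which is the symmetry I am assuming the tables express.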

Then SUPP is "suppressed" and IMPL is "implicit" (the operand visibility; I got this from the Python code).

Then r and w are for read/write.

The things like mem32int are the operand types.

Things like XED_REG_ST0 are specific registers.

xed_reg_enum_t X87()::
RM=0x0  | OUTREG=XED_REG_ST0
RM=0x1  | OUTREG=XED_REG_ST1
RM=0x2  | OUTREG=XED_REG_ST2
RM=0x3  | OUTREG=XED_REG_ST3
RM=0x4  | OUTREG=XED_REG_ST4
RM=0x5  | OUTREG=XED_REG_ST5
RM=0x6  | OUTREG=XED_REG_ST6
RM=0x7  | OUTREG=XED_REG_ST7
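
Putting the pieces above together, an OPERANDS entry can be split into fields roughly as in the Python sketch below. This reflects only my reading of the examples, not XED's real parser: the Operand class and parse_operand function are hypothetical, the field order (name, optional "=value", access, then everything else) is an assumption, and tokens I cannot classify (such as f80, MULTISOURCE4, or TXT=ZEROSTR) are simply kept as-is.

```python
from dataclasses import dataclass
from typing import List, Optional

# Visibility markers seen in the examples; other values may exist.
VISIBILITY = {"IMPL", "SUPP"}


@dataclass
class Operand:
    """Hypothetical container for one colon-separated OPERANDS entry."""
    name: str                  # e.g. "REG0", "MEM0", "IMM0"
    value: Optional[str]       # text after "=", e.g. "GPR8_B()" or "XED_REG_ST0"
    access: str                # "r", "w", or "rw"
    visibility: Optional[str]  # "IMPL", "SUPP", or None if not given
    extras: List[str]          # remaining tokens, left uninterpreted


def parse_operand(text):
    """Split one entry such as 'REG0=XED_REG_ST0:r:IMPL:f80' into fields."""
    head, *rest = text.split(":")
    # "REG0=GPR8_B()" has a value; "MEM0" does not.
    name, _, value = head.partition("=")
    access = rest[0] if rest else ""
    visibility = None
    extras = []
    for tok in rest[1:]:
        if tok in VISIBILITY:
            visibility = tok
        else:
            extras.append(tok)  # type codes and unknown attributes
    return Operand(name, value or None, access, visibility, extras)


for entry in ["MEM0:rw:b", "REG0=XED_REG_ST0:r:IMPL:f80",
              "REG1=MASK1():r:mskw:TXT=ZEROSTR"]:
    print(parse_operand(entry))
```

The value field is either a specific register (XED_REG_ST0) or a lookup function (GPR8_B()) in the examples I have seen; the sketch does not try to distinguish the two.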

The only things I can't figure out are:

  • What f80 means.
  • What MULTISOURCE4 means.
  • What TXT=ZEROSTR means.

What do those mean?

Also, I noticed XED_REG_ERROR. Does that just mean the register does not exist?

xed_reg_enum_t DR_R()::
# ...
REXR=1 REG=0x7  | OUTREG=XED_REG_ERROR

Oh, and what do REG0=, REG1=, etc. mean? Some OPERANDS entries, such as MEM0:rw:b, don't have an equals sign, so I'm not sure how to read them.

Lastly, are there any docs explaining the meaning of the datafiles/*-isa.txt content? If not, what is the best way to determine it?

Thank you.