Question: figuring out if an iform will run in 64bit mode userspace

jxors opened this issue 4 years ago

Hi!

I'm trying to enumerate all iforms that are supported on XED_MACHINE_MODE_LONG_64 with XED_ADDRESS_WIDTH_64b when running in an unprivileged context. To do this, I need to filter instructions like AAA which are UD in 64bit mode, and instructions like LGDT which require ring 0.

Is there any way to do this with XED? I see that there is an XED_ATTRIBUTE_RING0, but I can't seem to find a way to get the attributes on an iform. The decoder also gives an error if I try to decode an unsupported instruction, so presumably that information is also available somewhere in the library:

Attempting to decode: 37 
Could not decode given input.

Thanks!

Mark Charney · Answer 1 · Sat Jan 30 2021 04:02:34 GMT+0800 (China Standard Time)

Interesting question. The PATTERN field for instructions not valid in 64b mode has "not64", "mode16" or "mode32" in it. Things with "eamode16" are not capable of executing in 64b mode, as 64b mode cannot do 16b addressing. One could use read_xed_db.py to look at this using the contents of obj/dgen from a prior build. Instructions have many different forms, and the IFORMs were not designed to make the mode distinction apparent.
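A minimal sketch of the filter described above, not XED code. It assumes each record's PATTERN is available as a whitespace-separated string and that ring 0 instructions carry an attribute string "RING0" (inferred from XED_ATTRIBUTE_RING0 in the question). The record list, the pattern strings and the function name are hypothetical illustrations, not taken from the XED datafiles or from read_xed_db.py. Tokens are compared whole rather than by substring, so that "mode32" does not accidentally match inside a longer token.

```python
# Tokens that, per the answer above, mark a pattern as not executable in 64b mode.
NOT_64B_TOKENS = {"not64", "mode16", "mode32", "eamode16"}


def runs_in_64b_userspace(pattern, attributes=()):
    """Return True if the pattern has no 64b-excluding token and no RING0 attribute.

    `pattern` is assumed to be a whitespace-separated PATTERN string;
    `attributes` is assumed to be an iterable of attribute names.
    """
    tokens = set(pattern.split())
    if tokens & NOT_64B_TOKENS:
        return False
    return "RING0" not in attributes


# Hypothetical records (name, pattern, attributes); the patterns are simplified
# illustrations, not copied from the XED datafiles.
records = [
    ("AAA", "0x37 not64", ()),
    ("LGDT", "0x0F 0x01 REG[0b010] mode64", ("RING0",)),
    ("NOP", "0x90", ()),
]

usable = [name for name, pat, attrs in records if runs_in_64b_userspace(pat, attrs)]
print(usable)
```

In a real script the records would come from the obj/dgen build products rather than a hand-written list.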

I guess I should ask why you are doing what you are attempting to do? Overall goal-wise.

Jos · Answer 2 · Mon Feb 01 2021 23:10:31 GMT+0800 (China Standard Time)

Looking at the datafiles I see now that iforms are not the abstraction I am looking for. I am trying to determine the completeness of an instruction fuzzer/enumerator. That is, given a set of bit patterns (let's say a string of 1, 0 or ? where ? matches either a 1 or a 0) I am trying to determine which valid userspace x64 instructions are not matched by any pattern.
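A small sketch of the completeness check described above, under two assumptions that are not stated in the thread: a pattern is matched against the leading bits of an encoding (most significant bit of the first byte first), and the set of valid encodings is already available as byte strings. All names and the sample data are hypothetical.

```python
def to_bits(encoding):
    """Render a bytes object as a string of '0'/'1', MSB of the first byte first."""
    return "".join(f"{byte:08b}" for byte in encoding)


def matches(pattern, bits):
    """True if `pattern` ('1', '0', '?') matches the leading bits of `bits`."""
    if len(bits) < len(pattern):
        return False
    return all(p == "?" or p == b for p, b in zip(pattern, bits))


def uncovered(encodings, patterns):
    """Return the encodings that no pattern matches."""
    return [
        enc for enc in encodings
        if not any(matches(p, to_bits(enc)) for p in patterns)
    ]


# Sample patterns and sample byte encodings, for illustration only.
patterns = ["10010000", "0100????11000011"]
encodings = [bytes([0x90]), bytes([0x48, 0xC3]), bytes([0x0F, 0x05])]

missing = [enc.hex() for enc in uncovered(encodings, patterns)]
print(missing)
```

The hard part, as the rest of the thread suggests, is producing the list of valid encodings in the first place; this sketch only covers the matching step.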

Looking at the python scripts, I see generator.py is used to generate the instruction decoder. I could hook into that by executing everything up to gen_everything_else(), and then flattening the computed graph into a list without any non-terminals. Does that sound like a reasonable approach?

Mark Charney · Answer 3 · Tue Feb 02 2021 00:17:18 GMT+0800 (China Standard Time)

I do not understand what you are trying to do. I suspect there are far too many bit patterns when you take into account memory addressing and the resulting differing instruction lengths.

FWIW, the test generator for the xed enc2 encoder generates all the legal instructions. Possibly you could go at it from that angle. I do not know anything about your fuzzer though.

If you do coding, I would suggest building off of pysrc/read_xed_db.py, pysrc/gen_setup.py and the simple example pysrc/gen_dump.py. They work off the intermediate build products from obj/dgen (typically). I do not recommend people modify the other scripts.