GaloisInc / saw-script

The SAW scripting language.

SAW error message index

RyanGlScott opened this issue

SAW's error messages are difficult for beginners to understand. This is due to a combination of factors, but two of the more notable hurdles are:

  1. The jargon used in the text of the error messages can be unfamiliar for newcomers.
  2. SAW errors often print out a lot of surrounding context, but the relevant information is often contained in a small part of the error message.

Experienced SAW users eventually learn how to interpret what SAW errors are trying to tell them, but this process can be challenging.

Rather than forcing new users to stare at errors for long periods of time until they internalize the rhyme and reason of SAW's error messages, we can greatly expedite things by offering our own guide to reading SAW error messages. In particular, I propose that:

  1. Every unique class of SAW error message should be accompanied by a unique error code (e.g., SAW-12345).
  2. SAW should include an error message index, hosted somewhere on the SAW website, such that a user can take an error message's code and look it up.
  3. Once the code's entry on the error message index is found, the page will provide a much more detailed description of what the error message is, common circumstances in which you'd encounter the error, and possible strategies for resolving the error.

This is an approach that both the GHC and rustc compilers have been using for a while, and anecdotal experience suggests that it is incredibly helpful for new users.

As a concrete example of how this might work, consider this program and (erroneous) SAW specification:

#include <stdint.h>

uint32_t f(uint32_t* x) {
  return *x;
}

let f_spec = do {
  x <- llvm_alloc (llvm_int 32);
  llvm_execute_func [x];
};

m <- llvm_load_module "test.bc";

llvm_verify m "f" [] false f_spec z3;

If you load this into SAW today, you will get this error:

[01:16:59.708] Loading file "/home/ryanscott/Documents/Hacking/C/test.saw"
[01:16:59.712] Verifying f ...
[01:16:59.712] Simulating f ...
[01:16:59.713] Stack trace:
"llvm_verify" (/home/ryanscott/Documents/Hacking/C/test.saw:8:1-8:12)
Symbolic execution failed.
Abort due to assertion failure:
  test.c:4:10: error: in f
  Error during memory load
Stack frame f
  No writes or allocations
Base memory
  Allocations:
    HeapAlloc 3 0x4:[64] Mutable 4-byte-aligned /home/ryanscott/Documents/Hacking/C/test.saw:2:8
    GlobalAlloc 2 0x0:[64] Immutable 1-byte-aligned [defined function ] f
    GlobalAlloc 1 0x0:[64] Immutable 1-byte-aligned [external function] llvm.dbg.declare

This error can be quite intimidating for new users, but the root cause is actually quite simple: llvm_alloc allocates memory but does not initialize it, and symbolic execution fails (the "Error during memory load" part of the error) when f attempts to read from that uninitialized memory. We could imagine hooking this error up to the error message index like so:

  1. Have the error print out an error code somewhere. Perhaps:

    <snip>
    Symbolic execution failed. [SAW-12345]
    <snip>
    

    We might also use terminal color highlighting to make this more obvious at a glance.

  2. In the error message index, add an entry for SAW-12345.

  3. Include a more detailed description of SAW-12345 (taken from the paragraph above) in the error message index. This would also suggest recommended fixes—in this example, the easiest fix would be to use llvm_points_to on the allocated memory to properly initialize it before invoking llvm_execute_func. The description would also say what the Stack frame and Base memory parts of the error message indicate (even though they may not be as important for deciphering the error message in this particular case).
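
For reference, the recommended fix from step 3 might look roughly like the following. This is only a sketch: the llvm_fresh_var binding, the name x_val, and the llvm_return postcondition are my own additions for illustration. The only change the fix strictly requires is the llvm_points_to line.

```
let f_spec = do {
  // Assumption: a fresh symbolic 32-bit value to store behind the pointer
  // (the name "x_val" is illustrative).
  x_val <- llvm_fresh_var "x_val" (llvm_int 32);
  x <- llvm_alloc (llvm_int 32);
  // Initialize the allocation so that the load in f reads defined memory.
  llvm_points_to x (llvm_term x_val);
  llvm_execute_func [x];
  // Optional addition: state that f returns the value x points to.
  llvm_return (llvm_term x_val);
};
```

With the allocation initialized before llvm_execute_func, the load at test.c:4:10 should no longer trigger the "Error during memory load" failure.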

We could imagine printing all of this information every time the error message prints, but this would likely be an excessive amount of text to print upon each SAW invocation.

One implementation challenge with this plan is that some of SAW's error messages come directly from the SAW source code, while others are printed indirectly by underlying libraries, such as Crucible. We may need to refactor the code in the underlying libraries to make them suitable for error message indexing.