JavaScript:Verifier
Nanojit Verifier
Tracking bug: bug 463137
It would be nice if TraceMonkey developers could spend less of their time tracking down code generation errors. A verification pass that ran after LIR generation could catch some kinds of incorrect code immediately. If such a pass ran after native code generation, it could catch bugs in the assembler as well. If the pass used a type system that was sound, it would catch all segmentation faults due to JITted code, in addition to other kinds of bugs. The verifier would be an optional pass, perhaps enabled by default in debugging builds, that would disassemble machine code into LIR, analyze the LIR to infer the types of the values being operated on, and use those types to validate memory reference, arithmetic operations, and so on. The results here would be:
- a type system for validation, whose types describe what the validator can know about values. These types would include what JavaScript calls types, shapes, and perhaps internally used values like shape numbers.
- type rules: for each primitive operation (add; subtract; fetch), a function that looks at the types of the operands and computes the type of the result --- or complains if the operation is ill-typed.
- An "abstract interpreter" for LIR that starts with the known types of the inputs and then walks the code applying the type rules for the operations in the code.
Stages
- Start with a trivial type system, perhaps with types "tagged value", "integer", "pointer", "float". Ensure that we never ld the result of a float-valued LIR, etc. This allows us to get the interpreter working on the LIR vocabulary.
- Elaborate the type system to know about objects, shapes, pointers to floats, dense arrays, etc.
- Disassemble machine code to LIR, to bring the assembly pass under scrutiny as well.
The type system would necessarily be specific to the layout of the objects it describes. For example, in TraceMonkey, passing a guard that checks the shape of an object would refine the validator's understanding of type of that object and permit subsequent member access. So different clients of nanojit would need different type systems. This could mean that the abstract interpreter and disassembler should be part of nanojit, and the type system should be provided by nanojit's client. Or, the type system could be part of nanojit and provide all the types all nanojit's clients need. Or perhaps the whole verifier would be TraceMonkey-specific.