Authors

Amritam Sarcar

Publication Date

1-2010

Comments

Technical Report: UTEP-CS-10-01

Abstract

The Java Modeling Language (JML) is a formal behavioral interface specification language for Java. It is used for detailed design documentation of Java program modules such as classes and interfaces. JML has been used extensively by many researchers across various projects and has a large and varied spectrum of tool support. It extends from runtime assertion checking (RAC) to theorem proving.

Amongst these tools, RAC and ESC/Java has been used as a common tool for many research projects. RAC for JML is a tool that checks at runtime for possible violations of any specifications. However, lately there has been a problem for tool support. The problem lies in their ability to keep up with new features being introduced by Java. The inability to support Java 5 features such as generics has been reducing the user base, feedback and the impact of JML usage. Also, the JML2 compiler (jmlc) has a very slow compilation speed. On average, it is about nine times slower than a Java compiler such as javac. It is well understood that jmlc does more work than a Java compiler, hence it would be slower. The jmlc tool uses a double-round strategy for generating RAC code. The performance can be improved by optimizing compilation passes, in particular, by abandoning the double-round compilation strategy.

In this thesis I propose a technique for optimizing compilation speed using a technique known as AST merging with potential performance gain than its predecessor. The jmlc tool parses the source files twice. In the first pass, it parses the source file whereas in the second the source file is parsed again along with the generated RAC code. This affects the performance of the JML compiler, as parsing is one of the most costly tasks in compilation. In this thesis, I show how I solved this problem. Also, discussed in this thesis is how the new JML compiler (jml4c) was built on the Eclipse platform. To reduce maintainability issues with regards to code base for current and future versions of Java, Eclipse was chosen as the base compiler. The code base to support new Java language constructs can now be then implicitly maintained by the Eclipse team, which is outside the JML community.

Most of Level 0 and Level 1 features of JML have been implemented. These are the features that are most commonly used in JML specifications. Almost 3500 JUnit test cases were newly created and run. Test cases and sample files from jmlc were incorporated to check completeness of the implementation. A subset of the DaCapo benchmark was used to test the correctness and performance of the new compiler. Almost 1.5 million lines of code was compiled across 350 packages generating about 5000 class files which shows that the compiler built can be used for practical and industrial purposes.

I observed that my proposed technique or the AST merging technique on an average is about about 1.6 times faster than the double-round strategy of jmlc;overall, jml4c was three times faster than jmlc. I also observed that this speedup increases with increase in lines of source code. As any industrial code or a sample code for which JML specification can be meaningfully used ranges in thousands of lines of code, our proposed technique will benefit from this. The implementation details showed the feasibility of the AST merging technique on the Eclipse platform which included front end support (lexing, parsing, type-checking) and back-end support (RAC generation and code generation). Several new features were also added that was either absent or partially implemented in jmlc. Some of them includes adding support for runtime specification inside inner classes, annotation for the enhanced for loop, support for labeled statements, enhanced error messages and others.

Share

COinS