Java

phpGrammar

I’ve recently become interested in porting legacy PHP sites to JSPs. It seemed to me that one of the hardest parts of this problem was parsing the PHP code. Once a parse tree has been created, the next step would be to emit equivalent JSP code. I went looking for an ANTLR4 grammar for PHP, but could only find an ANTLR3 grammar, so I went to work updating the ANTLR3 grammar to ANTLR4 and writing a very simple validation suite. The GitHub project is here, and the resulting grammar is here.

jvmBasic 2.0

I’ve always had a fascination with compilers. As a Java geek, I’m also quite interested in the JVM. In order to learn a little more about both, and as a way to contribute to the open source world, I decided to implement a compiler for BASIC. So, jvmBasic consumes BASIC code and emits .class files.

The first step was to build a parser and lexer for BASIC. I decided to define an ANTLR4 grammar and use it to generate the lexer and parser. BASIC is a fairly simple language, so the grammar was not difficult to define. However, there are numerous BASIC dialects, so I had to pick a simple one. jvmBasic syntax looks much like Integer BASIC, but could easily be extended to parse GW-BASIC, or maybe VB. The resulting grammar is here.

Once ANTLR has generated a parser and lexer, it’s possible to generate a parse tree for any BASIC input and then walk the tree emitting bytecode. I used ASM to emit the bytecode. An example BAS input file looks like this:

    100 PRINT "Hello world"

The generated parse tree from the jvmBasic debug output looks like this:

    - [1 line]
    -  [3 linenumber]
    -   [120 NUMBER] 100
    -  [4 amprstmt]
    -   [5 statement]
    -    [7 printstmt1]
    -     [4 'PRINT'] PRINT
    -     [8 printlist]
    -      [66 expression]
    -       [60 func]
    -        [118 STRINGLITERAL] "Hello world"
    -  [122 CR]

Because there is no concept of functions, methods or classes in BASIC, I chose to enclose the generated code in a single method of a single class. The class name is the name of the BASIC input file, and the single method is:

    public static void main(String[] args)

The class has two fields:

    public InputStream inputStream;
    public OutputStream outputStream;

The default values of inputStream and outputStream are System.in and System.out respectively. However, in the case of jvmbasicwww, I replace them with HTTP input and output streams.

BASIC doesn’t have new, delete, malloc, or free, or really any analogue of those. Additionally, functions such as MID$ or VAL have their own semantics and behaviour. To emulate BASIC as closely as possible, I implemented a runtime library, jvmbasicrt. Inside jvmbasicrt are implementations of each BASIC function, as well as a class called ExecutionContext. ExecutionContext includes the “guts” of a BASIC runtime:

- A stack. Similar to many programming languages, BASIC needs a stack.
- All variables. This is simply a hashtable of Values, keyed on the variable name.

Additionally, there is Value, which implements a variable with BASIC semantics.

There is a Maven mojo which wraps jvmbasicc. The mojo, jvmbasicmojo, compiles all BASIC files in “/src/main/bas” and produces a .class file for each one. This mojo can be used to incorporate BASIC files into any normal Maven project and then package them into a .jar file. An additional example BASIC file is:

    10 REM this is a comment
    20 PRINT "13"
    30 PRINT "hi"
    40 PRINT 10
    50 PRINT 15.55
    60 LET x = 12
    70 PRINT "hihi"
    80 PRINT x
    90 LET y = 1+2
    100 LET z = 3*6
    110 LET d= y+z
    120 PRINT d

The Maven pom file that uses jvmbasicmojo is here. The javap output for the generated .class file is:

    public class EXAMPLE1 {
      public com.khubla.jvmbasic.jvmbasicrt.ExecutionContext executionContext;
      public java.io.InputStream inputStream;
      public java.io.PrintStream outputStream;
      public EXAMPLE1();
      public static void main(java.lang.String[]);
      public void program() throws java.lang.Exception;
    }

There isn’t a big demand, that I’m aware of, for bytecode compilers for BASIC. Two potential applications that come to mind are:

- Running VB code on the JVM. Theoretically it would be possible to extend the grammar to include VB, and then to emit bytecode for VB programs. This would form the foundation of technology to run .asp applications on the JVM. The VB standard library would have to be implemented too.
- Cross-compilation. Again, theoretically, it should be possible to use the grammar file to implement a cross compiler which consumes VB code and emits JSP code, or even PHP code.
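To make the runtime idea from this post more concrete, here is a minimal sketch of an execution context with a stack and a variable table. It is not the jvmbasicrt code: the class names BasicValue and SketchExecutionContext, their methods, and the choice to treat an unset variable as 0 are all assumptions made for illustration. The demo replays lines 90 to 120 of the example program as the kind of calls generated bytecode might make.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a BASIC value: holds either a number or a string. */
final class BasicValue {
    private final Double number; // non-null for numeric values
    private final String text;   // non-null for string values

    private BasicValue(Double number, String text) {
        this.number = number;
        this.text = text;
    }

    static BasicValue of(double number) { return new BasicValue(number, null); }
    static BasicValue of(String text) { return new BasicValue(null, text); }

    double asNumber() {
        if (number == null) throw new IllegalStateException("Type mismatch: not a number");
        return number;
    }

    /** Renders the value for PRINT; whole numbers are shown without a trailing ".0". */
    String asText() {
        if (number == null) return text;
        double d = number;
        boolean whole = d == Math.rint(d) && !Double.isInfinite(d);
        return whole ? String.valueOf((long) d) : String.valueOf(d);
    }

    BasicValue plus(BasicValue other) { return of(asNumber() + other.asNumber()); }
    BasicValue times(BasicValue other) { return of(asNumber() * other.asNumber()); }
}

/** Hypothetical execution context: an operand stack plus a table of variables. */
final class SketchExecutionContext {
    private final Deque<BasicValue> stack = new ArrayDeque<>();
    private final Map<String, BasicValue> variables = new HashMap<>();

    void push(BasicValue value) { stack.push(value); }
    BasicValue pop() { return stack.pop(); }

    /** LET name = (top of stack). */
    void store(String name) { variables.put(name, pop()); }

    /** Pushes a variable's value; an unset variable is treated as 0 (an assumption). */
    void load(String name) {
        BasicValue value = variables.get(name);
        push(value != null ? value : BasicValue.of(0));
    }

    void add() { BasicValue right = pop(); BasicValue left = pop(); push(left.plus(right)); }
    void multiply() { BasicValue right = pop(); BasicValue left = pop(); push(left.times(right)); }
}

final class ExecutionContextDemo {
    public static void main(String[] args) {
        SketchExecutionContext ctx = new SketchExecutionContext();
        // 90 LET y = 1+2
        ctx.push(BasicValue.of(1)); ctx.push(BasicValue.of(2)); ctx.add(); ctx.store("y");
        // 100 LET z = 3*6
        ctx.push(BasicValue.of(3)); ctx.push(BasicValue.of(6)); ctx.multiply(); ctx.store("z");
        // 110 LET d = y+z
        ctx.load("y"); ctx.load("z"); ctx.add(); ctx.store("d");
        // 120 PRINT d
        ctx.load("d");
        System.out.println(ctx.pop().asText()); // prints 21
    }
}
```

Keeping the stack and the variables in a runtime object like this means the generated bytecode only needs to emit a sequence of method calls, which is probably easier to debug than emitting the arithmetic and type handling inline.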

Embedding Jasper

Jasper is the JSP compiler inside Tomcat. Mainly out of curiosity, I wanted to build a Pragmatach plugin which exposes Jasper. Pragmatach supports some template engines such as FreeMarker, Thymeleaf and Velocity, but I thought Jasper would be a good addition.

I chose to use Tomcat 6, mainly because Tomcat 7 uses Servlet 3.0 and Pragmatach is currently on Servlet 2.5. Luckily, there is a really helpful example of compiling JSPs right inside Tomcat: the JspC shell. JspC is a simple command-line executable which can consume .jsp files and produce both .java files and .class files. The code I ended up with is here.

Servlet mocking with Mockito

When writing custom servlets, it can be quite useful to unit test them. In my case, I used a combination of TestNG and Mockito. The basic idea is simple: mock an HttpServletRequest and an HttpServletResponse, and pass them to the custom servlet. There is a bit more to it, so here’s an example.

    final HttpServletRequest httpServletRequest = mock(HttpServletRequest.class);
    when(httpServletRequest.getPathInfo()).thenReturn("/lineup/world.xml");
    final HttpServletResponse httpServletResponse = mock(HttpServletResponse.class);
    final StubServletOutputStream servletOutputStream = new StubServletOutputStream();
    when(httpServletResponse.getOutputStream()).thenReturn(servletOutputStream);
    final ServletConfig servletConfig = mock(ServletConfig.class);
    when(servletConfig.getInitParameter("defaultPool")).thenReturn("testpool1");

This sets up an HttpServletRequest, an HttpServletResponse, a ServletConfig where I can pass in parameters that the container would have read from web.xml, and a “StubServletOutputStream”, which is just a convenient wrapper around a ByteArrayOutputStream. My StubServletOutputStream looks like this:

    public class StubServletOutputStream extends ServletOutputStream {
        // Collects everything the servlet writes so the test can inspect it
        public ByteArrayOutputStream baos = new ByteArrayOutputStream();

        @Override
        public void write(int i) throws IOException {
            baos.write(i);
        }
    }

It’s necessary to init() the servlet, something the container would normally do:

    final MyServlet myServlet = new MyServlet();
    myServlet.init(servletConfig);

Finally, I invoke my servlet and check the output:

    myServlet.doGet(httpServletRequest, httpServletResponse);
    final byte[] data = servletOutputStream.baos.toByteArray();
    Assert.assertNotNull(data);
    Assert.assertTrue(data.length > 0);

restcache

I recently found myself in a situation where mission-critical software was suffering from performance problems because it relied on a remote API which was both slow (as slow as 11 seconds per transaction) and unreliable. In this case, it turned out that there were multiple applications accessing this API, and every individual application was affected. There were some details of the remote API that were notable:

- It is accessible via RESTful URLs
- It returns cacheable results
- It is generally used in a read-only mode

So, a reasonable solution appeared to be to write a caching proxy. The result is restcache.

Details of the solution are very simple. I implemented a Java Servlet which proxies GET and POST requests to a foreign server. GET requests which do not have query parameters are intercepted and cached using Apache JCS. JCS is a very mature piece of software which can implement multi-layer caches, and includes features such as disk-based caches, relational caches and even in-memory caches.

For restcache, it was likely that there would be more than one foreign API which I wanted to proxy-cache, so I implemented cache pools. A cache pool is simply a dedicated JCS region which caches responses from a specific HTTP URL.

Finally, real-world caches, on real-world sites, need to be maintained, monitored and occasionally cleared by system admins. restcache exposes a great deal of data using JMX, and supports clearing pools via JMX. Any reasonable system administration tool which can communicate with servers over JMX could monitor restcache, or an admin could simply use JConsole.

Configuration of caches in restcache is simple:

- Configure the JCS regions, like this. The example which matches the XML below is here.
- Configure the restcache pools in XML. The schema is here.

Here’s an example pool declaration which caches RSS from CBC. The JCS region it uses is called “cbc”, and is configured in cache.ccf.

    <rcpool>
      <name>cbc</name>
      <region>cbc</region>
      <target>rss.cbc.ca</target>
    </rcpool>

So, any cacheable GET request to “rss.cbc.ca” will be stored in the JCS region “cbc”. So this, for example, would be cached. Interestingly, JCS supports lateral caching, so if you really need that, it’s available.

Finally, there is an additional potential application: reducing the costs incurred by accessing per-transaction APIs. Some APIs charge a fee every time the API is accessed. If those APIs are accessed from a public-facing site, cost control quickly becomes an issue. restcache could be inserted between the costly API and the public-facing site with the intention of returning cached results rather than accessing the API for every page render.
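The caching rule described in this post (cache only GET requests without query parameters, and pick a pool by target host) can be sketched in a few lines. This is not the restcache implementation: SketchPool and SketchProxyCache are hypothetical names, a ConcurrentHashMap stands in for a JCS region, the origin fetch is passed in as a Supplier instead of making a real HTTP call, and the URL in the demo is made up for illustration.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical pool: ties one target host to one named cache region. */
final class SketchPool {
    final String name;
    final String region;
    final String target;
    // Stand-in for a JCS region; a real pool would delegate to JCS instead.
    private final Map<String, byte[]> entries = new ConcurrentHashMap<>();

    SketchPool(String name, String region, String target) {
        this.name = name;
        this.region = region;
        this.target = target;
    }

    /** Returns the cached body, calling the origin only on a cache miss. */
    byte[] getOrLoad(String key, Supplier<byte[]> origin) {
        return entries.computeIfAbsent(key, k -> origin.get());
    }

    void clear() { entries.clear(); }
    int size() { return entries.size(); }
}

/** Hypothetical proxy front end that decides whether a request may be cached. */
final class SketchProxyCache {
    private final Map<String, SketchPool> poolsByTarget = new ConcurrentHashMap<>();

    void addPool(SketchPool pool) { poolsByTarget.put(pool.target, pool); }

    /** Only GET requests without a query string are candidates for caching. */
    static boolean isCacheable(String method, URI uri) {
        return "GET".equalsIgnoreCase(method) && uri.getRawQuery() == null;
    }

    Optional<SketchPool> poolFor(URI uri) {
        return Optional.ofNullable(uri.getHost()).map(poolsByTarget::get);
    }

    byte[] fetch(String method, URI uri, Supplier<byte[]> origin) {
        Optional<SketchPool> pool = poolFor(uri);
        if (!isCacheable(method, uri) || !pool.isPresent()) {
            return origin.get(); // not cacheable or no pool: pass straight through
        }
        return pool.get().getOrLoad(uri.toString(), origin);
    }
}

final class ProxyCacheDemo {
    public static void main(String[] args) {
        SketchProxyCache cache = new SketchProxyCache();
        cache.addPool(new SketchPool("cbc", "cbc", "rss.cbc.ca"));

        AtomicInteger originCalls = new AtomicInteger();
        Supplier<byte[]> origin = () -> {
            originCalls.incrementAndGet(); // counts trips to the slow remote API
            return "<rss/>".getBytes();
        };

        URI feed = URI.create("http://rss.cbc.ca/lineup/world.xml"); // hypothetical URL
        cache.fetch("GET", feed, origin);
        cache.fetch("GET", feed, origin);
        System.out.println(originCalls.get()); // prints 1
    }
}
```

Passing the origin fetch in as a Supplier keeps the caching decision separate from the HTTP plumbing, which also makes the rule easy to unit test without a network.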