I recently found myself in a situation where mission-critical software was suffering from performance problems because it relied on a remote API which was both slow (up to 11 seconds per transaction) and unreliable. It turned out that multiple applications were accessing this API, and every one of them was affected. Several details of the remote API were notable:
- It is accessible via RESTful URLs
- It returns cacheable results
- It is generally used in a read-only mode
So, a reasonable solution appeared to be to write a caching proxy. The result is restcache.
The solution is simple. I implemented a Java Servlet which proxies GET and POST requests to a foreign server. GET requests which do not have query parameters are intercepted and cached using Apache JCS. JCS is a mature piece of software which can implement multi-layer caches, and includes features such as in-memory caches, disk-based caches and even relational (database-backed) caches.
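The caching rule described above can be sketched roughly as follows. This is not the restcache source: the class name, the `handle` method and the `upstream` hook are hypothetical, and a plain `ConcurrentHashMap` stands in for the JCS region so that the example runs with nothing but the JDK.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class CachingProxySketch {
    // Stand-in for a JCS region; restcache itself stores responses in Apache JCS.
    private final Map<String, String> region = new ConcurrentHashMap<>();

    /** A request is cacheable only if it is a GET with no query string. */
    static boolean isCacheable(String method, String queryString) {
        return "GET".equalsIgnoreCase(method)
                && (queryString == null || queryString.isEmpty());
    }

    /**
     * Returns the response body for a request. The upstream function is a
     * hypothetical hook that performs the real HTTP call for a given URL.
     */
    String handle(String method, String path, String queryString,
                  Function<String, String> upstream) {
        if (isCacheable(method, queryString)) {
            // Fetch from upstream on a miss, then serve later requests from the cache.
            return region.computeIfAbsent(path, upstream);
        }
        // Everything else (POST, or GET with parameters) is proxied without caching.
        boolean hasQuery = queryString != null && !queryString.isEmpty();
        return upstream.apply(hasQuery ? path + "?" + queryString : path);
    }

    public static void main(String[] args) {
        CachingProxySketch proxy = new CachingProxySketch();
        Function<String, String> upstream = url -> "response for " + url;
        System.out.println(proxy.handle("GET", "/feed", null, upstream));
        System.out.println(proxy.handle("GET", "/search", "q=news", upstream));
    }
}
```

The point of the sketch is only the decision itself: a repeated GET for the same path never reaches the slow foreign server, while anything with parameters always does.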
For restcache, it was likely that there would be more than one foreign API which I wanted to proxy-cache, so I implemented cache pools. A cache pool is simply a dedicated JCS region which caches responses from a specific HTTP URL.
Finally, real-world caches, on real-world sites, need to be maintained, monitored and occasionally cleared by system admins. restcache exposes a great deal of data using JMX, and supports clearing pools via JMX. Any reasonable system administration tool which can communicate with servers over JMX could monitor restcache, or an admin could simply use JConsole.
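To give a feel for what exposing a pool over JMX involves, here is a minimal sketch using only the standard `javax.management` API. It is an illustration under assumptions, not restcache's actual MBeans: the interface, the attribute and operation names, and the "restcache.sketch" domain are all made up, and a map again stands in for the JCS region.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

class PoolJmxSketch {
    /** Management interface: one read-only attribute and one operation. */
    public interface PoolStatsMBean {
        int getEntryCount();
        void clear();
    }

    static final class PoolStats implements PoolStatsMBean {
        // Stand-in for the pool's JCS region.
        final Map<String, String> entries = new ConcurrentHashMap<>();

        @Override public int getEntryCount() { return entries.size(); }
        @Override public void clear() { entries.clear(); }
    }

    /** Registers the pool under a hypothetical "restcache.sketch" JMX domain. */
    static ObjectName register(PoolStats stats, String poolName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("restcache.sketch:type=Pool,name=" + poolName);
            if (server.isRegistered(name)) {
                server.unregisterMBean(name); // allow re-registration in this sketch
            }
            server.registerMBean(new StandardMBean(stats, PoolStatsMBean.class), name);
            return name;
        } catch (JMException e) {
            throw new IllegalStateException("could not register pool " + poolName, e);
        }
    }

    public static void main(String[] args) {
        PoolStats stats = new PoolStats();
        stats.entries.put("/feed", "<rss/>");
        ObjectName name = register(stats, "cbc");
        System.out.println("registered " + name);
    }
}
```

Once a bean like this is registered, a tool such as JConsole would show the entry count as an attribute and offer `clear` as an invokable operation.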
Configuration of caches in restcache is simple:
- Configure the JCS regions, like this. The example which matches the XML below is here.
- Configure the restcache pools in XML. The schema is here.
Here's an example pool declaration which caches RSS from CBC. The JCS region it uses is called "cbc", and is configured in cache.ccf.
<rcpool>
  <name>cbc</name>
  <region>cbc</region>
  <target>rss.cbc.ca</target>
</rcpool>
With this pool in place, any cacheable GET request to "rss.cbc.ca" will be stored in the JCS region "cbc". This, for example, would be cached.
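The relationship between a pool's target and its region can be pictured as a simple lookup. The sketch below is an assumption about how such a mapping might look, not the way restcache actually resolves pools. The `Pool` class mirrors the three fields of the rcpool element, and `PoolRegistrySketch` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class PoolRegistrySketch {
    /** Mirrors the three fields of an rcpool element. */
    static final class Pool {
        final String name;
        final String region;
        final String target;

        Pool(String name, String region, String target) {
            this.name = name;
            this.region = region;
            this.target = target;
        }
    }

    // Pools keyed by lower-cased target host, since host names are case-insensitive.
    private final Map<String, Pool> poolsByTarget = new HashMap<>();

    void register(Pool pool) {
        poolsByTarget.put(pool.target.toLowerCase(Locale.ROOT), pool);
    }

    /** Returns the JCS region name for a target host, if a pool is configured. */
    Optional<String> regionFor(String host) {
        Pool pool = poolsByTarget.get(host.toLowerCase(Locale.ROOT));
        return Optional.ofNullable(pool).map(p -> p.region);
    }

    public static void main(String[] args) {
        PoolRegistrySketch registry = new PoolRegistrySketch();
        registry.register(new Pool("cbc", "cbc", "rss.cbc.ca"));
        System.out.println(registry.regionFor("rss.cbc.ca"));   // pool found
        System.out.println(registry.regionFor("example.org"));  // no pool configured
    }
}
```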
Interestingly, JCS supports lateral caching, so if you really need that, it's available.
There is also an additional potential application: reducing the costs incurred by accessing per-transaction APIs. Some APIs charge a fee every time the API is accessed. If those APIs are accessed from a public-facing site, cost control quickly becomes an issue. restcache could be inserted between the costly API and the public-facing site with the intention of returning cached results rather than accessing the API for every page render.
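The cost-control idea comes down to bounding paid calls to at most one per key per expiry window. Here is a small sketch of that idea, assuming a fixed time-to-live per entry. In restcache the expiry would be a matter of JCS region configuration, so everything here (the class, the `paidApi` hook, the injectable clock) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

class PaidApiCacheSketch {
    private static final class Entry {
        final String body;
        final long expiresAt;

        Entry(String body, long expiresAt) {
            this.body = body;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be demonstrated
    private long paidCalls;

    PaidApiCacheSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Calls the paid API only when there is no unexpired entry for the key. */
    synchronized String get(String key, Function<String, String> paidApi) {
        long now = clock.getAsLong();
        Entry entry = entries.get(key);
        if (entry == null || now >= entry.expiresAt) {
            paidCalls++;
            entry = new Entry(paidApi.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.body;
    }

    synchronized long paidCalls() {
        return paidCalls;
    }

    public static void main(String[] args) {
        PaidApiCacheSketch cache = new PaidApiCacheSketch(60_000, System::currentTimeMillis);
        for (int render = 0; render < 1000; render++) {
            cache.get("/quote", key -> "costly result for " + key);
        }
        System.out.println("page renders: 1000, paid calls: " + cache.paidCalls());
    }
}
```

With a one-minute expiry, a page rendered a thousand times within that minute incurs the fee once rather than a thousand times, at the price of results that may be up to a minute stale.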