A while back, I noted that Google had announced that they would be supporting Java on their AppEngine cloud computing platform. I finally got around to working on a significant AppEngine for Java project (something beyond “hello world” or the demo “Guestbook” app).
Working with my friend and colleague (and serious Java guru) Mark Petrovic, we decided that a good “Goldilocks” candidate, neither too big nor too trivial, was the experimental service Twitmart.org, a classifieds marketplace mashup built on the Twitter APIs.
We decided that “porting” an existing web-based application made more sense than inventing a new one, because this way we didn’t have to think much about the design or functional specifications. We already had them. We just had to think about how to implement those specifications on a Java-based web application platform, and specifically on the Google AppEngine for Java platform.
With the Twitmart application, the first thing to address was that the site uses Restful-style urls (as opposed to a fully Restful architecture, for you Rest weenies). This introduces a number of issues. The AppEngine for Java platform implements the venerable, but now decade-old, Java Servlet API. Servlets don’t naturally support the clean urls that are essential in modern web applications.
One way to go about this is to do it the old fashioned hard-coded way and manually parse urls and forward to servlets for the action. We considered doing so, as a last resort, but it would be painful and potentially a maintenance nightmare.
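To give a feel for what the hard-coded route would have looked like, here is a minimal sketch. It is not code from Twitmart: the URL shapes, the ManualRoutes class, and the servlet paths are all made up for illustration. Each clean URL gets its own pattern and its own forwarding rule, and that is the part that becomes a maintenance problem as the site grows.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-rolled dispatcher; URL shapes and servlet paths are invented.
final class ManualRoutes {
    // One pattern per clean URL shape the site exposes.
    private static final Pattern ITEM = Pattern.compile("^/items/(\\d+)$");
    private static final Pattern USER = Pattern.compile("^/users/([A-Za-z0-9_]+)$");

    private ManualRoutes() {}

    /** Returns the internal servlet path to forward to, or empty if nothing matches. */
    static Optional<String> resolve(String path) {
        Matcher m = ITEM.matcher(path);
        if (m.matches()) {
            return Optional.of("/itemServlet?id=" + m.group(1));
        }
        m = USER.matcher(path);
        if (m.matches()) {
            return Optional.of("/userServlet?name=" + m.group(1));
        }
        // Every new URL shape means another pattern and another branch here.
        return Optional.empty();
    }
}
```

A front servlet or filter would call resolve() on the request path and forward to the result, which is roughly the bookkeeping a routing library does for you.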
We had hoped to use Jersey but found that it was not quite there yet in terms of compatibility with Google AppEngine, although it looks like there is progress and we will see at least a subset of Jersey supported on AppEngine at some point. For this project, we decided that the outstanding issues with Jersey on AppEngine were more than we wanted to deal with. Since Twitmart is a web site rather than a web service API, perhaps Jersey isn’t quite the right platform anyway.
Regardless, we went with Restlet, which does support AppEngine in their latest unstable releases. Going with Restlet meant that we also needed a compatible template engine to replace JSP. We went with Freemarker which is supported as a Restlet extension.
There are many Java libraries for accessing Twitter APIs. We went with Twitter4J which appears to be the most AppEngine friendly (and it’s a nice, clean API too).
With all of these building blocks in place, the port of Twitmart to AppEngine for Java was mostly a matter of grinding it out. Here’s a bunch of stuff we learned in the process:
Restlet, Freemarker, and Twitter4J work on AppEngine
The Twitmart application doesn’t exercise every part of any of these tools, but the basics clearly work. That’s nice to know. We will use these again on future projects.
Servlets and Restlets can co-exist
A few of the operations that Twitmart does were better suited to traditional servlets. With some effort, we probably could have made them work as Restlets, but it was much quicker and easier to make them servlets, so that’s what we did. And it works. Servlet urls are directed to servlet classes in the usual way in the web.xml config while everything else is passed through the Restlet adapter/router class.
The AppEngine Text class overcomes the 500-character limit of Strings in the datastore
The datastore will only accept Strings of up to 500 characters. To store larger text, one must use blobs. AppEngine provides a Text class that makes this pretty easy – but it does make your application more AppEngine specific. Text objects cannot be viewed in the datastore viewer (admin panel), and they cannot be indexed or queried.
It’s not that hard to put arbitrary Java objects into the datastore
The Twitmart application accepts images uploaded by users to display with classified ad postings. Since AppEngine has no writeable file system, these “image files” must be placed into (and retrieved from) the datastore. This turns out to be pretty easy, as long as the size of the data conforms to AppEngine’s limits – e.g. 1MB per datastore entity (see http://code.google.com/appengine/docs/java/datastore/overview.html)
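As a rough sketch of the kind of check that is worth doing before writing an upload to the datastore: the EntitySizeGuard class below is hypothetical, and the headroom figure is my own assumption rather than an AppEngine constant. Only the 1MB per-entity figure comes from the limit mentioned above.

```java
// Hypothetical pre-store check; not part of any AppEngine API.
final class EntitySizeGuard {
    // The 1MB per-entity figure mentioned above.
    static final int MAX_ENTITY_BYTES = 1024 * 1024;
    // Assumed allowance for the key and other properties stored with the image.
    static final int HEADROOM_BYTES = 16 * 1024;

    private EntitySizeGuard() {}

    /** True if the uploaded image should fit in a single datastore entity. */
    static boolean fitsInOneEntity(byte[] imageBytes) {
        return imageBytes.length <= MAX_ENTITY_BYTES - HEADROOM_BYTES;
    }
}
```

Rejecting (or resizing) an oversized upload up front is friendlier than letting the datastore write fail later.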
Objects in the datastore have unexpected dependencies
I don’t know if Java bytecode ends up in the datastore or not, but it kind of feels like it does. When we rearranged some classes, we discovered that datastore records using those classes failed with ClassNotFound errors. I guess we should have expected this, but it’s something to keep in mind – once something is written to the datastore, never move the class.
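One plausible explanation, assuming the objects were persisted in serialized form (I have not confirmed that this is what the datastore layer did in our case): standard Java serialization does not store bytecode, but it does store the fully qualified class name alongside the data. The sketch below uses a made-up Posting class to show that the name is embedded in the bytes, which is why moving or renaming a class would break reading old records.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SerializedNameDemo {
    // Hypothetical stand-in for a class whose instances get persisted.
    static final class Posting implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        Posting(String title) { this.title = title; }
    }

    private SerializedNameDemo() {}

    /** Serializes a value with standard Java serialization. */
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** True if the serialized bytes contain the class's fully qualified name. */
    static boolean embedsClassName(byte[] bytes, Class<?> type) {
        // ISO-8859-1 maps bytes to chars one-to-one, so ASCII names survive intact.
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        return raw.contains(type.getName());
    }
}
```

On read, the deserializer looks up a class by that stored name, so if the class has moved to another package the lookup fails with a ClassNotFoundException.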
As I’ve noted before, the AppEngine datastore is not an RDBMS, despite providing a query capability that might make you think otherwise at first glance. The AppEngine datastore is based on the BigTable storage system. BigTable follows a very different philosophy than traditional RDBMS and, as a result, imposes many important restrictions. Knowing this from prior work on Taglets.org and other projects on AppEngine for Python, I knew about these restrictions and, in fact, the Twitmart application doesn’t make any queries at all; it always retrieves objects (records) by key and never via query.
Another lesson from our experience with the Python AppEngine was “be prepared to move off AppEngine” (for any number of reasons). This means trying to structure your application such that moving to a standard platform will not require a complete re-write. One place to start is to avoid any com.google.appengine.* imports. In the case of this Twitmart app, the only place we do that is for the special com.google.appengine.api.datastore.Text class (in order to store text strings larger than 500 characters). This is isolated to a couple of Java classes in this application, but in a larger app, or if such things found their way into too many places, it would be best to write abstraction libraries of some kind to make it easier to separate out AppEngine dependencies later.
The Twitmart app, as written, should run pretty easily on any standard Java web application platform, such as Jetty. That said, while JDO works in theory as a data management interface that runs both on AppEngine and on a standard platform, in practice it might not be the best or the most common way to manage data on standard web application platforms. In many cases, you would probably want to add an abstraction layer around your data models to make it easier to swap in different low-level APIs. I think in general, the AppEngine for Java platform makes this easier and more natural than the AppEngine for Python platform, where it takes a lot more effort to keep your app standard-platform ready.
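Here is a minimal sketch of what such an abstraction layer could look like. It is not the Twitmart code: the PostingStore interface and the in-memory implementation are hypothetical names, and the design leans on the fact that the app only ever fetches records by key. The rest of the application would depend only on the interface, and an AppEngine-backed implementation (the one place allowed to import com.google.appengine.*) could be swapped for another later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical key-based storage interface; application code depends only on this.
interface PostingStore {
    void put(String key, String body);
    Optional<String> get(String key);
}

// Plain in-memory implementation, handy for tests or a non-AppEngine deployment.
// An AppEngine version would implement the same interface and could wrap long
// bodies in the Text class internally, keeping that import out of the rest of the app.
final class InMemoryPostingStore implements PostingStore {
    private final Map<String, String> rows = new HashMap<>();

    @Override
    public void put(String key, String body) {
        rows.put(key, body);
    }

    @Override
    public Optional<String> get(String key) {
        return Optional.ofNullable(rows.get(key));
    }
}
```

Because lookups are by key only, the interface stays tiny. An app that relied on queries would need a wider interface, and that is where platform differences would start to leak through.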
The Twitmart port is not done. There are a few functions left to implement and a few things still to fix, such as better exception handling and related cleanup, but the primary functionality is operational. In practice, even with light usage, it is slower on AppEngine than it was on my own server, but I guess that’s the price you pay for “free” – so much for scalable. I guess the theory is that on AppEngine it will perform about as poorly for 100,000 users as it does for a dozen.
If I develop more AppEngine apps, I’m going to use the Java AppEngine for sure over the Python AppEngine. Unless there is some really compelling reason (like a specific library I want to use), I don’t see myself writing another AppEngine app in Python ever again.
While I still don’t think Google AppEngine is ready for anything too important yet, if it ever will be, having a Java version is a big step in expanding the possible uses of Google’s cloud computing platform, IMHO, and I will almost certainly deploy more experimental applications like Twitmart using AppEngine for Java in the future.
- Google App Engine for Java
- Restlet edition for Google App Engine
- Freemarker extension for Restlet
UPDATE October 16, 2009: The performance on App Engine was so poor that we had to move the Twitmart.org site off of App Engine and back to one of our own servers. Details here.