Saturday, April 23, 2011

Rocking and rolling with Ektorp and Cloudant

My first Android app was developed using AWS' SimpleDB. While I like the Android SDK, I'm not terribly satisfied with the performance I'm getting. As far as NoSQL databases go, CouchDB is getting really great press. Cloudant is a well-respected YC-aligned startup which hosts CouchDB databases. I decided to give it a try, as I'd like to get deeper into JSON as well (generally, SimpleDB uses XML while CouchDB uses JSON).

Ektorp is one of several Java APIs for CouchDB but it doesn't look like many people (anyone as far as I can tell) are using it with Cloudant. So I decided to give it a try just for fun.

First off, Ektorp requires a lot of external JARs (9 or 10 from what I see). Importing the 1 org.ektorp JAR and 8-9 other JARs into an Eclipse project quickly proved tiresome. Ektorp recommends using Maven, which manages all dependencies. Better yet, m2eclipse offers a way to integrate Maven with Eclipse. The setup instructions and videos are very self-explanatory. Dropping the supplied Ektorp entry into my pom.xml was simple. All required dependencies were quickly and correctly imported. Piece of cake.

Okay, now to connect to Cloudant. Getting a free Cloudant account is easy. The database UI is clean and simple. I want to be a good coder and use SSL, which Cloudant can certainly support. Ektorp supplies a good API tutorial which accomplishes what I want to start off with:

HttpClient httpClient = new StdHttpClient.Builder()
    .host("localhost")
    .port(5984)
    .build();

CouchDbInstance dbInstance = new StdCouchDbInstance(httpClient);
CouchDbConnector db = new StdCouchDbConnector("mydatabase", dbInstance);

db.createDatabaseIfNotExists();


Initial attempts proved unsuccessful as I couldn't figure out what .host() to use. The supplied documentation wasn't clicking - on to StackOverflow where I'm sure someone's dealt with something similar before. Not being able to add an "ektorp" tag to my question leads me to believe the library isn't as prevalent as I thought. No worries, someone was able to kindly point out I shouldn't be including "http://" or "https://" in my host variable. Oh yeah. Read my StackOverflow question for the details. So I was able to get it working using http as follows:

HttpClient httpClient = new StdHttpClient.Builder()
    .host("[username].cloudant.com")
    .port(5984)
    .username("[username]")
    .password("[password]")
    .build();
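
The scheme mix-up above is easy to guard against. As a minimal sketch (the class name CouchHost, the method bareHost, and the account name "example" are made up for illustration and are not part of Ektorp), a small helper can strip any scheme, port, or path before the value is handed to .host():

```java
import java.net.URI;

class CouchHost {

    /**
     * Returns only the host name from a value that may or may not carry
     * a scheme, port, or path (hypothetical helper, not an Ektorp API).
     */
    static String bareHost(String input) {
        String candidate = input.trim();
        if (!candidate.contains("://")) {
            // Prepend a placeholder scheme so java.net.URI can parse a bare host.
            candidate = "https://" + candidate;
        }
        String host = URI.create(candidate).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host found in: " + input);
        }
        return host;
    }

    public static void main(String[] args) {
        // "example" stands in for a real Cloudant account name.
        System.out.println(bareHost("https://example.cloudant.com/")); // prints example.cloudant.com
    }
}
```

The returned string is what would go into .host(); the port and the http/https choice stay in their own builder calls.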


Getting SSL to play nice proved difficult. Using port 443 didn't readily work until someone from Cloudant wisely told me to RTFM. Adding .enableSSL(true) and .relaxedSSLSettings(true) was a good start, but I was still getting an IllegalStateException, and the stack trace indicated it was trying to use http even though I had initialized my object with https. Huh? Finally, after scouring the Ektorp discussion board, I found this post which indicated Ektorp 1.1.1 included SSL-related bug fixes. So I updated my pom.xml to look for 1.1.1 instead of 1.1.0. Rebuild with Maven, run the Java application, success! Database added to my Cloudant account via https.

I was getting an annoying but non-fatal exception relating to SLF4J:

Failed to load class "org.slf4j.impl.StaticLoggerBinder"

This is fairly common. Manually adding slf4j-simple-X.jar (slf4j-simple-1.6.1.jar to be exact) to the project resolved the issue.

So this is my super-complex program:

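import org.ektorp.CouchDbConnector;
import org.ektorp.CouchDbInstance;
import org.ektorp.http.HttpClient;
import org.ektorp.http.StdHttpClient;
import org.ektorp.impl.StdCouchDbConnector;
import org.ektorp.impl.StdCouchDbInstance;

// Connect to Cloudant over HTTPS; note the host carries no "https://" prefix.
HttpClient httpClient = new StdHttpClient.Builder()
    .host("[username].cloudant.com")
    .port(443)
    .username("[username]")
    .password("[password]")
    .enableSSL(true)
    .relaxedSSLSettings(true)
    .build();

CouchDbInstance dbInstance = new StdCouchDbInstance(httpClient);
CouchDbConnector db = new StdCouchDbConnector("httpsdatabase", dbInstance);

// Create "httpsdatabase" on the account unless it is already there.
db.createDatabaseIfNotExists();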


And remember to make Ektorp's dependency look for version "1.1.1" in your pom.xml.

Friday, April 22, 2011

Ext4 delayed allocation data loss

My Ubuntu (Maverick Meerkat) netbook crashed this week and completely wiped an open .odt document. The behavior was very similar to what happened here. I'm not a file system guru, so I wasn't aware until now of the long-standing issue with Ext4 file system delayed allocation. From LinuxInsight:
In Ext4 file system, Delayed Allocation causes some extra risks of the data loss if your system gets crashed before all the data is written to hard drive.

This was also covered on Slashdot over 2 years ago.

It appears that the transition to Ext4 requires applications to use fsync() properly to prevent data loss. Theodore Ts'o, who played a major role in developing Ext4, includes a good summary here. In fact, Android devices which run on Ext4 are rolling out, and preventing data loss has been a major concern.
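
To make the fsync() point concrete, here is a minimal Java sketch of a save routine that flushes before it replaces the old file. The class name DurableSave and its save method are hypothetical, and the write-to-temp-file-then-rename pattern is a commonly suggested approach rather than something the posts above spell out. FileChannel.force(true) is Java's request to push data and metadata to the storage device, which on Linux is generally backed by fsync().

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

class DurableSave {

    /** Writes data to a sibling temp file, flushes it, then renames it over target. */
    static void save(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try (FileChannel channel = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // Ask the OS to flush file content and metadata to the device
            // before the old copy is replaced.
            channel.force(true);
        }
        // Swap the flushed temp file into place. Whether an existing target is
        // replaced by an atomic move is implementation specific; on Linux this
        // is a rename, which replaces it.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("durable-save-demo");
        Path doc = dir.resolve("notes.txt");
        save(doc, "draft contents".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(doc), StandardCharsets.UTF_8));
    }
}
```

The idea is that a crash leaves either the complete old file or the complete new one, rather than a zero-length file. A stricter version would also flush the containing directory, which this sketch leaves out.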

I can't imagine the OpenOffice that ships with Ubuntu 10.10 doesn't use fsync() properly on Ext4, but I've never had this problem in other apps. Very aggravating, and perhaps another reason to use Google Docs or give LibreOffice a try!