Saturday, April 23, 2011

Rocking and rolling with Ektorp and Cloudant

My first Android app was developed using AWS' SimpleDB. While I like the Android SDK, I'm not terribly satisfied with the performance I'm getting. As far as NoSQL databases go, CouchDB is getting really great press. Cloudant is a well-respected YC-aligned startup which hosts CouchDB databases. I decided to give it a try, as I'd like to get deeper into JSON as well (generally, SimpleDB uses XML while CouchDB uses JSON).

Ektorp is one of several Java APIs for CouchDB but it doesn't look like many people (anyone as far as I can tell) are using it with Cloudant. So I decided to give it a try just for fun.

First off, Ektorp requires a lot of external JARs (9 or 10 from what I see). Importing the 1 org.ektorp JAR and 8-9 other JARs into an Eclipse project quickly proved tiresome. Ektorp recommends using Maven, which manages all dependencies. Better yet, m2eclipse offers a way to integrate Maven with Eclipse. The setup instructions and videos are very self-explanatory. Dropping the supplied Ektorp entry into my pom.xml was simple. All required dependencies were quickly and correctly imported. Piece of cake.

Okay, now to connect to Cloudant. Getting a free Cloudant account is easy. The database UI is clean and simple. I want to be a good coder and use SSL, which Cloudant can certainly support. Ektorp supplies a good API tutorial which accomplishes what I want to start off with:

HttpClient httpClient = new StdHttpClient.Builder()
    .host("localhost")
    .port(5984)
    .build();

CouchDbInstance dbInstance = new StdCouchDbInstance(httpClient);
CouchDbConnector db = new StdCouchDbConnector("mydatabase", dbInstance);

db.createDatabaseIfNotExists();


Initial attempts proved unsuccessful as I couldn't figure out what .host() to use. The supplied documentation wasn't clicking - on to StackOverflow where I'm sure someone's dealt with something similar before. Not being able to add an "ektorp" tag to my question leads me to believe the library isn't as prevalent as I thought. No worries, someone was able to kindly point out I shouldn't be including "http://" or "https://" in my host variable. Oh yeah. Read my StackOverflow question for the details. So I was able to get it working using http as follows:

HttpClient httpClient = new StdHttpClient.Builder()
    .host("[username].cloudant.com")
    .port(5984)
    .username("[username]")
    .password("[password]")
    .build();
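
The scheme mistake described above is easy to guard against. Here is a minimal sketch, assuming a hypothetical helper named hostOnly (it is not part of Ektorp) that reduces a pasted URL to the bare host name before it is handed to .host(). The example host name is a placeholder.

```java
import java.net.URI;

/** Hypothetical helper, not part of Ektorp: strips scheme, port, and path from a pasted URL. */
class HostNames {

    static String hostOnly(String input) {
        String trimmed = input.trim();
        // java.net.URI only reports a host when a scheme is present, so add one if missing.
        String withScheme = trimmed.contains("://") ? trimmed : "https://" + trimmed;
        String host = URI.create(withScheme).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host found in: " + input);
        }
        return host;
    }

    public static void main(String[] args) {
        // Both calls yield the same bare host name.
        System.out.println(hostOnly("https://example.cloudant.com/"));
        System.out.println(hostOnly("example.cloudant.com"));
    }
}
```

The result of hostOnly(...) is what would be passed to .host(); the port and SSL settings still have to be set separately on the builder.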


Getting SSL to play nice proved difficult. Using port 443 didn't readily work until someone from Cloudant wisely told me to RTFM. Okay, adding .enableSSL(true) and .relaxedSSLSettings(true) was a good start but I was still getting an IllegalStateException and the stack trace indicated it was trying to use http after I had initialized my object with https. Huh? Finally, after scouring the Ektorp discussion board, I found this post which indicated Ektorp 1.1.1 included SSL-related bug fixes. So I updated my pom.xml to look for 1.1.1 instead of 1.1.0. Rebuild with Maven, run the Java application, success! Database added to my Cloudant account via https.

I was getting an annoying but non-fatal exception relating to SLF4J:

Failed to load class "org.slf4j.impl.StaticLoggerBinder"

This is fairly common. Manually adding slf4j-simple-X.jar (slf4j-simple-1.6.1.jar to be exact) to the project resolved the issue.

So this is my super-complex program:

HttpClient httpClient = new StdHttpClient.Builder()
    .host("[username].cloudant.com")
    .port(443)
    .username("[username]")
    .password("[password]")
    .enableSSL(true)
    .relaxedSSLSettings(true)
    .build();

CouchDbInstance dbInstance = new StdCouchDbInstance(httpClient);
CouchDbConnector db = new StdCouchDbConnector("httpsdatabase", dbInstance);

db.createDatabaseIfNotExists();


and remember to make ektorp's dependency look for version "1.1.1" in your pom.xml.

Friday, April 22, 2011

Ext4 delayed allocation data loss

My Ubuntu (Maverick Meerkat) netbook crashed this week and completely wiped an open .odt document. The behavior was very similar to what happened here. I'm not a file system guru so I wasn't aware until now of the long-standing issue with Ext4 file system delayed allocation. From LinuxInsight:
In Ext4 file system, Delayed Allocation causes some extra risks of the data loss if your system gets crashed before all the data is written to hard drive.

This was also covered on slashdot over 2 years ago.

It appears that the transition to Ext4 requires applications to properly use fsync() to prevent data loss. Theodore Ts'o, who played a major role in developing Ext4, includes a good summary here. In fact, Android devices which run on Ext4 are rolling out and preventing data loss has been a major concern.
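
To make the fsync() point concrete, here is a minimal Java sketch of a commonly recommended save pattern: write to a temporary file, force it to disk, then rename it over the real file. The class and method names are hypothetical and this is not taken from any real application; FileDescriptor.sync() is the standard Java call that corresponds to fsync().

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper illustrating write, fsync, then rename. */
class DurableWrite {

    /** Writes data to a temp file in the target's directory, syncs it, then renames it into place. */
    static void writeDurably(Path target, byte[] data) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "durable", ".tmp");
        try (FileOutputStream out = new FileOutputStream(tmp.toFile())) {
            out.write(data);
            // Java's counterpart to fsync(): blocks until the OS reports that its
            // buffers for this file have been synchronized with the device.
            out.getFD().sync();
        }
        // Swap the new file into place only after its contents have been synced.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Demo: writes text into a fresh temp directory and reads it back. */
    static String roundTrip(String text) {
        try {
            Path dir = Files.createTempDirectory("durable-demo");
            Path target = dir.resolve("document.txt");
            writeDurably(target, text.getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("draft saved"));
    }
}
```

With this ordering, a crash should leave either the old file or the complete new one rather than an empty file. Whether ATOMIC_MOVE replaces an existing target is implementation specific according to the Java documentation, and a stricter version would also sync the containing directory.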

I can't imagine the OpenOffice that ships with Ubuntu 10.10 doesn't adhere to the Ext4 spec and use fsync() properly but I've never had this problem in other apps. Very aggravating and perhaps another reason to use Google Docs or give LibreOffice a try!

Monday, March 7, 2011

Stop powering through code

Much like a sleep pattern, most people work in a fairly predictable cycle in which they need a little warm-up time, get into a groove in which productivity steadily rises then falls, and finally they burn out. This is usually over the period of a few hours or much more.

Not me. All-night hackathons for school or work were fun in my twenties but I've found I'm now most effective in shorter "bursts" of coding. I think I've always been this way but it's definitely manifesting itself more now that I'm in my thirties.

Back in my startup days, someone set up a web cam over the foosball table. People could log in from their desks. If someone was practicing (yes, some people were really serious about learning the angles and working on 'footwork'), others would inevitably meander over and start a game. These kinds of environments really helped me hit a sweet spot between creativity and burn out.

I can't sit in front of a monitor for 4 hours like some people and write a Java library or bang out some shell scripts. I need breaks or else the quality of my code really dips. Currently, my day job doesn't entail programming so I'm doing it at night and weekends. I also have a toddler at home so any opportunity to lock myself away for 3 hours straight and code is out the window. That's fine and here's why:


  1. I write better code by coding less. It seems the longer someone sits at their computer trying to solve a problem, the more likely they are to write shitty code. This can be because they don't see the big picture or just want to get through their current objective. Great article today in Hacker News detailing this hazard. During frequent breaks, I find myself debating my next move or talking myself into/out of a course of action.

  2. I get bored. 2 hours is about all I can take before I need to stand up and do something else.

  3. I don't need that long to ramp back up. Since I've probably already thought of what I need to do, I'm usually able to get into "the zone" and reach high productivity within a few minutes.

Thursday, February 24, 2011

Review of AWS SDK for Android (SimpleDB)

So development of my Android app, PayNanny, continues and I'm at the point where I can make some observations about the AWS SDK for Android (Beta release). Note that of the 4 AWS services (S3, SimpleDB, SNS, and SQS), I'm only using the SimpleDB API. I won't bore you with how easy it is to make a free AWS account but believe me, it's easy.

First off, I'm so happy I don't have to write an HTTP library or any of the framework necessary for code on a mobile device to interact with a cloud database. Isn't it awesome that this is all that's required to add an entry to a SimpleDB domain (similar to a table):


List<ReplaceableAttribute> attributes = new ArrayList<ReplaceableAttribute>(5);
attributes.add(new ReplaceableAttribute().withName("date").withValue(date));
// more lines of attributes.add

PutAttributesRequest request = new PutAttributesRequest("timelog", UUID.randomUUID().toString(), attributes);

AmazonSimpleDB mDB = new AmazonSimpleDBClient(credentials);
mDB.putAttributes(request);


Everything's taken care of (well, except transaction handling as SimpleDB is NoSQL) and it keeps your code compact and easy to read. Most of my utilization of the AWS SDK follows in the same vein and I heavily reference the AmazonSimpleDBClient class. I kept the database structure very flat to use SimpleDB the way it's supposed to be used. For instance, I've combined some things which would usually be different fields in an RDBMS into 1 attribute. Once I get this value from the database, I parse it according to the schema I designed.
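
As an illustration of combining several fields into one attribute value and parsing it back, here is a minimal sketch. The delimiter and the field meanings (start time, end time, hourly rate) are assumptions made for the example, not the actual PayNanny schema.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical packing scheme: several logical fields stored in one string attribute. */
class PackedAttribute {

    private static final String SEPARATOR = "|";

    /** Joins the fields with the separator, rejecting values that contain it. */
    static String pack(List<String> fields) {
        for (String field : fields) {
            if (field.contains(SEPARATOR)) {
                throw new IllegalArgumentException("Field contains separator: " + field);
            }
        }
        return String.join(SEPARATOR, fields);
    }

    /** Splits a packed value back into its fields; limit -1 keeps trailing empty fields. */
    static List<String> unpack(String packed) {
        return Arrays.asList(packed.split(Pattern.quote(SEPARATOR), -1));
    }

    public static void main(String[] args) {
        // Assumed field order: start time, end time, hourly rate.
        String packed = pack(Arrays.asList("09:00", "17:30", "12.50"));
        System.out.println(packed);
        System.out.println(unpack(packed));
    }
}
```

The trade-off is that the database can no longer filter on the individual fields, so this only suits values that are always read and written together.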

As with any database-centric app, you're going to loop around your search results often so you'll take your List, pull out each Item, and look at the Attributes like:


// itemList is a List<Item>
for (Item item : itemList) {
    List<Attribute> attributeList = item.getAttributes();
    String itemName = item.getName();

    // parse attribute list and sort the data
    for (Attribute a : attributeList) {
        // a.getName() and a.getValue() hold the attribute's name and value
    }
}


It can get a little monotonous and having a bunch of nested loops always makes me nervous but what are you going to do.

SimpleDB is typeless - everything gets stored as a string. I find this to be good and bad. I liked not being so limited in what I had to send to the database but it makes writing optimized queries more difficult. Instead of writing an RDBMS SQL statement saying "get me all data between the dates of 1/2/11 and 1/8/11", I had to code a SimpleDB statement saying "get me all data with the date 1/2/11 or 1/3/11 or 1/4/11 or 1/5/11 or 1/6/11 or 1/7/11 or 1/8/11". So in effect, SimpleDB is pushing a bunch of logic that would usually be handled by the database server into the application code. Mobile devices are pretty fast but I'd rather some server farm in Oregon or wherever do this work than my Droid. Zero padding and offsetting numbers allows for some of this RDBMS functionality but I didn't utilize that in my code.
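
The zero-padding idea mentioned above can be sketched as follows. Because SimpleDB compares values as strings, dates stored as zero-padded yyyy-MM-dd sort in the same order as the dates themselves, which should allow a single range comparison in place of a chain of "or" conditions. The helper names are hypothetical, the sketch assumes java.time (Java 8 or later, so not the Android of 2011), and the where-clause is only built as a string here, not run against SimpleDB.

```java
import java.time.LocalDate;

/** Hypothetical helpers for storing dates and numbers so that string order matches value order. */
class SortableDates {

    /** LocalDate.toString() yields zero-padded ISO-8601, e.g. 2011-01-02. */
    static String toSortable(LocalDate date) {
        return date.toString();
    }

    /** Left-pads a non-negative number with zeros so that string order matches numeric order. */
    static String zeroPad(int value, int width) {
        return String.format("%0" + width + "d", value);
    }

    /** Builds a range condition over sortable date strings (attribute name supplied by the caller). */
    static String betweenClause(String attribute, LocalDate from, LocalDate to) {
        return attribute + " between '" + toSortable(from) + "' and '" + toSortable(to) + "'";
    }

    public static void main(String[] args) {
        // Unpadded dates sort incorrectly as strings: "1/10/11" comes before "1/2/11".
        System.out.println("1/10/11".compareTo("1/2/11") < 0);
        // Padded ISO dates sort correctly: January 2 comes before January 10.
        System.out.println("2011-01-02".compareTo("2011-01-10") < 0);
        System.out.println("select * from timelog where "
                + betweenClause("date", LocalDate.of(2011, 1, 2), LocalDate.of(2011, 1, 8)));
    }
}
```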

One major feature lacking in the API is access to AWS Identity and Access Management (IAM), which is a vital requirement for those who wish to deploy Android apps without giving away their keys. In the meantime, check out the awskeyserver project, which does a good job providing this functionality via Google App Engine.

I haven't made up my mind about the NoSQL concept. While I won't have to deal with the data consistency issues inherent in this design, I appreciate that the strong consistency option is built into SelectRequests. I don't think NoSQL databases would work for, say, financial applications that need to keep track of a transaction but I don't see why they wouldn't be fine for a social app that needs to scale. Whenever I write stuff like that, I'm reminded of Ted Dziuba's great anti-NoSQL piece which includes the words "You Are Not Google".

Overall, I'm satisfied with this API and the performance I'm getting (for free, mind you). Once my app goes Beta, I hope to have more insights.

Saturday, February 19, 2011

Native Client - where web apps and native apps meet

I was motivated by a web apps presentation by Seth Ladd, a Google Chrome developer advocate, to write a short white paper on why my organization should be moving in this direction. But it's hard to get around the fact that native apps still beat the pants off of web apps in many functional areas. Until this week, I wasn't aware Google was working on Native Client, a way to run native compiled code directly in the browser. This certainly makes sense if Chrome is to become a major OS.

I don't think I'm brave enough to fool around with the developer release SDK right now - I'm busy working on my Android app - but once it goes beta it'll be worth a look.

Tuesday, February 8, 2011

PayNanny

PayNanny is a household employer payroll app I'm working on for Android devices. So far, the only functionality I've built consists of a way for the user to "log in" (identification only - no authentication yet), create/delete/edit employee names, and link or delink their account with another.

I've open sourced it on github.

Saturday, January 29, 2011

NoSQL databases - now with SQL!

So I'm writing an app and wanted to take the opportunity to see what the NoSQL buzz was all about. By no means would I need to scale to the level most programmers expect when they employ NoSQL (like Netflix) but I think I've worked with RDBMS enough (though it's been a while). Ted Dziuba has a pretty funny critique of the NoSQL craze here.

Amazon Web Services (AWS) SimpleDB was a perfect fit - not only is it free for small fish like me but the new AWS Android SDK includes SimpleDB support. I'm not interested in writing a bunch of HTTP libraries - the AWS Android API does everything for me.

The big knock against NoSQL is its data consistency. You're not always guaranteed to get the data you're expecting. SimpleDB counters with a Consistent Read option which I'm employing. You give up a little speed but for a small app like mine it's a no-brainer. But there are no transactions to track so there's still some risk.

Another surprise was SimpleDB's support of SQL through SelectRequest. You can't do stuff like JOINs (these kinds of "advanced" operations are handled in application code) but it's convenient when you need to pull a targeted set of data.

Special shout out to the folk(s) who made sdbtool, a Firefox plug-in that lets you interact with your SimpleDB account.

Here are a couple snippets from my class which handles SimpleDB interaction. Connecting is as easy as:

BasicAWSCredentials credentials;
Properties properties = new Properties();
try {
    properties.load(getClass().getResourceAsStream(AWS_PROPERTIES));

    String accessKeyId = properties.getProperty("accessKey");
    String secretKey = properties.getProperty("secretKey");

    // some boring error checking

    credentials = new BasicAWSCredentials(accessKeyId, secretKey);
    // note mDB is an AmazonSimpleDBClient
    mDB = new AmazonSimpleDBClient(credentials);
} catch (IOException e) {
    // the properties file could not be read; handle as appropriate
}

Executing queries and putting results in a List is fairly easy:

SelectRequest selectRequest = new SelectRequest("select * from accounts where m_username = '" + username + "'").withConsistentRead(true);
SelectResult selectResult = mDB.select(selectRequest);
List<Item> resultList = selectResult.getItems();
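
One caveat with building the select expression by string concatenation: a username containing a single quote would break the query. SimpleDB's select syntax escapes a quote inside a quoted value by doubling it. A minimal sketch of that escaping follows; the helper names are hypothetical and not part of the AWS SDK.

```java
/** Hypothetical helpers for safely embedding values in a SimpleDB select expression. */
class SelectQuoting {

    /** Wraps a value in single quotes, doubling any single quotes it contains. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds the username lookup from the post with the value escaped. */
    static String byUsername(String username) {
        return "select * from accounts where m_username = " + quote(username);
    }

    public static void main(String[] args) {
        System.out.println(byUsername("o'brien"));
    }
}
```

The resulting string would be passed to new SelectRequest(...) exactly as in the snippet above.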

I'm able to really take advantage of the API simplicity when adding stuff to the database. NoSQL architectures seem to be pretty great for these types of actions (as long as the data gets there!).

List<ReplaceableAttribute> attributes = new ArrayList<ReplaceableAttribute>(1);
attributes.add(new ReplaceableAttribute().withName("m_username").withValue( username));
PutAttributesRequest request = new PutAttributesRequest("accounts", username, attributes);
mDB.putAttributes(request);

You know, if data consistency is critical, you could keep querying the database until your data is positively there...
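
For what it's worth, that "keep querying" idea can be sketched as a bounded polling loop. This is a generic illustration with hypothetical names. In a real app the check would presumably be a select with withConsistentRead(true) that looks for the item just written; here it is a plain BooleanSupplier so the example stands alone.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: retries a check a limited number of times with a pause between tries. */
class PollUntil {

    /** Returns true as soon as the check passes, false if every attempt fails or the thread is interrupted. */
    static boolean pollUntil(BooleanSupplier check, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for "is my item visible yet?": succeeds on the third call.
        boolean visible = pollUntil(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println(visible + " after " + calls[0] + " calls");
    }
}
```

Capping the number of attempts matters: an unbounded loop on a mobile device would drain the battery and hang the app if the write never arrived.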