Friday, March 2, 2012

Running Jython apps using HBase

Jython is a great tool to have in your arsenal. I'm currently using it only for prototyping, but I'm sure it could do much more. After installing Jython with yum on a CentOS 6 machine (not recommended, since the packaged version, 2.2.1 here, is far behind current releases), I used this great writeup to set up Jython and HBase:

$ export HBASE_HOME=/usr/bin/hbase
$ export JYTHON_HOME=/usr/bin/jython
$ export CLASSPATH=`$HBASE_HOME classpath`
$ alias pyhbase='HBASE_OPTS="-Dpython.path=$JYTHON_HOME" HBASE_CLASSPATH=/usr/share/java/jython.jar $HBASE_HOME org.python.util.jython'
$ pyhbase
Jython 2.2.1 on java1.6.0_30
>>> from org.apache.hadoop.hbase.client import HTable

Once the environment is set up, you can write an app. Let's say you have a table called 'test' with a column family named 'stuff' and a column named 'value'. This script reads that column for a single row and prints it:

from org.apache.hadoop.hbase import HBaseConfiguration
from org.apache.hadoop.hbase.client import HTable, Get
from org.apache.hadoop.hbase.util import Bytes

# Loads hbase-default.xml and hbase-site.xml from the classpath
conf = HBaseConfiguration()

tablename = "test"
table = HTable(conf, tablename)

# Fetch a single row by its key
row = 'some_row_key'
g = Get(Bytes.toBytes(row))
res = table.get(g)

# Cell at family 'stuff', qualifier 'value'; comes back as a Java byte[]
val = res.getValue(Bytes.toBytes('stuff'), Bytes.toBytes('value'))

# Three-argument form: (bytes, offset, length)
print Bytes.toString(val, 0, len(val))

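The only weird behavior I couldn't explain is that last line: I had to call the three-argument overload of Bytes.toString() (bytes, offset, length) rather than just passing the byte array. I posted a question on StackOverflow but didn't get any great feedback. I suspect it's something on the Jython side, but I haven't confirmed that.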
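
One more thing to watch in the last line of the script: Result.getValue() returns null (None in Jython) when the row or cell does not exist, and len(None) then fails. Here is a minimal sketch of a guard, assuming a hypothetical helper named cell_to_string that is not part of HBase or of the script above. The decoder is passed in as an argument; fake_decode is only a stand-in so the snippet runs without HBase on the classpath.

```python
def cell_to_string(val, decode, default=None):
    """Decode a cell value, or return `default` when the cell is missing.

    `decode` is called as decode(val, 0, len(val)), the same
    three-argument form the script uses with Bytes.toString.
    """
    if val is None:
        # No such row or cell: nothing to decode
        return default
    return decode(val, 0, len(val))


def fake_decode(buf, offset, length):
    """Stand-in for Bytes.toString(byte[], int, int); assumes UTF-8 bytes."""
    return buf[offset:offset + length].decode("utf-8")


# A present cell is decoded; a missing one falls back to the default
assert cell_to_string(b"hello", fake_decode) == "hello"
assert cell_to_string(None, fake_decode, "<missing>") == "<missing>"
```

In the Jython script, the call would presumably be cell_to_string(val, Bytes.toString) in place of the direct Bytes.toString() call. I have only tried this sketch with the stand-in decoder, not against a live table.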

