Monday, February 18, 2013

Assembling ordered Lists using HashMap and Collections.sort()

LinkedHashMap won't work if that's what you were thinking....

I was presented with a use case having to assemble an ordered List of objects out of tens/hundreds of thousands of ordered elements.  Here's the List<EventObject> I need to do something with:
public class EventObject {
private String key;
private String string1;
private SomeObject something1;
private SomeObject something2;
// expected constructor, getters, setters
}
view raw blog.021813 hosted with ❤ by GitHub

  • Keys are not unique.
  • The List<EventObject> needs to become a List<SummarizedObject>.
  • 1 or more EventObjects make up a SummarizedObject.
  • The order of List<EventObject> must be matched in the resulting List<SummarizedObject>.  Meaning, the first EventObject will definitely be in the first SummarizedObject and the last EventObject will definitely be in the last SummarizedObject.
  • Typically, for every 10k EventObject values 100 unique keys and 200 unique SummarizedObjects are created (give or take 1 order of magnitude).
  • I'm not concerned about concurrency at this time.
LinkedHashMap<String, List<SummarizedObject>> wouldn't work for me even though it maintains order because a single key could have several SummarizedObjects (hence the List of SummarizedObjects) whose order could interweave with values in different keys.  Bummer.

I kept coming back to the HashMap.  Being able to quickly determine a key's existence in the Map and (if it does exist) going straight to that key's List<SummarizedObject> was ideal.  So I dropped an order field in SummarizedObject, which looked like:
public class SummarizedObject {
private int order;
private SummaryObject summary1;
private SummaryObject summary2;
// expected constructor, getters, setters
}
view raw blog2.021813 hosted with ❤ by GitHub


The pseudo-code for getting the EventObjects into a HashMap<String, List<SummarizedObject>> is:
while i < event_list.size() {
event = event_list.get(1);
if summary_hash contains event.key {
get value (a list of EventObjects)
loop over list until appropriate SummaryObject found
if existing SummaryObject is found, update the list and put it back in map
otherwise, add a new SummaryObject to list and put it in map
} else {
put new key and list of single SummaryObject in map
}
view raw blog3.021813 hosted with ❤ by GitHub


So my last problem is to assemble all SummarizedObjects into a single, properly ordered List.  Enter Collections.sort(). Here's real code:
List<SummarizedObject> finalList = new ArrayList<SummarizedObject>();
for(List<SummarizedObject> summaries : summaryMap.values()) {
finalList.addAll(summaries);
}
Comparator comparator = new Comparator<SummarizedObject> {
public int compare(SummarizedObject obj1, SummarizedObject obj2) {
return obj1.getOrder() - obj2.getOrder();
}
}
Collections.sort(finalList, comparator);
view raw blog4.021813 hosted with ❤ by GitHub


Great performance and not that much code. And it can be optimized further by finding a better way to find the proper SummarizedObject in Step 5 of the above pseudo-code. Hope this helps someone!