This compact utility library is an annotation-based object mapper for HBase (written in Java) that helps you:
- convert your bean-like objects to HBase rows and vice versa (for use in MapReduce jobs on HBase tables and their unit tests)
- define data access objects for entities that map to HBase rows (for random single/range/bulk access of rows of an HBase table)
Let's say you have an HBase table `citizens` with row-key format `country_code#UID`, created with two column families `main` and `optional`, which may have columns like `uid`, `name`, `salary` etc.
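Such a composite row key can be composed and parsed with plain string operations. Below is a minimal standalone sketch (the class name and values are hypothetical, purely for illustration):

```java
public class RowKeySketch {

    // Compose a row key of the form country_code#UID
    static String composeRowKey(String countryCode, int uid) {
        return String.format("%s#%d", countryCode, uid);
    }

    public static void main(String[] args) {
        String rowKey = composeRowKey("IND", 1);
        System.out.println(rowKey); // IND#1

        // Parse it back into its parts
        String[] pieces = rowKey.split("#");
        System.out.println(pieces[0]);                   // IND
        System.out.println(Integer.parseInt(pieces[1])); // 1
    }
}
```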
This library enables you to represent your HBase row as a class like below:
```java
@HBTable("citizens")
public class Citizen implements HBRecord {

    @HBRowKey
    private String countryCode;

    @HBRowKey
    private Integer uid;

    @HBColumn(family = "main", column = "name")
    private String name;

    @HBColumn(family = "optional", column = "age")
    private Short age;

    @HBColumn(family = "optional", column = "salary")
    private Integer sal;

    @HBColumn(family = "optional", column = "flags")
    private Map<String, Integer> extraFlags;

    @HBColumn(family = "optional", column = "dependents")
    private Dependents dependents;

    // Multi-versioned column: this annotation enables you to fetch multiple versions of column values
    @HBColumnMultiVersion(family = "optional", column = "phone_number")
    private NavigableMap<Long, Integer> phoneNumber;

    public String composeRowKey() {
        return String.format("%s#%d", countryCode, uid);
    }

    public void parseRowKey(String rowKey) {
        String[] pieces = rowKey.split("#");
        this.countryCode = pieces[0];
        this.uid = Integer.parseInt(pieces[1]);
    }
}
```
(see Citizen.java for a detailed example with more data types)
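For the multi-versioned `phone_number` field above, each entry of the `NavigableMap` maps an HBase timestamp (the version) to the column's value at that version. A minimal sketch of what such a map looks like (timestamps and values below are made up for illustration):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PhoneVersionsSketch {
    public static void main(String[] args) {
        // Key = HBase timestamp (version), value = phone number at that version
        NavigableMap<Long, Integer> phoneNumber = new TreeMap<>();
        phoneNumber.put(1453380846L, 911); // older version
        phoneNumber.put(1453380850L, 912); // latest version

        // lastEntry() yields the most recent version,
        // firstEntry() the oldest retained one
        System.out.println(phoneNumber.lastEntry().getValue());  // 912
        System.out.println(phoneNumber.firstEntry().getValue()); // 911
    }
}
```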
Now, for the above definition of your `Citizen` class:
- you can use methods in the `HBObjectMapper` class to convert `Citizen` objects to HBase's `Put` and `Result` objects and vice versa
- you can inherit from the `AbstractHBDAO` class, which contains methods like `get` (for random single/bulk/range access of rows), `persist` (for writing rows) and `delete` (for deleting rows)
HBase's `Result` object can be converted to your bean-like object using the below method:

```java
<T extends HBRecord> T readValue(ImmutableBytesWritable rowKey, Result result, Class<T> clazz)
```
For example:

```java
Citizen e = hbObjectMapper.readValue(key, value, Citizen.class);
```
See file CitizenMapper.java for full sample code.
Your bean-like object can be converted to HBase's `Put` (for row contents) and `ImmutableBytesWritable` (for row key) using the below methods:

```java
ImmutableBytesWritable getRowKey(HBRecord obj)
Put writeValueAsPut(HBRecord obj)
```
For example, the below code in a reducer writes your object as one HBase row with appropriate column families and columns:

```java
Citizen citizen = new Citizen(/*details*/);
context.write(hbObjectMapper.getRowKey(citizen), hbObjectMapper.writeValueAsPut(citizen));
```
See file CitizenReducer.java for full sample code.
Your bean-like object can be converted to HBase's `Result` (for row contents) and `ImmutableBytesWritable` (for row key) using the below methods:

```java
ImmutableBytesWritable getRowKey(HBRecord obj)
Result writeValueAsResult(HBRecord obj)
```
Below is an example of a unit test of a mapper using MRUnit:

```java
Citizen citizen = new Citizen(/*params*/);
mapDriver
        .withInput(
                hbObjectMapper.getRowKey(citizen),
                hbObjectMapper.writeValueAsResult(citizen)
        )
        .withOutput(Util.strToIbw("key"), new IntWritable(citizen.getAge()))
        .runTest();
```
See file TestCitizenMapper.java for full sample code.
HBase's `Put` object can be converted to your bean-like object using the below method:

```java
<T extends HBRecord> T readValue(ImmutableBytesWritable rowKeyBytes, Put put, Class<T> clazz)
```
Below is an example of a unit test of a reducer using MRUnit:

```java
Pair<ImmutableBytesWritable, Writable> reducerResult = reducerDriver
        .withInput(Util.strToIbw("key"), Arrays.asList(new IntWritable(1), new IntWritable(5)))
        .run().get(0);
Citizen citizen = hbObjectMapper.readValue(reducerResult.getFirst(), (Put) reducerResult.getSecond(), Citizen.class);
```
See file TestCitizenReducer.java for full sample code that unit-tests a reducer using MRUnit
Since we're dealing with HBase (and not an OLTP system), fitting an ORM paradigm may not make sense. Nevertheless, you can use this library as an HBase ORM too!

This library provides an abstract class to define your own data access object. For example, you can create a data access object for the `Citizen` class in the above example as follows:
```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

public class CitizenDAO extends AbstractHBDAO<Citizen> {

    public CitizenDAO(Configuration conf) throws IOException {
        super(conf);
    }
}
```
(see CitizenDAO.java)
Once defined, you can access, manipulate and persist rows of the `citizens` HBase table as below:
```java
Configuration configuration = getConf(); // this is org.apache.hadoop.conf.Configuration

// Create a data access object:
CitizenDAO citizenDao = new CitizenDAO(configuration);

// Fetch a row from the "citizens" HBase table with row key "IND#1":
Citizen pe = citizenDao.get("IND#1");

List<Citizen> lpe = citizenDao.get("IND#1", "IND#5"); // range get
Citizen[] ape = citizenDao.get(new String[] {"IND#1", "IND#2"}); // bulk get

pe.setPincode(560034); // change a field
citizenDao.persist(pe); // save it back to HBase

citizenDao.delete(pe); // delete a row by its object reference
citizenDao.delete("IND#2"); // delete a row by its row key
```
(see TestsAbstractHBDAO.java for a more detailed example)
Add the below entry within the `dependencies` section of your `pom.xml`:

```xml
<dependency>
  <groupId>com.flipkart</groupId>
  <artifactId>hbase-object-mapper</artifactId>
  <version>1.2</version>
</dependency>
```
(See artifact details for com.flipkart:hbase-object-mapper:1.2 on Maven Central)
To build this project, follow the below steps:
- Do a `git clone` of this repository
- Check out the latest stable version: `git checkout v1.2`
- Execute `mvn clean install` from shell
Currently, this library depends on Hadoop and HBase from Cloudera version 4. If you're using a different version (or even a different distribution like Hortonworks), change the versions in `pom.xml` to the desired ones and do a `mvn clean install`.
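For instance, pointing the build at stock Apache artifacts might look like the below. This is a hypothetical sketch: the artifact coordinates shown are common Apache Hadoop/HBase client artifacts, not necessarily the exact dependency entries in this project's `pom.xml`, so check the actual file and fill in versions compatible with your cluster.

```xml
<!-- Hypothetical sketch: swapping in stock Apache client artifacts -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version><!-- your Hadoop version --></version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version><!-- your HBase version --></version>
</dependency>
```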
Please note: the test cases are very comprehensive; they even spin up an in-memory HBase test cluster to run data-access-related test cases (a near-real-world scenario). So, build times can sometimes be long.
The change log can be found in the releases section.
If you intend to request a feature or report a bug, you may use Github Issues for hbase-object-mapper.
Copyright 2016 Flipkart Internet Pvt Ltd.
Licensed under the Apache License, version 2.0 (the "License"). You may not use this product or its source code except in compliance with the License.