Apache MRUnit is a library that allows you to unit-test MapReduce jobs. You can use it to test HBase jobs in the same way as other MapReduce jobs.
Given a MapReduce job that writes to an HBase table called MyTest, which has one column family called CF, the reducer of such a job could look like the following:
    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableReducer;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.Text;

    public class MyReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {
        public static final byte[] CF = Bytes.toBytes("CF");
        public static final byte[] QUALIFIER = Bytes.toBytes("CQ-1");

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // A real job would do more processing here. In this example we simply
            // append all the records received from the mapper for this key and
            // insert a single record into HBase.
            StringBuilder data = new StringBuilder();
            Put put = new Put(Bytes.toBytes(key.toString()));
            for (Text val : values) {
                data.append(val);
            }
            put.add(CF, QUALIFIER, Bytes.toBytes(data.toString()));
            // Write to HBase.
            context.write(new ImmutableBytesWritable(Bytes.toBytes(key.toString())), put);
        }
    }
To test this code, the first step is to add a dependency on MRUnit to your Maven POM file.
    <dependency>
        <groupId>org.apache.mrunit</groupId>
        <artifactId>mrunit</artifactId>
        <version>1.0.0</version>
        <scope>test</scope>
    </dependency>
Next, use the ReduceDriver provided by MRUnit in the unit test for your reducer.
    import static org.junit.Assert.assertEquals;

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
    import org.apache.hadoop.mrunit.types.Pair;
    import org.junit.Before;
    import org.junit.Test;

    public class MyReducerTest {
        ReduceDriver<Text, Text, ImmutableBytesWritable, Writable> reduceDriver;
        byte[] CF = Bytes.toBytes("CF");
        byte[] QUALIFIER = Bytes.toBytes("CQ-1");

        @Before
        public void setUp() {
            MyReducer reducer = new MyReducer();
            reduceDriver = ReduceDriver.newReduceDriver(reducer);
        }

        @Test
        public void testHBaseInsert() throws IOException {
            String strKey = "RowKey-1", strValue = "DATA", strValue1 = "DATA1", strValue2 = "DATA2";
            List<Text> list = new ArrayList<Text>();
            list.add(new Text(strValue));
            list.add(new Text(strValue1));
            list.add(new Text(strValue2));

            // The reducer only appends the records that the mapper sends it,
            // so we should get the following back.
            String expectedOutput = strValue + strValue1 + strValue2;

            // Set up the input, mimicking what the mapper would have passed to the reducer.
            reduceDriver.withInput(new Text(strKey), list);

            // Run the reducer and get its output.
            List<Pair<ImmutableBytesWritable, Writable>> result = reduceDriver.run();

            // Extract the key from the result and verify it.
            assertEquals(strKey, Bytes.toString(result.get(0).getFirst().get()));

            // Extract the value for CF/QUALIFIER and verify it.
            Put a = (Put) result.get(0).getSecond();
            String c = Bytes.toString(a.get(CF, QUALIFIER).get(0).getValue());
            assertEquals(expectedOutput, c);
        }
    }
Your MRUnit test verifies that the reducer emits the expected row key, that the Put written to HBase holds the expected value, and that the value is stored under the expected column family and column qualifier.
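To make the expected value in the test concrete, the following minimal sketch reproduces only the concatenation step of the reducer in plain Java, with no Hadoop, HBase, or MRUnit classes involved. The class `ReducerLogicSketch` and its `concatenate` method are hypothetical names introduced for illustration; they are not part of the reducer above or of any library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; not part of MRUnit or HBase.
final class ReducerLogicSketch {

    // Mirrors the loop in MyReducer.reduce(): append every value in arrival order.
    static String concatenate(Iterable<String> values) {
        StringBuilder data = new StringBuilder();
        for (String val : values) {
            data.append(val);
        }
        return data.toString();
    }

    public static void main(String[] args) {
        // The same three values that the test feeds to the ReduceDriver.
        List<String> values = Arrays.asList("DATA", "DATA1", "DATA2");
        System.out.println(concatenate(values)); // prints DATADATA1DATA2
    }
}
```

This is the string that the test compares against the cell value read back from the Put.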
MRUnit also includes a MapDriver for testing mappers, and you can use MRUnit to test other operations, including reading from HBase, processing data, or writing to HDFS.