org.apache.hadoop.mapreduce.lib.db
Class DBOutputFormat<K extends DBWritable,V>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.OutputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.db.DBOutputFormat<K,V>

@InterfaceAudience.Public
@InterfaceStability.Stable
public class DBOutputFormat<K extends DBWritable,V>
extends OutputFormat<K,V>

An OutputFormat that sends the reduce output to a SQL table.

DBOutputFormat accepts <key,value> pairs, where the key has a type extending DBWritable. The returned DBOutputFormat.DBRecordWriter writes only the key to the database, using a batch SQL query; the value is not written.
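
To illustrate the key-only write path, the following is a minimal sketch that uses plain JDBC types and a hypothetical stand-in interface, StatementWritable, in place of DBWritable. It assumes that the key binds its own fields to the prepared statement and that the writer queues one batch entry per key. EmployeeKey, KeyOnlyBatchSketch, writeAll, and the two-column layout are illustrative names, not part of Hadoop.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical stand-in for the write side of DBWritable (assumption, not the Hadoop type).
interface StatementWritable {
    void write(PreparedStatement statement) throws SQLException;
}

// Illustrative key type: each field is bound to one placeholder of the INSERT.
class EmployeeKey implements StatementWritable {
    private final int id;
    private final String name;

    EmployeeKey(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public void write(PreparedStatement statement) throws SQLException {
        statement.setInt(1, id);       // first "?" placeholder
        statement.setString(2, name);  // second "?" placeholder
    }
}

class KeyOnlyBatchSketch {
    // Binds every key to the statement and queues it as a batch entry;
    // values are never consulted. Returns the number of keys queued.
    // Executing the batch and committing is left to the caller.
    static int writeAll(PreparedStatement statement, List<? extends StatementWritable> keys)
            throws SQLException {
        int queued = 0;
        for (StatementWritable key : keys) {
            key.write(statement);
            statement.addBatch();
            queued++;
        }
        return queued;
    }
}
```

In this sketch the caller would follow writeAll with statement.executeBatch() and a commit on the connection; how and when the real record writer does so is not stated on this page.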


Nested Class Summary
 class DBOutputFormat.DBRecordWriter
          A RecordWriter that writes the reduce output to a SQL table
 
Constructor Summary
DBOutputFormat()
           
 
Method Summary
 void checkOutputSpecs(JobContext context)
          Check for validity of the output-specification for the job.
 String constructQuery(String table, String[] fieldNames)
          Constructs the query used as the prepared statement to insert data.
 OutputCommitter getOutputCommitter(TaskAttemptContext context)
          Get the output committer for this output format.
 RecordWriter<K,V> getRecordWriter(TaskAttemptContext context)
          Get the RecordWriter for the given task.
static void setOutput(Job job, String tableName, int fieldCount)
          Initializes the reduce part of the job with the appropriate output settings.
static void setOutput(Job job, String tableName, String... fieldNames)
          Initializes the reduce part of the job with the appropriate output settings.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DBOutputFormat

public DBOutputFormat()
Method Detail

checkOutputSpecs

public void checkOutputSpecs(JobContext context)
                      throws IOException,
                             InterruptedException
Description copied from class: OutputFormat
Check for validity of the output-specification for the job.

This validates the output specification for the job when the job is submitted. Typically it checks that the output does not already exist and throws an exception if it does, so that existing output is not overwritten.

Specified by:
checkOutputSpecs in class OutputFormat<K extends DBWritable,V>
Parameters:
context - information about the job
Throws:
IOException - when output should not be attempted
InterruptedException

getOutputCommitter

public OutputCommitter getOutputCommitter(TaskAttemptContext context)
                                   throws IOException,
                                          InterruptedException
Description copied from class: OutputFormat
Get the output committer for this output format. This is responsible for ensuring the output is committed correctly.

Specified by:
getOutputCommitter in class OutputFormat<K extends DBWritable,V>
Parameters:
context - the task context
Returns:
an output committer
Throws:
IOException
InterruptedException

constructQuery

public String constructQuery(String table,
                             String[] fieldNames)
Constructs the query used as the prepared statement to insert data.

Parameters:
table - the table to insert into
fieldNames - the fields to insert into. If field names are unknown, supply an array of nulls.
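
As a rough illustration of the kind of statement involved, the sketch below builds a parameterized INSERT from a table name and field names, and omits the column list when the names are unknown (null). InsertQuerySketch and buildInsert are hypothetical names, and the exact string returned by constructQuery may differ, for example in spacing or a trailing semicolon.

```java
import java.util.StringJoiner;

class InsertQuerySketch {
    // Builds an INSERT with one "?" placeholder per field. The column list
    // is skipped when the first field name is null (names unknown).
    static String buildInsert(String table, String[] fieldNames) {
        if (fieldNames == null) {
            throw new IllegalArgumentException("fieldNames may not be null");
        }
        StringBuilder query = new StringBuilder("INSERT INTO ").append(table);
        if (fieldNames.length > 0 && fieldNames[0] != null) {
            query.append(" (").append(String.join(",", fieldNames)).append(")");
        }
        StringJoiner placeholders = new StringJoiner(",", " VALUES (", ")");
        for (int i = 0; i < fieldNames.length; i++) {
            placeholders.add("?");
        }
        return query.append(placeholders).toString();
    }

    public static void main(String[] args) {
        // prints: INSERT INTO employees (id,name) VALUES (?,?)
        System.out.println(buildInsert("employees", new String[] {"id", "name"}));
        // prints: INSERT INTO employees VALUES (?,?)
        System.out.println(buildInsert("employees", new String[2]));
    }
}
```

Passing new String[2] mirrors the "array of nulls" case: only the number of placeholders is known, so the statement relies on the table's column order.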

getRecordWriter

public RecordWriter<K,V> getRecordWriter(TaskAttemptContext context)
                                                     throws IOException
Get the RecordWriter for the given task.

Specified by:
getRecordWriter in class OutputFormat<K extends DBWritable,V>
Parameters:
context - the information about the current task.
Returns:
a RecordWriter to write the output for the job.
Throws:
IOException

setOutput

public static void setOutput(Job job,
                             String tableName,
                             String... fieldNames)
                      throws IOException
Initializes the reduce part of the job with the appropriate output settings.

Parameters:
job - The job
tableName - The table to insert data into
fieldNames - The field names in the table.
Throws:
IOException

setOutput

public static void setOutput(Job job,
                             String tableName,
                             int fieldCount)
                      throws IOException
Initializes the reduce part of the job with the appropriate output settings.

Parameters:
job - The job
tableName - The table to insert data into
fieldCount - the number of fields in the table.
Throws:
IOException


Copyright © 2009 The Apache Software Foundation